1. CACHE ENTRY REPLACEMENT BASED ON AVAILABILITY OF ENTRIES AT ANOTHER CACHE
    Invention application · Pending (published)

    Publication No.: WO2017218022A1

    Publication Date: 2017-12-21

    Application No.: PCT/US2016/051661

    Filing Date: 2016-09-14

    Abstract: A processing system [100] selects entries for eviction at one cache [130] based at least in part on the validity status of corresponding entries at a different cache [140]. The processing system includes a memory hierarchy having at least two caches, a higher level cache [140] and a lower level cache [130]. The lower level cache monitors which locations of the higher level cache have been indicated as invalid and, when selecting an entry of the lower level cache for eviction to the higher level cache, selects the entry based at least in part on whether the selected cache entry will be stored at an invalid cache line of the higher level cache.

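The victim-selection idea in the abstract can be sketched in a few lines of Python. This is a toy model under assumed names: the direct-mapped placement function, the LRU ordering of candidates, and the fallback rule are illustrative assumptions, not details from the patent.

```python
def select_victim(candidates, higher_invalid_lines, dest_line):
    """Pick an eviction victim from the lower-level cache, preferring an
    entry whose destination line in the higher-level cache is invalid.

    candidates: eviction candidates in LRU order (oldest first).
    higher_invalid_lines: higher-level line indices marked invalid.
    dest_line: maps an address to its line index in the higher-level cache.
    """
    for addr in candidates:
        if dest_line(addr) in higher_invalid_lines:
            return addr          # evicting this entry overwrites no valid data
    return candidates[0]         # otherwise fall back to plain LRU

# Toy direct-mapped placement: address modulo 8 higher-level lines.
victim = select_victim([10, 12, 13, 14], {3, 5}, lambda a: a % 8)
print(victim)  # 13 maps to invalid line 5, so it is chosen
```

When no candidate maps to an invalid higher-level line, the sketch degrades gracefully to the baseline replacement order.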
2. QUALITY OF SERVICE DIRTY LINE TRACKING
    Invention application

    Publication No.: WO2021046229A1

    Publication Date: 2021-03-11

    Application No.: PCT/US2020/049215

    Filing Date: 2020-09-03

    Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
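A minimal Python model of the first-write counting described above; class and method names are assumptions for illustration. Only the first write to a line since it entered the hierarchy increments the per-thread-class counter, mirroring the proxy-for-bandwidth argument in the abstract.

```python
class WriteBandwidthMonitor:
    """Count first writes to cache lines as a proxy for write memory
    bandwidth, tracked separately per thread class."""

    def __init__(self):
        self.dirty = set()    # lines modified since entering the hierarchy
        self.counts = {}      # thread class -> first-write count

    def on_write(self, line, thread_class=0):
        if line not in self.dirty:
            self.dirty.add(line)   # first modification: will reach memory
            self.counts[thread_class] = self.counts.get(thread_class, 0) + 1
        # repeat writes to an already-dirty line are not counted

    def on_evict(self, line):
        # once the line leaves the hierarchy, a re-entry counts again
        self.dirty.discard(line)

m = WriteBandwidthMonitor()
m.on_write(0x40); m.on_write(0x40)       # second write not counted
m.on_write(0x80, thread_class=1)
print(m.counts)  # {0: 1, 1: 1}
```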

3. SELECTING CACHE TRANSFER POLICY FOR PREFETCHED DATA BASED ON CACHE TEST REGIONS
    Invention application · Pending (published)

    Publication No.: WO2018017671A1

    Publication Date: 2018-01-25

    Application No.: PCT/US2017/042794

    Filing Date: 2017-07-19

    Abstract: A processor [101] applies a transfer policy [111, 112] to a portion [118] of a cache [110] based on access metrics for different test regions [115, 116] of the cache, wherein each test region applies a different transfer policy for data in cache entries that were stored in response to a prefetch request but were not the subject of a demand request. One test region applies a transfer policy under which unused prefetches are transferred to a higher level cache in a cache hierarchy upon eviction from the test region of the cache. The other test region applies a transfer policy under which unused prefetches are replaced without being transferred to a higher level cache (or are transferred to the higher level cache but stored as invalid data) upon eviction from the test region of the cache.

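The policy selection for the non-test portion can be sketched as a comparison of access metrics from the two test regions. The hit-rate metric, the policy names, and the tie-breaking rule below are illustrative assumptions; the patent only requires that the winning region's policy be applied to the rest of the cache.

```python
def choose_transfer_policy(stats_transfer, stats_drop):
    """stats_*: (hits, misses) observed in each test region.
    One region transfers unused prefetches up on eviction; the other
    drops them. The region with the better hit rate wins."""
    rate = lambda h, m: h / max(1, h + m)
    if rate(*stats_transfer) >= rate(*stats_drop):
        return "transfer_on_evict"
    return "drop_on_evict"

print(choose_transfer_policy((90, 10), (70, 30)))  # transfer_on_evict
```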
4. SCALED SET DUELING FOR CACHE REPLACEMENT POLICIES
    Invention application · Pending (published)

    Publication No.: WO2017218026A1

    Publication Date: 2017-12-21

    Application No.: PCT/US2016/052605

    Filing Date: 2016-09-20

    Abstract: A processing system (100, 300) includes a cache (300) that includes cache lines (315) that are partitioned into a first subset (320) of the cache lines and second subsets (320) of the cache lines. The processing system also includes one or more counters (330) that are associated with the second subsets of the cache lines. The processing system further includes a processor (305) configured to modify the one or more counters in response to a cache hit or a cache miss associated with the second subsets. The one or more counters are modified by an amount determined by one or more characteristics of a memory access request that generated the cache hit or the cache miss.

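A compact sketch of the scaled counter, assuming a single up/down counter shared by two dueling regions and a caller-supplied weight standing in for the request characteristics mentioned in the abstract (e.g. demand vs. prefetch, or criticality). All names are hypothetical.

```python
class ScaledDuelingCounter:
    """Set dueling where the counter moves by a weight derived from the
    memory access request's characteristics, not always by 1."""

    def __init__(self):
        self.value = 0

    def on_miss(self, region, weight=1):
        # misses in region A push positive, misses in region B negative
        self.value += weight if region == "A" else -weight

    def policy_for_remainder(self):
        # the region with fewer weighted misses wins the remainder
        return "B" if self.value > 0 else "A"

c = ScaledDuelingCounter()
c.on_miss("A", weight=4)   # a costly miss under region A's policy
c.on_miss("B", weight=1)
print(c.policy_for_remainder())  # B: region A accumulated more weighted misses
```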
5. THROTTLING WHILE MANAGING UPSTREAM RESOURCES

    Publication No.: WO2021062197A1

    Publication Date: 2021-04-01

    Application No.: PCT/US2020/052786

    Filing Date: 2020-09-25

    Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.
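The escalation loop above can be modeled with a simple rate controller. The halving/doubling factors, the rate floor, and the class name are assumptions for illustration; the abstract specifies only "more restrictively" and "less restrictively" throttling around a miss threshold.

```python
class ThreadThrottler:
    """Tighten a thread's issue rate while its shared-cache misses stay
    above a threshold; relax it once they fall back below."""

    def __init__(self, miss_threshold, rate=1.0):
        self.miss_threshold = miss_threshold
        self.rate = rate    # fraction of cycles the thread may issue

    def evaluate_period(self, misses):
        if misses > self.miss_threshold:
            self.rate = max(0.125, self.rate / 2)   # throttle harder
        else:
            self.rate = min(1.0, self.rate * 2)     # back off the throttle
        return self.rate

t = ThreadThrottler(miss_threshold=100)
print(t.evaluate_period(250))  # 0.5   still over threshold next period:
print(t.evaluate_period(180))  # 0.25  tightened again
print(t.evaluate_period(40))   # 0.5   contention eased, relax
```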

6. PROBE INTERRUPT DELIVERY
    Invention application

    Publication No.: WO2020040874A1

    Publication Date: 2020-02-27

    Application No.: PCT/US2019/039288

    Filing Date: 2019-06-26

    Abstract: Systems, apparatuses, and methods for routing interrupts on a coherency probe network are disclosed. A computing system includes a plurality of processing nodes, a coherency probe network, and one or more control units. The coherency probe network carries coherency probe messages between coherent agents. Interrupts that are detected by a control unit are converted into messages that are compatible with coherency probe messages and then routed to a target destination via the coherency probe network. Interrupts are generated with a first encoding while coherency probe messages have a second encoding. Cache subsystems determine whether a message received via the coherency probe network is an interrupt message or a coherency probe message based on an encoding embedded in the received message. Interrupt messages are routed to interrupt controller(s) while coherency probe messages are processed in accordance with a coherence probe action field embedded in the message.
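The encoding-based routing at the cache subsystem can be sketched as a dispatch on a message field. The one-bit encodings and dictionary message format below are invented for the example; the patent only requires that interrupts and coherency probes carry distinguishable encodings on the same network.

```python
ENC_PROBE, ENC_INTERRUPT = 0b0, 0b1   # hypothetical 1-bit message encodings

def dispatch(msg, interrupt_queue, probe_actions):
    """Route a coherency-network message by its embedded encoding:
    interrupts go to the interrupt controller, probes are processed per
    their coherence probe action field."""
    if msg["enc"] == ENC_INTERRUPT:
        interrupt_queue.append(msg["payload"])
    else:
        probe_actions.append(msg["probe_action"])

irqs, probes = [], []
dispatch({"enc": ENC_INTERRUPT, "payload": "timer"}, irqs, probes)
dispatch({"enc": ENC_PROBE, "probe_action": "invalidate"}, irqs, probes)
print(irqs, probes)  # ['timer'] ['invalidate']
```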

7. SCHEDULING INDEPENDENT AND DEPENDENT OPERATIONS FOR PROCESSING
    Invention application · Pending (published)

    Publication No.: WO2018017457A1

    Publication Date: 2018-01-25

    Application No.: PCT/US2017/042333

    Filing Date: 2017-07-17

    CPC classification number: G06F12/0877 G06F9/3836 G06F9/3855 G06F2212/60

    Abstract: A processor (100) includes an operations scheduler (105) to schedule execution of operations at, for example, a set of execution units (110) or a cache of the processor. The operations scheduler periodically adds sets of operations to a tracking array (120), and further identifies when an operation in the tracked set is blocked from execution scheduling in response to, for example, identifying that the operation is dependent on another operation that has not completed execution. The processor further includes a counter (130) that is adjusted each time an operation in the tracking array is blocked from execution, and is reset each time an operation in the tracking array is executed. When the value of the counter exceeds a threshold (135), the operations scheduler prioritizes the remaining tracked operations for execution scheduling.

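The tracking array and counter interplay can be sketched as below, assuming (as an illustration, not from the patent) that the counter resets on any execution and that "prioritize" is exposed as a boolean the scheduler consults.

```python
class TrackedScheduler:
    """A counter rises each time a tracked operation is blocked and resets
    when one executes; past a threshold, remaining tracked operations are
    prioritized for execution scheduling."""

    def __init__(self, threshold):
        self.tracked = set()
        self.counter = 0
        self.threshold = threshold

    def track(self, ops):
        self.tracked.update(ops)     # periodically add a set of operations

    def on_blocked(self, op):
        if op in self.tracked:       # e.g. an unresolved dependency
            self.counter += 1

    def on_executed(self, op):
        self.tracked.discard(op)
        self.counter = 0             # progress: reset the stall counter

    def prioritize_tracked(self):
        return self.counter > self.threshold

s = TrackedScheduler(threshold=2)
s.track({"a", "b", "c"})
for _ in range(3):
    s.on_blocked("a")
print(s.prioritize_tracked())  # True: tracked set has stalled repeatedly
```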
8. SYSTEM AND METHOD FOR STORING CACHE LOCATION INFORMATION FOR CACHE ENTRY TRANSFER
    Invention application · Pending (published)

    Publication No.: WO2018013824A1

    Publication Date: 2018-01-18

    Application No.: PCT/US2017/041956

    Filing Date: 2017-07-13

    Abstract: A cache [120] stores, along with data [170] that is being transferred from a higher level cache [140] to a lower level cache, information [171] indicating the higher level cache location from which the data was transferred. Upon receiving a request for data that is stored at the location in the higher level cache, a cache controller [130] stores the higher level cache location information in a status tag of the data. The cache controller then transfers the data with the status tag indicating the higher level cache location to a lower level cache. When the data is subsequently updated or evicted from the lower level cache, the cache controller reads the status tag location information and transfers the data back to the location in the higher level cache from which it was originally transferred.

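A round-trip sketch of the status-tag mechanism, modeling the higher-level cache as a set/way grid and the lower-level cache as a dictionary. The function names and the (set, way) origin encoding are assumptions for the example.

```python
def transfer_down(higher, lower, set_idx, way, key):
    """Move a line down the hierarchy, recording its higher-level
    (set, way) origin in a status tag so a later writeback can return
    it to the same location."""
    data = higher[set_idx][way]
    higher[set_idx][way] = None
    lower[key] = {"data": data, "origin": (set_idx, way)}

def write_back(higher, lower, key):
    """On update/eviction, read the status-tag location info and put the
    data back where it originally came from."""
    entry = lower.pop(key)
    s, w = entry["origin"]
    higher[s][w] = entry["data"]

higher = [[None, "X"], [None, None]]
lower = {}
transfer_down(higher, lower, 0, 1, key=0x40)
write_back(higher, lower, key=0x40)
print(higher)  # [[None, 'X'], [None, None]] -- back where it started
```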
9. OLDEST OPERATION WAIT TIME INDICATION INPUT INTO SET-DUELING

    Publication No.: WO2021046217A1

    Publication Date: 2021-03-11

    Application No.: PCT/US2020/049197

    Filing Date: 2020-09-03

    Abstract: Systems, apparatuses, and methods for dynamically adjusting cache policies to reduce execution core wait time are disclosed. A processor includes a cache subsystem. The cache subsystem includes one or more cache levels and one or more cache controllers. A cache controller partitions a cache level into two test portions and a remainder portion. The cache controller applies a first policy to the first test portion and applies a second policy to the second test portion. The cache controller determines the amount of time the execution core spends waiting on accesses to the first and second test portions. If the measured wait time is less for the first test portion than for the second test portion, then the cache controller applies the first policy to the remainder portion. Otherwise, the cache controller applies the second policy to the remainder portion.
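The wait-time comparison can be sketched as a pair of accumulators, one per test portion, with the remainder following the lower total. The class name, cycle units, and tie-breaking toward portion A are illustrative assumptions.

```python
class WaitTimeDueling:
    """Accumulate execution-core wait time per test portion; the
    remainder of the cache gets the policy whose portion made the
    core wait less."""

    def __init__(self):
        self.wait = {"A": 0, "B": 0}   # cycles the core spent waiting

    def record_wait(self, portion, cycles):
        self.wait[portion] += cycles

    def remainder_policy(self):
        return "A" if self.wait["A"] <= self.wait["B"] else "B"

d = WaitTimeDueling()
d.record_wait("A", 120)
d.record_wait("B", 300)
print(d.remainder_policy())  # A: its test portion stalled the core less
```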

10. SYSTEM AND METHOD FOR IDENTIFYING PENDENCY OF A MEMORY ACCESS REQUEST AT A CACHE ENTRY
    Invention application · Pending (published)

    Publication No.: WO2018013813A1

    Publication Date: 2018-01-18

    Application No.: PCT/US2017/041935

    Filing Date: 2017-07-13

    Abstract: A processing system [100] indicates the pendency of a memory access request [102] for data at the cache entry that is assigned to store the data in response to the memory access request. While executing instructions, the processor issues requests for data to the cache [140] most proximal to the processor. In response to a cache miss, the cache controller identifies an entry [245] of the cache to store the data in response to the memory access request, and stores an indication [147] that the memory access request is pending at the identified cache entry. If the cache controller receives a subsequent memory access request for the data while the memory access request is pending at the higher level of the memory hierarchy, the cache controller identifies that the memory access request is pending based on the indicator stored at the entry.

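The pending indicator can be modeled as a per-entry flag set on a miss and cleared on fill, so a subsequent request for the same data is recognized while the original fill is outstanding. The return strings and class layout are assumptions for this sketch.

```python
class Cache:
    """Each entry carries a pending flag: set when the entry is assigned
    to an outstanding miss, cleared when the fill arrives."""

    def __init__(self, n_entries):
        self.data = [None] * n_entries
        self.pending = [False] * n_entries

    def request(self, idx):
        if self.data[idx] is not None:
            return "hit"
        if self.pending[idx]:
            return "already-pending"   # merge with the outstanding request
        self.pending[idx] = True       # mark the assigned entry
        return "miss"

    def fill(self, idx, value):
        self.data[idx] = value
        self.pending[idx] = False

c = Cache(4)
print(c.request(2))   # miss
print(c.request(2))   # already-pending
c.fill(2, "payload")
print(c.request(2))   # hit
```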