Apparatus and method for multi-level cache request tracking

    Publication No.: US10310978B2

    Publication date: 2019-06-04

    Application No.: US15721499

    Filing date: 2017-09-29

    Abstract: An apparatus and method for multi-level cache request tracking. For example, one embodiment of a processor comprises: one or more cores to execute instructions and process data; a memory subsystem comprising a system memory and a multi-level cache hierarchy; a primary tracker to store a first entry associated with a memory request to transfer a cache line from the system memory or a first cache within the cache hierarchy to a second cache; primary tracker allocation circuitry to allocate and deallocate entries within the primary tracker; a secondary tracker to store a second entry associated with the memory request; secondary tracker allocation circuitry to allocate and deallocate entries within the secondary tracker; the primary tracker allocation circuitry to deallocate the first entry in response to a first indication that one or more cache coherence requirements associated with the cache line have been resolved, the secondary tracker allocation circuitry to deallocate the second entry in response to a second indication related to transmission of the cache line to the second cache.
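The two-tracker arrangement in this abstract can be illustrated with a minimal sketch. The class names, the dictionary-based tables, and the rule that a request needs an entry in both trackers before it issues are assumptions made for illustration; the abstract only states that each tracker holds an entry for the request and frees it on a different indication.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """Fixed-capacity table of in-flight request entries (hypothetical model)."""
    capacity: int
    entries: dict = field(default_factory=dict)

    def allocate(self, req_id: int, line_addr: int) -> bool:
        # Refuse allocation when the tracker is full; the caller must retry.
        if len(self.entries) >= self.capacity:
            return False
        self.entries[req_id] = line_addr
        return True

    def deallocate(self, req_id: int) -> None:
        self.entries.pop(req_id, None)


class RequestTracking:
    """Pairs a primary and a secondary tracker for each memory request."""

    def __init__(self, primary_slots: int, secondary_slots: int):
        self.primary = Tracker(primary_slots)
        self.secondary = Tracker(secondary_slots)

    def issue(self, req_id: int, line_addr: int) -> bool:
        # Assumption: a request needs an entry in both trackers to issue.
        if not self.primary.allocate(req_id, line_addr):
            return False
        if not self.secondary.allocate(req_id, line_addr):
            self.primary.deallocate(req_id)
            return False
        return True

    def on_coherence_resolved(self, req_id: int) -> None:
        # First indication: coherence requirements for the line are resolved.
        self.primary.deallocate(req_id)

    def on_line_transmitted(self, req_id: int) -> None:
        # Second indication: the line has been sent to the second cache.
        self.secondary.deallocate(req_id)
```

In this sketch the primary entry is released as soon as coherence is resolved, so a new request can claim it while the earlier request's secondary entry is still waiting for the data transfer to finish.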

    ADDRESS RANGE PRIORITY MECHANISM

    Type: Invention application

    Status: Under examination (published)

    Publication No.: US20170010974A1

    Publication date: 2017-01-12

    Application No.: US15275630

    Filing date: 2016-09-26

    Abstract: Method and apparatus to efficiently manage data in caches. Data in caches may be managed based on priorities assigned to the data. Data may be requested by a process using a virtual address of the data. The requested data may be assigned a priority by a component in a computer system called an address range priority assigner (ARP). The ARP may assign a particular priority to the requested data if the virtual address of the requested data is within a particular range of virtual addresses. The particular priority assigned may be high priority and the particular range of virtual addresses may be smaller than a cache's capacity.
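A small sketch of range-based priority assignment follows. The names `AddressRangePriorityAssigner`, `add_range`, `assign`, and `choose_victim` are hypothetical, and the eviction rule (lowest priority first, then least recently used) is an assumed way of "managing data based on priorities"; the abstract does not specify a replacement policy.

```python
HIGH, NORMAL = 1, 0  # illustrative priority levels


class AddressRangePriorityAssigner:
    """Maps virtual addresses to priorities by range membership."""

    def __init__(self):
        self.ranges = []  # list of (start, end_exclusive, priority)

    def add_range(self, start: int, length: int, priority: int) -> None:
        self.ranges.append((start, start + length, priority))

    def assign(self, vaddr: int) -> int:
        # Return the priority of the first range containing vaddr.
        for start, end, priority in self.ranges:
            if start <= vaddr < end:
                return priority
        return NORMAL


def choose_victim(lines):
    """Pick a line to evict from (vaddr, priority, last_use) tuples.

    Assumed policy: evict the lowest priority first, breaking ties by
    the oldest last_use timestamp.
    """
    return min(lines, key=lambda line: (line[1], line[2]))
```

Under this policy a line in the high-priority range survives eviction as long as any normal-priority line is resident, which is one reason the abstract notes that the range may be smaller than the cache's capacity.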

    Programmable address range engine for larger region sizes

    Publication No.: US11989135B2

    Publication date: 2024-05-21

    Application No.: US16786815

    Filing date: 2020-02-10

    CPC classification number: G06F12/1027 G06F2212/657

    Abstract: Examples described herein relate to a computing system supporting custom page sized ranges for an application to map contiguous memory regions instead of many smaller sized pages. An application can request a custom range size. An operating system can allocate a contiguous physical memory region to a virtual address range by specifying custom range sizes that are larger or smaller than the normal general page sizes. Virtual-to-physical address translation can occur using address range circuitry and a translation lookaside buffer in parallel. The address range circuitry can determine if a custom entry is available to use to identify a physical address translation for the virtual address. Physical address translation can be performed by transforming the virtual address in some examples.
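The lookup described here can be sketched in software as follows. This is only a model: hardware would probe the range entries and the TLB in parallel, while the code checks them one after the other. `RangeEntry`, `translate`, the 4 KiB page size, the base-plus-offset transformation, and the choice to let a range hit take precedence are all assumptions.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12  # assumed 4 KiB base page size


@dataclass(frozen=True)
class RangeEntry:
    base_va: int  # start of the virtual range
    size: int     # custom range size in bytes (need not be a page size)
    base_pa: int  # start of the contiguous physical region


def translate(vaddr, range_entries, tlb):
    """Return the physical address for vaddr, or None on a miss.

    tlb maps virtual page numbers to physical frame numbers.
    """
    # Range path: transform the address by rebasing its offset.
    for entry in range_entries:
        if entry.base_va <= vaddr < entry.base_va + entry.size:
            return entry.base_pa + (vaddr - entry.base_va)
    # TLB path: conventional page-granular translation.
    vpn = vaddr >> PAGE_SHIFT
    if vpn in tlb:
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        return (tlb[vpn] << PAGE_SHIFT) | offset
    return None  # both missed; a real system would start a page walk
```

A single 3 MiB `RangeEntry` covers what would otherwise take 768 separate 4 KiB TLB entries, which is the saving the abstract is aiming at.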

    Spatial and temporal merging of remote atomic operations

    Publication No.: US11500636B2

    Publication date: 2022-11-15

    Application No.: US16799619

    Filing date: 2020-02-24

    Abstract: Disclosed embodiments relate to spatial and temporal merging of remote atomic operations. In one example, a system includes an RAO instruction queue stored in a memory and having entries grouped by destination cache line, each entry to enqueue an RAO instruction including an opcode, a destination identifier, and source data, optimization circuitry to receive an incoming RAO instruction, scan the RAO instruction queue to detect a matching enqueued RAO instruction identifying a same destination cache line as the incoming RAO instruction, the optimization circuitry further to, responsive to no matching enqueued RAO instruction being detected, enqueue the incoming RAO instruction; and, responsive to a matching enqueued RAO instruction being detected, determine whether the incoming and matching RAO instructions have a same opcode to non-overlapping cache line elements, and, if so, spatially combine the incoming and matching RAO instructions by enqueuing both RAO instructions in a same group of cache line queue entries at different offsets.
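The spatial-combining path in this abstract can be sketched as a queue grouped by destination cache line. The 64-byte line, the 8-byte element size, the `RaoQueue` class, and the returned status strings are assumptions. The abstract excerpt does not describe what happens when opcodes differ or elements overlap, so the sketch simply parks those instructions in a separate list.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
ELEM_SIZE = 8   # bytes per cache line element (assumed)


class RaoQueue:
    """RAO instruction queue with entries grouped by destination line."""

    def __init__(self):
        self.groups = {}    # line address -> {element offset: (opcode, data)}
        self.unmerged = []  # cases the sketch does not try to combine

    def enqueue(self, opcode, dest_addr, data):
        line = dest_addr - dest_addr % LINE_SIZE
        offset = (dest_addr % LINE_SIZE) // ELEM_SIZE * ELEM_SIZE
        group = self.groups.get(line)
        if group is None:
            # No enqueued instruction targets this line yet.
            self.groups[line] = {offset: (opcode, data)}
            return "enqueued"
        same_opcode = all(op == opcode for op, _ in group.values())
        if same_opcode and offset not in group:
            # Same opcode, non-overlapping element: share the line group.
            group[offset] = (opcode, data)
            return "spatially_combined"
        self.unmerged.append((opcode, dest_addr, data))
        return "not_combined"
```

Grouping by line means that one access to the destination line can later serve every element update in the group.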

    ADAPTIVE REMOTE ATOMICS

    Type: Invention application

    Publication No.: US20220206945A1

    Publication date: 2022-06-30

    Application No.: US17134254

    Filing date: 2020-12-25

    Abstract: Disclosed embodiments relate to atomic memory operations. In one example, an apparatus includes multiple processor cores, a cache hierarchy, a local execution unit, a remote execution unit, and an adaptive remote atomic operation unit. The cache hierarchy includes a local cache at a first level and a shared cache at a second level. The local execution unit is to perform an atomic operation at the first level if the local cache is storing a cache line including data for the atomic operation. The remote execution unit is to perform the atomic operation at the second level. The adaptive remote atomic operation unit is to determine whether to perform the first atomic operation at the first level or at the second level and whether to copy the cache line from the shared cache to the local cache.
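The abstract names the two decisions the adaptive unit makes but not how it makes them. The sketch below fills that gap with a purely hypothetical heuristic, a reuse counter compared against a threshold, just to show the shape of the decision. `choose_level`, `recent_local_uses`, and `reuse_threshold` are invented names, not terms from the patent.

```python
def choose_level(line_in_local_cache: bool,
                 recent_local_uses: int,
                 reuse_threshold: int = 2):
    """Return (level, copy_to_local) for one atomic operation.

    Hypothetical policy: execute locally when the line is already in the
    local cache, pull the line in when this core has touched it often,
    and otherwise execute at the shared cache.
    """
    if line_in_local_cache:
        return ("local", False)
    if recent_local_uses >= reuse_threshold:
        # Copy the line from the shared cache, then execute locally.
        return ("local", True)
    return ("remote", False)
```

For example, `choose_level(False, 0)` picks remote execution and leaves the line in the shared cache, which avoids bouncing a contended line between cores.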

    Remote atomic operations in multi-socket systems

    Publication No.: US11138112B2

    Publication date: 2021-10-05

    Application No.: US16382092

    Filing date: 2019-04-11

    Abstract: Disclosed embodiments relate to remote atomic operations (RAO) in multi-socket systems. In one example, a method, performed by a cache control circuit of a requester socket, includes: receiving the RAO instruction from the requester CPU core, determining a home agent in a home socket for the addressed cache line, providing a request for ownership (RFO) of the addressed cache line to the home agent, waiting for the home agent to either invalidate and retrieve a latest copy of the addressed cache line from a cache, or to fetch the addressed cache line from memory, receiving an acknowledgement and the addressed cache line, executing the RAO instruction on the received cache line atomically, subsequently receiving multiple local RAO instructions to the addressed cache line from one or more requester CPU cores, and executing the multiple local RAO instructions on the received cache line independently of the home agent.
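The requester-side flow can be sketched with a stub home agent. To keep the example short, a cache line is modeled as a single integer, the home agent is chosen by a simple modulo interleave, and the invalidate-or-fetch step is collapsed into one dictionary lookup. All of these, along with the class and method names, are assumptions.

```python
class HomeAgent:
    """Stub home agent that owns a slice of memory."""

    def __init__(self, memory):
        self.memory = memory  # line address -> line value
        self.rfo_count = 0

    def request_for_ownership(self, line_addr):
        # Stands in for "invalidate other copies or fetch from memory",
        # then acknowledge and return the latest line contents.
        self.rfo_count += 1
        return self.memory.get(line_addr, 0)


class RequesterCacheControl:
    """Cache control circuit of the requester socket (simplified)."""

    def __init__(self, home_agents):
        self.home_agents = home_agents
        self.owned = {}  # line address -> locally owned line value

    def home_for(self, line_addr):
        # Assumed address interleaving across home agents.
        return self.home_agents[line_addr % len(self.home_agents)]

    def execute_rao(self, line_addr, op):
        if line_addr not in self.owned:
            # First RAO to this line: obtain ownership from the home agent.
            agent = self.home_for(line_addr)
            self.owned[line_addr] = agent.request_for_ownership(line_addr)
        # Later RAOs run on the owned line without the home agent.
        self.owned[line_addr] = op(self.owned[line_addr])
        return self.owned[line_addr]
```

Only the first operation on a line pays the cross-socket round trip; the `rfo_count` field makes that visible when experimenting with the sketch.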
