HOME AGENT BASED CACHE TRANSFER ACCELERATION SCHEME

    Publication Number: US20210064545A1

    Publication Date: 2021-03-04

    Application Number: US17019999

    Application Date: 2020-09-14

    Abstract: Systems, apparatuses, and methods for implementing a speculative probe mechanism are disclosed. A system includes at least multiple processing nodes, a probe filter, and a coherent slave. The coherent slave includes an early probe cache that caches recent lookups to the probe filter. The early probe cache includes entries for regions of memory, wherein a region includes a plurality of cache lines. Responsive to receiving a memory request, the coherent slave performs parallel lookups to the probe filter and the early probe cache. An early probe is sent to a first processing node when the early probe cache lookup hits on an entry that identifies the first processing node as the owner of the region targeted by the request and the entry's confidence indicator exceeds a threshold.
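
    The mechanism lends itself to a compact model. Below is a minimal C++ sketch, assuming an illustrative region size, a saturating confidence counter, and a training path driven by the probe filter's authoritative result; none of these parameters or names come from the abstract itself:

```cpp
#include <cstdint>
#include <unordered_map>

// Assumed parameters: the abstract specifies neither the region size nor the
// confidence threshold, so these values are illustrative only.
constexpr uint64_t kRegionBits = 12;  // 4 KB regions, each spanning many lines
constexpr int      kThreshold  = 2;   // early probe sent when confidence > this

struct EarlyProbeEntry {
    int owner_node = -1;  // node last observed owning lines in this region
    int confidence = 0;   // saturating counter, raised by confirmed predictions
};

class EarlyProbeCache {
public:
    // Looked up in parallel with the probe filter; returns the predicted
    // owner node, or -1 if no early probe should be sent.
    int lookup(uint64_t addr) const {
        auto it = entries_.find(addr >> kRegionBits);
        if (it != entries_.end() && it->second.confidence > kThreshold)
            return it->second.owner_node;
        return -1;
    }

    // Trained with the probe filter's authoritative result for the region.
    void update(uint64_t addr, int actual_owner) {
        EarlyProbeEntry &e = entries_[addr >> kRegionBits];
        if (e.owner_node == actual_owner) {
            if (e.confidence < 3) ++e.confidence;  // saturate
        } else {
            e.owner_node = actual_owner;  // new owner observed
            e.confidence = 0;             // reset confidence on a change
        }
    }

private:
    std::unordered_map<uint64_t, EarlyProbeEntry> entries_;
};
```

    On each memory request the coherent slave would call lookup() alongside the probe filter access and, on a confident hit, launch the early probe to the predicted owner before the filter result returns.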

    Cache control aware memory controller

    Publication Number: US10572389B2

    Publication Date: 2020-02-25

    Application Number: US15839700

    Application Date: 2017-12-12

    Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. External system memory is used as a last-level cache and includes one of a variety of types of dynamic random access memory (DRAM). A memory controller generates a tag request and a separate data request from a single received memory request, and prioritizes sending the tag request over the data request. A partial tag comparison is performed during processing of the tag request. If the partial tag comparison detects a miss, the data request is cancelled and the memory request is sent to main memory. If the partial tag comparison detects one or more hits, processing of the data request depends on the result of the full tag comparison.
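
    The tag-path decision can be sketched compactly. The following C++ fragment models the partial tag comparison for one set of an assumed set-associative DRAM cache; the field widths and the number of partial-tag bits are illustrative, not from the patent:

```cpp
#include <cstdint>
#include <vector>

// Assumed geometry for illustration only.
constexpr unsigned kLineBits    = 6;   // 64-byte lines
constexpr unsigned kSetBits     = 14;  // 16K sets
constexpr unsigned kPartialBits = 8;   // low-order tag bits compared early

struct Way {
    uint64_t tag;
    bool     valid;
};

enum class TagOutcome {
    MissCancelData,            // cancel the data request, go to main memory
    CandidateAwaitFullCompare  // keep the data request pending the full compare
};

// The controller sends the tag request ahead of the data request; this early
// compare over a subset of tag bits decides the data request's fate.
TagOutcome partial_tag_compare(const std::vector<Way> &set, uint64_t addr) {
    const uint64_t tag  = addr >> (kSetBits + kLineBits);
    const uint64_t mask = (uint64_t{1} << kPartialBits) - 1;
    for (const Way &w : set)
        if (w.valid && (w.tag & mask) == (tag & mask))
            return TagOutcome::CandidateAwaitFullCompare;
    return TagOutcome::MissCancelData;  // no way can match: safe early cancel
}
```

    A partial miss is definitive (no way can possibly hold the line), which is what makes the early cancellation safe; a partial hit is only a candidate and still requires the full comparison.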

    COHERENT BLOCK READ FULFILLMENT
    Invention Publication

    Publication Number: US20240202144A1

    Publication Date: 2024-06-20

    Application Number: US18410554

    Application Date: 2024-01-11

    Abstract: A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. Each of the coherent master controllers includes a response data buffer. The coherent slave controller is coupled to the coherent master controllers. Responsive to determining that a selected coherent block read command is guaranteed to have only one data response, the coherent slave controller sends a target request globally ordered message to the coherent master controller that issued the command and transmits the responsive data. That coherent master controller, responsive to receiving the target request globally ordered message, blocks any coherent probes to the address associated with the command until receipt of the responsive data is acknowledged by the requesting client.
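
    The master-side blocking window is easy to model. A minimal C++ sketch follows, assuming per-block-address tracking in the response data buffer and hypothetical handler names (the abstract names the message but not an API):

```cpp
#include <cstdint>
#include <unordered_set>

// Sketch of the blocking window on a coherent master controller. The handler
// names and block-address granularity are assumptions for illustration.
class CoherentMaster {
public:
    // The coherent slave determined the block read has exactly one data
    // response and sent the target request globally ordered message.
    void on_target_go(uint64_t block_addr) { blocked_.insert(block_addr); }

    // The requesting client acknowledged receipt of the responsive data.
    void on_client_ack(uint64_t block_addr) { blocked_.erase(block_addr); }

    // Coherent probes to a blocked address must stall (return false) until
    // the acknowledgment arrives, preserving the global ordering guarantee.
    bool probe_may_proceed(uint64_t block_addr) const {
        return blocked_.count(block_addr) == 0;
    }

private:
    std::unordered_set<uint64_t> blocked_;  // addresses awaiting client ack
};
```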

    Scalable region-based directory
    Invention Grant

    Publication Number: US11507517B2

    Publication Date: 2022-11-22

    Application Number: US17033212

    Application Date: 2020-09-25

    Abstract: Disclosed is a cache subsystem including one or more cache directories configurable to interchange, within each cache directory entry, at least one bit between a first field and a second field, thereby changing the size of the region of memory represented and the number of cache lines tracked in the cache subsystem.
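
    The bit interchange amounts to a size/precision trade, which a short worked example makes concrete. The C++ example below assumes baseline field widths (the abstract fixes neither) and one plausible reading of the mechanism: each bit taken from the region-tag field doubles the region a single entry represents, and the freed bit widens the field that tracks cached lines:

```cpp
#include <cstdio>

// Assumed baseline widths for illustration only.
constexpr unsigned kBaseRegionBytes = 2048;  // region size with no bits moved
constexpr unsigned kBaseTrackBits   = 8;     // line-tracking field width

// Moving k bits from the region-tag field into the tracking field.
void describe(unsigned k) {
    unsigned region = kBaseRegionBytes << k;  // coarser tag: 2x region per bit
    unsigned track  = kBaseTrackBits + k;     // tracking field gains each bit
    std::printf("k=%u: %u-byte region, %u-bit tracking field\n",
                k, region, track);
}

int main() {
    for (unsigned k = 0; k <= 2; ++k)
        describe(k);
    // k=0: 2048-byte region, 8-bit tracking field
    // k=1: 4096-byte region, 9-bit tracking field
    // k=2: 8192-byte region, 10-bit tracking field
}
```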

    Region based split-directory scheme to adapt to large cache sizes

    Publication Number: US10705959B2

    Publication Date: 2020-07-07

    Application Number: US16119438

    Application Date: 2018-08-31

    Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.
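
    The node-based half of the scheme can be captured in a few lines. Here is a minimal C++ sketch, assuming an illustrative region size and that every eviction was preceded by a matching fill; the names are hypothetical:

```cpp
#include <cstdint>
#include <unordered_map>

// Assumed region size: 2 KB, i.e. 32 64-byte lines per region.
constexpr uint64_t kRegionBits = 11;

class NodeDirectory {
public:
    // A line from the region was filled into some cache subsystem in the node.
    void on_line_fill(uint64_t addr) { ++refs_[addr >> kRegionBits]; }

    // A line was evicted (assumes a matching prior fill). When the reference
    // count reaches zero the entry is freed; in the full scheme this is the
    // point at which the memory-based directory would be notified.
    void on_line_evict(uint64_t addr) {
        const uint64_t region = addr >> kRegionBits;
        if (--refs_[region] == 0)
            refs_.erase(region);
    }

    // True while at least one line of the region is cached in this node.
    bool region_cached(uint64_t addr) const {
        return refs_.count(addr >> kRegionBits) != 0;
    }

private:
    std::unordered_map<uint64_t, uint32_t> refs_;  // region -> cached-line count
};
```

    Tracking one reference-counted entry per region rather than one entry per line is what shrinks the directory; the memory-based directory only needs entries for regions present in some node-based directory.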

    TAG ACCELERATOR FOR LOW LATENCY DRAM CACHE
    Invention Application

    Publication Number: US20190196974A1

    Publication Date: 2019-06-27

    Application Number: US15855838

    Application Date: 2017-12-27

    Abstract: Systems, apparatuses, and methods for implementing a tag accelerator cache are disclosed. A system includes at least a data cache and a control unit coupled to the data cache via a memory controller. The control unit includes a tag accelerator cache (TAC) for caching tag blocks fetched from the data cache. The data cache is organized such that multiple tags are retrieved in a single access. This hides the tag latency penalty for future accesses to neighboring tags and improves cache bandwidth. When a tag block is fetched from the data cache, it is cached in the TAC. Memory requests received by the control unit first look up the TAC before being forwarded to the data cache. Because applications exhibit spatial locality, the TAC can filter out a large percentage of tag accesses to the data cache, yielding latency and bandwidth savings.
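
    The filtering behavior can be shown in a small model. Below is a C++ sketch of the TAC lookup path, with an assumed tag-block geometry (how many sets share one tag block is not stated in the abstract) and hypothetical helper names:

```cpp
#include <cstdint>
#include <unordered_map>

// Assumed geometry: 64-byte lines, and the tags of 4 adjacent sets packed
// into one tag block so a single DRAM access retrieves all of them.
constexpr uint64_t kLineBits     = 6;
constexpr uint64_t kSetsPerBlock = 4;

struct TagBlock { /* tags for kSetsPerBlock adjacent sets */ };

class TagAcceleratorCache {
public:
    // Every memory request consults the TAC first; only a TAC miss costs a
    // DRAM tag access, and the fetched block then serves neighboring sets.
    const TagBlock *lookup(uint64_t addr) {
        const uint64_t key = (addr >> kLineBits) / kSetsPerBlock;
        auto it = blocks_.find(key);
        if (it != blocks_.end())
            return &it->second;                         // tag access filtered
        return &(blocks_[key] = fetch_from_dram(key));  // one DRAM access
    }

private:
    static TagBlock fetch_from_dram(uint64_t /*key*/) { return TagBlock{}; }
    std::unordered_map<uint64_t, TagBlock> blocks_;
};
```

    With spatial locality, consecutive requests map to the same tag block, so most tag lookups hit in the small on-chip structure instead of the DRAM cache.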
