-
Publication number: US20210064545A1
Publication date: 2021-03-04
Application number: US17019999
Application date: 2020-09-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Amit P. Apte, Ganesh Balakrishnan, Vydhyanathan Kalyanasundharam, Kevin M. Lepak
IPC: G06F12/128, G06F12/0817, G06F12/0831, G06F12/0891
Abstract: Systems, apparatuses, and methods for implementing a speculative probe mechanism are disclosed. A system includes at least multiple processing nodes, a probe filter, and a coherent slave. The coherent slave includes an early probe cache to cache recent lookups to the probe filter. The early probe cache includes entries for regions of memory, wherein a region includes a plurality of cache lines. The coherent slave performs parallel lookups to the probe filter and the early probe cache responsive to receiving a memory request. An early probe is sent to a first processing node responsive to determining that a lookup to the early probe cache hits on a first entry identifying the first processing node as an owner of a first region targeted by the memory request and responsive to determining that a confidence indicator of the first entry is greater than a threshold.
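Illustrative sketch (not part of the patent text): the early-probe decision described in the abstract above can be modeled as a region-indexed table whose entries carry an owner and a confidence indicator. The region size, the threshold value, the names `EarlyProbeCache`, `record` and `early_probe_target`, and the rule for raising or resetting confidence are all assumptions made for illustration; the abstract only states that an early probe is sent when the entry hits and its confidence exceeds a threshold.

```python
from dataclasses import dataclass

REGION_BITS = 12     # assumption: 4 KiB regions
CONF_THRESHOLD = 2   # assumption: hypothetical threshold value


@dataclass
class EarlyProbeEntry:
    owner: int       # processing node believed to own the region
    confidence: int  # confidence indicator for that belief


class EarlyProbeCache:
    """Toy model of a cache of recent probe filter lookups, keyed by region."""

    def __init__(self):
        self.entries = {}  # region number -> EarlyProbeEntry

    def record(self, addr, owner):
        """Record a probe filter result (assumed confidence policy)."""
        region = addr >> REGION_BITS
        entry = self.entries.get(region)
        if entry is not None and entry.owner == owner:
            entry.confidence += 1  # same owner seen again
        else:
            self.entries[region] = EarlyProbeEntry(owner, 0)  # new or changed owner

    def early_probe_target(self, addr):
        """Return the node to probe early, or None if no early probe is sent."""
        entry = self.entries.get(addr >> REGION_BITS)
        if entry is not None and entry.confidence > CONF_THRESHOLD:
            return entry.owner
        return None
```

In this model the probe filter lookup still proceeds in parallel; the early probe cache only decides whether a speculative probe goes out before the probe filter answers.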
-
Publication number: US10572389B2
Publication date: 2020-02-25
Application number: US15839700
Application date: 2017-12-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Ravindra N. Bhargava, Ganesh Balakrishnan
IPC: G06F12/0895, G06F12/0897
Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. External system memory is used as a last-level cache and includes one of a variety of types of dynamic random access memory (DRAM). A memory controller generates a tag request and a separate data request based on a same, single received memory request. The sending of the tag request is prioritized over sending the data request. A partial tag comparison is performed during processing of the tag request. If a tag miss is detected for the partial tag comparison, then the data request is cancelled, and the memory request is sent to main memory. If one or more tag hits are detected for the partial tag comparison, then processing of the data request is dependent upon the result of the full tag comparison.
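Illustrative sketch (not part of the patent text): the two-stage tag check described in the abstract above can be modeled as a partial comparison that filters candidate ways, followed by a full comparison on the survivors. The partial-tag width, the function name `process_request`, and the outcome labels are assumptions; the abstract does not say which tag bits form the partial tag, so the low bits are used here arbitrarily.

```python
PARTIAL_BITS = 4  # assumption: width of the partial tag


def process_request(set_tags, req_tag):
    """Classify a request against the full tags stored in one cache set.

    Returns (outcome, way):
      ("partial_miss", None): no partial match, so the data request is
                              cancelled and the request goes to main memory
      ("full_hit", way):      the full comparison confirmed a hit in `way`
      ("full_miss", None):    partial match(es) existed, full comparison failed
    """
    mask = (1 << PARTIAL_BITS) - 1
    candidates = [
        way for way, tag in enumerate(set_tags)
        if tag is not None and (tag & mask) == (req_tag & mask)
    ]
    if not candidates:
        return ("partial_miss", None)
    for way in candidates:
        if set_tags[way] == req_tag:
            return ("full_hit", way)
    return ("full_miss", None)
```

The point of the partial comparison is that a miss can be detected early, so the separately issued data request can be dropped without waiting for the full tags.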
-
Publication number: US20240202144A1
Publication date: 2024-06-20
Application number: US18410554
Application date: 2024-01-11
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam, Amit P. Apte, Eric Christopher Morton, Ganesh Balakrishnan, Ann M. Ling
CPC classification number: G06F13/1673, G06F3/061, G06F3/0656, G06F3/0658, G06F3/0679, G06F2213/0038
Abstract: A coherent memory fabric includes a plurality of coherent master controllers and a coherent slave controller. The plurality of coherent master controllers each include a response data buffer. The coherent slave controller is coupled to the plurality of coherent master controllers. The coherent slave controller, responsive to determining a selected coherent block read command is guaranteed to have only one data response, sends a target request globally ordered message to the selected coherent master controller and transmits responsive data. The selected coherent master controller, responsive to receiving the target request globally ordered message, blocks any coherent probes to an address associated with the selected coherent block read command until receipt of the responsive data is acknowledged by a requesting client.
-
Publication number: US20240111683A1
Publication date: 2024-04-04
Application number: US17958179
Application date: 2022-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Amit Apte, Ganesh Balakrishnan
IPC: G06F12/0895
CPC classification number: G06F12/0895, G06F2212/60
Abstract: A method includes, in a cache directory, storing a set of entries corresponding to one or more memory regions having a first region size when the cache directory is in a first configuration, and based on a workload sparsity metric, reconfiguring the cache directory to a second configuration. In the second configuration, each entry in the set of entries corresponds to a memory region having a second region size.
-
Publication number: US11874774B2
Publication date: 2024-01-16
Application number: US17031834
Application date: 2020-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Ravindra N. Bhargava, Ganesh Balakrishnan, Joe Sargunaraj, Chintan S. Patel, Girish Balaiah Aswathaiya, Vydhyanathan Kalyanasundharam
IPC: G06F12/08, G06F12/0891, G06F9/46, G06F12/0813, G06F12/0831, G06F12/084
CPC classification number: G06F12/0891, G06F9/467, G06F12/084, G06F12/0813, G06F12/0833
Abstract: A method includes, in response to each write request of a plurality of write requests received at a memory-side cache device coupled with a memory device, writing payload data specified by the write request to the memory-side cache device, and when a first bandwidth availability condition is satisfied, performing a cache write-through by writing the payload data to the memory device, and recording an indication that the payload data written to the memory-side cache device matches the payload data written to the memory device.
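Illustrative sketch (not part of the patent text): the write path described in the abstract above can be modeled as a cache that always accepts the write, writes through to memory only when bandwidth is available, and records a per-line flag saying the cache and memory copies match. The class name `MemorySideCache`, the boolean `bandwidth_available` argument standing in for the "first bandwidth availability condition", and the `evict` method showing one possible use of the flag are assumptions.

```python
class MemorySideCache:
    """Toy model of opportunistic write-through in a memory-side cache."""

    def __init__(self):
        self.lines = {}   # addr -> [data, matches_memory flag]
        self.memory = {}  # stand-in for the backing memory device

    def write(self, addr, data, bandwidth_available):
        # The payload is always written to the cache.
        self.lines[addr] = [data, False]
        if bandwidth_available:
            # Write through now and record that both copies match.
            self.memory[addr] = data
            self.lines[addr][1] = True

    def evict(self, addr):
        """Evict a line; return True if a writeback to memory was needed.

        Assumption: a line flagged as matching memory can be dropped
        without a writeback.
        """
        data, matches_memory = self.lines.pop(addr)
        if not matches_memory:
            self.memory[addr] = data
            return True
        return False
```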
-
Publication number: US11809322B2
Publication date: 2023-11-07
Application number: US17472977
Application date: 2021-09-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan, Eric Christopher Morton, Elizabeth M. Cooper, Ravindra N. Bhargava
IPC: G06F12/0817, G06F12/128, G06F12/0811, G06F12/0871, G06F12/0831
CPC classification number: G06F12/0817, G06F12/0811, G06F12/0831, G06F12/0871, G06F12/128, G06F2212/283, G06F2212/604, G06F2212/621
Abstract: Systems, apparatuses, and methods for maintaining a region-based cache directory are disclosed. A system includes multiple processing nodes, with each processing node including a cache subsystem. The system also includes a cache directory to help manage cache coherency among the different cache subsystems of the system. In order to reduce the number of entries in the cache directory, the cache directory tracks coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Accordingly, the system includes a region-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the system. The cache directory includes a reference count in each entry to track the aggregate number of cache lines that are cached per region. If a reference count of a given entry goes to zero, the cache directory reclaims the given entry.
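Illustrative sketch (not part of the patent text): the reference-count bookkeeping described in the abstract above can be modeled with one counter per region, incremented when a line of that region is cached and decremented when one is evicted, with the entry reclaimed at zero. The class name `RegionDirectory`, the method names, and the 4 KiB region size are assumptions.

```python
class RegionDirectory:
    """Toy model of a region-based cache directory with reference counts."""

    def __init__(self, region_bits=12):  # assumption: 4 KiB regions
        self.region_bits = region_bits
        self.ref_count = {}  # region number -> number of cached lines

    def line_cached(self, addr):
        """A cache subsystem installed the line containing `addr`."""
        region = addr >> self.region_bits
        self.ref_count[region] = self.ref_count.get(region, 0) + 1

    def line_evicted(self, addr):
        """A cache subsystem dropped the line containing `addr`."""
        region = addr >> self.region_bits
        if region not in self.ref_count:
            return  # untracked region, nothing to do
        self.ref_count[region] -= 1
        if self.ref_count[region] == 0:
            del self.ref_count[region]  # reclaim the entry
```

Tracking one entry per region instead of one per cache line is what lets the directory stay small; the counter is what tells it when an entry is no longer needed.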
-
Publication number: US11507517B2
Publication date: 2022-11-22
Application number: US17033212
Application date: 2020-09-25
Applicant: Advanced Micro Devices, Inc.
Inventor: Amit Apte, Ganesh Balakrishnan
IPC: G06F12/0895
Abstract: Disclosed is a cache directory including one or more cache directories configurable to interchange within each cache directory entry at least one bit between a first field and a second field to change the size of the region of memory represented and the number of cache lines tracked in the cache subsystem.
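Illustrative sketch (not part of the patent text): one way to read the abstract above is that a directory entry has a fixed combined width shared by two fields, and moving a bit from one field to the other changes the region size. The sketch assumes the first field is a region tag and the second is a line counter; the abstract does not name the fields. The address width, line size, combined entry width, and the function name `entry_layout` are likewise assumptions chosen only so the arithmetic is easy to follow.

```python
ADDR_BITS = 48   # assumption: physical address width
LINE_BITS = 6    # assumption: 64-byte cache lines
ENTRY_BITS = 43  # assumption: combined width of the two fields


def entry_layout(count_bits):
    """Return the field split and region geometry for a given counter width."""
    tag_bits = ENTRY_BITS - count_bits   # bits left for the region tag
    region_bits = ADDR_BITS - tag_bits   # address bits that fall inside a region
    return {
        "tag_bits": tag_bits,
        "count_bits": count_bits,
        "region_bytes": 1 << region_bits,
        "lines_per_region": 1 << (region_bits - LINE_BITS),
        "max_count": (1 << count_bits) - 1,
    }
```

Under these assumptions, `entry_layout(7)` describes 4 KiB regions of 64 lines, and moving one bit from the tag to the counter (`entry_layout(8)`) doubles both the region size and the number of lines the counter can cover.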
-
Publication number: US10705959B2
Publication date: 2020-07-07
Application number: US16119438
Application date: 2018-08-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan
IPC: G06F12/0817
Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.
-
Publication number: US10366008B2
Publication date: 2019-07-30
Application number: US15376275
Application date: 2016-12-12
Applicant: Advanced Micro Devices, Inc.
Inventor: Ganesh Balakrishnan, Vydhyanathan Kalyanasundharam, Kevin M. Lepak
IPC: G06F12/08, G06F12/0853, G06F12/0811, G06F12/084
Abstract: A data processing system includes a processor and a cache controller coupled to the processor, and adapted to be coupled to a memory. The cache controller uses the memory to form a pseudo direct mapped cache having a plurality of groups of pages. The memory forms a first number of selected pages, including a first page for storing a plurality of sets of tags and a plurality of remaining pages for storing data. Each tag, of the plurality of sets of tags, stores tags for respective entries in a corresponding one of the plurality of remaining pages.
-
Publication number: US20190196974A1
Publication date: 2019-06-27
Application number: US15855838
Application date: 2017-12-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Ganesh Balakrishnan, Ravindra N. Bhargava
IPC: G06F12/0897, G06F12/121
Abstract: Systems, apparatuses, and methods for implementing a tag accelerator cache are disclosed. A system includes at least a data cache and a control unit coupled to the data cache via a memory controller. The control unit includes a tag accelerator cache (TAC) for caching tag blocks fetched from the data cache. The data cache is organized such that multiple tags are retrieved in a single access. This allows hiding the tag latency penalty for future accesses to neighboring tags and improves cache bandwidth. When a tag block is fetched from the data cache, the tag block is cached in the TAC. Memory requests received by the control unit first look up the TAC before being forwarded to the data cache. Due to the presence of spatial locality in applications, the TAC can filter out a large percentage of tag accesses to the data cache, resulting in latency and bandwidth savings.
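Illustrative sketch (not part of the patent text): the filtering effect described in the abstract above can be modeled as a small cache of tag blocks consulted before the data cache. The class name `TagAcceleratorCache`, the number of tags per block, the capacity, the `fetch_tag_block` callback standing in for a tag read from the data cache, and the LRU replacement policy are all assumptions; the abstract does not specify a replacement policy.

```python
from collections import OrderedDict

TAGS_PER_BLOCK = 8  # assumption: tags retrieved by one data cache access


class TagAcceleratorCache:
    """Toy model of a TAC that caches tag blocks fetched from the data cache."""

    def __init__(self, capacity=4):  # assumption: capacity in tag blocks
        self.capacity = capacity
        self.blocks = OrderedDict()  # block index -> tag block, in LRU order

    def lookup(self, set_index, fetch_tag_block):
        """Return (tag_block, hit) for the block covering `set_index`.

        `fetch_tag_block(block_index)` is called only on a TAC miss.
        """
        block = set_index // TAGS_PER_BLOCK
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            return self.blocks[block], True
        tags = fetch_tag_block(block)        # tag access to the data cache
        self.blocks[block] = tags
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return tags, False
```

With spatial locality, consecutive requests tend to fall in the same tag block, so only the first of them pays for a tag read from the data cache.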