-
Publication No.: US12111767B2
Publication Date: 2024-10-08
Application No.: US18302968
Filing Date: 2023-04-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Susumu Mashimo , John Kalamatianos
IPC: G06F12/0862 , G06F9/30 , G06F12/0877 , G06F18/214
CPC classification number: G06F12/0862 , G06F9/30036 , G06F9/30047 , G06F9/30101 , G06F12/0877 , G06F18/214 , G06F2212/6024
Abstract: A method includes recording a first set of consecutive memory access deltas, where each of the consecutive memory access deltas represents a difference between two memory addresses accessed by an application, updating values in a prefetch training table based on the first set of memory access deltas, and predicting one or more memory addresses for prefetching responsive to a second set of consecutive memory access deltas and based on values in the prefetch training table.
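The record-train-predict loop in this abstract can be illustrated with a minimal sketch. It is not the patented design: the class name `DeltaPrefetcher`, the `history_len` parameter, and the use of simple occurrence counts as the "values" in the training table are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class DeltaPrefetcher:
    """Toy delta-pattern prefetcher (illustrative assumption, not the patented design)."""

    def __init__(self, history_len=2):
        self.history_len = history_len      # hypothetical: deltas per table key
        self.table = defaultdict(Counter)   # delta history -> counts of the next delta
        self.deltas = []                    # most recent consecutive deltas
        self.last_addr = None

    def access(self, addr):
        """Record one access, train the table, and return a predicted address or None."""
        if self.last_addr is not None:
            delta = addr - self.last_addr
            if len(self.deltas) == self.history_len:
                # Train: the previous delta history was followed by this delta.
                self.table[tuple(self.deltas)][delta] += 1
            self.deltas = (self.deltas + [delta])[-self.history_len:]
        self.last_addr = addr

        if len(self.deltas) < self.history_len:
            return None
        counts = self.table.get(tuple(self.deltas))
        if not counts:
            return None
        # Predict: the delta seen most often after the current history.
        next_delta, _ = counts.most_common(1)[0]
        return addr + next_delta


prefetcher = DeltaPrefetcher()
predictions = [prefetcher.access(a) for a in [0, 1, 4, 5, 8, 9]]
```

With the repeating delta pattern +1, +3, the sketch makes no prediction until each two-delta history has been seen once, after which it predicts the next address in the pattern.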
-
Publication No.: US12105957B2
Publication Date: 2024-10-01
Application No.: US18087964
Filing Date: 2022-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Karthik Ramu Sangaiah , Anthony Thomas Gutierrez
CPC classification number: G06F3/061 , G06F3/0656 , G06F3/0659 , G06F3/0673
Abstract: A memory controller includes an arbiter, a vector arithmetic logic unit (VALU), a read buffer and a write buffer both coupled to the VALU, and an atomic memory operation scheduler. The VALU performs scattered atomic memory operations on arrays of data elements responsive to selected memory access commands. The atomic memory operation scheduler is for scheduling atomic memory operations at the VALU; identifying a plurality of scattered atomic memory operations with commutative and associative properties, the plurality of scattered atomic memory operations on at least one element of an array of data elements associated with an address; and commanding the VALU to perform the plurality of scattered atomic memory operations.
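Why commutativity and associativity matter here can be shown with a small software sketch: requests that target the same array element can be combined in any order before they are applied. The function names `coalesce_atomics` and `apply_atomics`, the `(op, index, value)` request format, and the choice of `add`, `max`, and `min` are assumptions for illustration and say nothing about the hardware described in the abstract.

```python
import operator

# Operations treated as commutative and associative in this sketch (assumed set).
COMBINABLE = {"add": operator.add, "max": max, "min": min}


def coalesce_atomics(ops):
    """Merge (op, index, value) requests that target the same element.

    Returns one combined request per (op, index) pair, in first-seen order.
    Assumes a batch applies only one kind of operation to any given element;
    mixing, say, add and max on the same index would not be order-independent.
    """
    merged = {}
    for op, index, value in ops:
        if op not in COMBINABLE:
            raise ValueError(f"operation is not combinable: {op}")
        key = (op, index)
        merged[key] = COMBINABLE[op](merged[key], value) if key in merged else value
    return [(op, index, value) for (op, index), value in merged.items()]


def apply_atomics(array, ops):
    """Apply each request to the array in place and return the array."""
    for op, index, value in ops:
        array[index] = COMBINABLE[op](array[index], value)
    return array


requests = [("add", 2, 1), ("add", 0, 5), ("add", 2, 4)]
combined = coalesce_atomics(requests)
```

Applying `combined` to an array gives the same result as applying `requests` one by one, with fewer updates to element 2.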
-
Publication No.: US12045169B2
Publication Date: 2024-07-23
Application No.: US17133581
Filing Date: 2020-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Furkan Eris , Paul S. Keltcher , John Kalamatianos , Mayank Chhablani , Alok Garg
IPC: G06F12/0862 , G06F16/901 , G06N20/00
CPC classification number: G06F12/0862 , G06F16/9027 , G06N20/00
Abstract: Techniques for identifying a hardware configuration for operation are disclosed. The techniques include applying feature measurements to a trained model; obtaining output values from the trained model, the output values corresponding to different hardware configurations; and operating according to the output values, wherein the output values include one of a certainty score, a ranking, or a regression value.
-
Publication No.: US20240202116A1
Publication Date: 2024-06-20
Application No.: US18068930
Filing Date: 2022-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos , Paul James Moyer , Nicholas Dean Lance , Sriram Srinivasan , Patrick James Shyvers , William Louie Walker
IPC: G06F12/0802
CPC classification number: G06F12/0802 , G06F2212/1016 , G06F2212/1028 , G06F2212/1044
Abstract: An entry of a last level cache shadow tag array is used to track pending last level cache misses to private data in a previous level cache (e.g., an L2 cache) that are also misses to an exclusive last level cache (e.g., an L3 cache) and to the last level cache shadow tag array. Accordingly, last level cache miss status holding registers need not be expended to track cache misses to private data that are already being tracked by a previous level cache miss status holding register. Additionally or alternatively, up to a threshold number of last level cache pending misses to the same shared data from different processor cores are tracked in the last level cache shadow tag array, and any additional last level cache pending misses are tracked in a last level cache miss status holding register.
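The threshold rule for shared data in this abstract can be sketched as a simple bookkeeping model. The class name `LlcMissTracker`, the dictionaries standing in for the shadow tag array and the miss status holding registers, and the per-address list of waiting cores are all assumptions for illustration.

```python
class LlcMissTracker:
    """Toy model of threshold-based miss tracking (illustrative assumption)."""

    def __init__(self, threshold):
        self.threshold = threshold  # max pending misses per address in the shadow tags
        self.shadow_tags = {}       # address -> cores tracked in the shadow tag entry
        self.mshr = {}              # address -> cores tracked in an MSHR

    def track_shared_miss(self, addr, core):
        """Record a pending miss and report which structure tracks it."""
        waiters = self.shadow_tags.setdefault(addr, [])
        if len(waiters) < self.threshold:
            waiters.append(core)
            return "shadow_tag"
        # Threshold reached: additional pending misses fall back to an MSHR.
        self.mshr.setdefault(addr, []).append(core)
        return "mshr"


tracker = LlcMissTracker(threshold=2)
placements = [tracker.track_shared_miss(0x40, core) for core in range(3)]
```

In this sketch the first two cores missing on the same address are tracked in the shadow tag entry, and the third is tracked in an MSHR.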
-
Publication No.: US20240193097A1
Publication Date: 2024-06-13
Application No.: US18064155
Filing Date: 2022-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos
IPC: G06F12/1045 , G06F12/0897
CPC classification number: G06F12/1045 , G06F12/0897
Abstract: Address translation is performed to translate a virtual address targeted by a memory request (e.g., a load or memory request for data or an instruction) to a physical address. This translation is performed using an address translation buffer, e.g., a translation lookaside buffer (TLB). One or more actions are taken to reduce data access latencies for memory requests in the event of a TLB miss where the virtual address to physical address translation is not in the TLB. Examples of actions that are performed in various implementations in response to a TLB miss include bypassing level 1 (L1) and level 2 (L2) caches in the memory system, and speculatively sending the memory request to the L2 cache while checking whether the memory request is satisfied by the L1 cache.
-
Publication No.: US20240111677A1
Publication Date: 2024-04-04
Application No.: US17957795
Filing Date: 2022-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Gabriel H. Loh , Marko Scrbak , Akhil Arunkumar , John Kalamatianos
IPC: G06F12/0862 , G06F12/0877
CPC classification number: G06F12/0862 , G06F12/0877 , G06F12/0811
Abstract: A method for performing prefetching operations is disclosed. The method includes storing a recorded access pattern indicating a set of accesses for a region; in response to an access within the region, fetching the recorded access pattern; and performing prefetching based on the access pattern.
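The three steps in this abstract (store a per-region pattern, fetch it on an access to the region, prefetch from it) can be sketched as follows. The region size, the cache line size, the class name `RegionPatternPrefetcher`, and the representation of a pattern as a set of line offsets are assumptions for illustration, not details taken from the publication.

```python
REGION_SIZE = 4096  # assumed region size in bytes
LINE_SIZE = 64      # assumed cache line size in bytes


class RegionPatternPrefetcher:
    """Toy region-pattern prefetcher (illustrative assumption)."""

    def __init__(self):
        self.patterns = {}  # region number -> set of accessed line offsets

    def record(self, addr):
        """Store the accessed line in the recorded pattern for its region."""
        region, offset = divmod(addr, REGION_SIZE)
        self.patterns.setdefault(region, set()).add(offset // LINE_SIZE)

    def prefetch_for(self, addr):
        """On an access within a region, return the other recorded line addresses."""
        region, offset = divmod(addr, REGION_SIZE)
        trigger_line = offset // LINE_SIZE
        base = region * REGION_SIZE
        lines = self.patterns.get(region, set())
        return sorted(base + line * LINE_SIZE for line in lines if line != trigger_line)


rp = RegionPatternPrefetcher()
for a in (4096, 4096 + 130, 4096 + 640):
    rp.record(a)
to_prefetch = rp.prefetch_for(4096 + 10)
```

A later access to line 0 of the region returns the other two recorded lines as prefetch candidates, while an access to a region with no recorded pattern returns an empty list.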
-
Publication No.: US11847062B2
Publication Date: 2023-12-19
Application No.: US17552703
Filing Date: 2021-12-16
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Tarun Nakra , Jay Fleischman , Gautam Tarasingh Hazari , Akhil Arunkumar , William L. Walker , Gabriel H. Loh , John Kalamatianos , Marko Scrbak
IPC: G06F12/0897 , G06F12/0891
CPC classification number: G06F12/0897 , G06F12/0891 , G06F2212/1028
Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
-
Publication No.: US11726868B2
Publication Date: 2023-08-15
Application No.: US17113815
Filing Date: 2020-12-07
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Michael Mantor , Sudhanva Gurumurthi
IPC: G06F11/10 , G06F11/16 , G06F12/0866 , G06F11/00 , H03M13/00
CPC classification number: G06F11/1064 , G06F11/1629 , G06F11/1641 , G06F11/1654 , G06F12/0866 , G06F2212/1032 , G06F2212/281 , G06F2212/403
Abstract: A system and method for protecting memory instructions against faults are described. The system and method include converting the slave instructions to dummy operations, modifying a memory arbiter to issue up to N master and N slave global/shared memory instructions per cycle, sending master memory requests to the memory system, using slave requests for error checking, entering master requests to the GM/LM FIFO, storing slave requests in a register, and comparing the entered master requests with the stored slave requests.
-
Publication No.: US20230195643A1
Publication Date: 2023-06-22
Application No.: US17552703
Filing Date: 2021-12-16
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Tarun Nakra , Jay Fleischman , Gautam Tarasingh Hazari , Akhil Arunkumar , William L. Walker , Gabriel H. Loh , John Kalamatianos , Marko Scrbak
IPC: G06F12/0897 , G06F12/0891
CPC classification number: G06F12/0897 , G06F12/0891 , G06F2212/1028
Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
-
Publication No.: US11586441B2
Publication Date: 2023-02-21
Application No.: US17125730
Filing Date: 2020-12-17
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Jagadish B. Kotra
IPC: G06F9/38 , G06F12/0897 , G06F12/0875 , G06F9/30
Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions. When the control logic determines that micro-operations for one or more fetched instructions are stored in either the micro-operation cache or the conventional cache subsystem, the control logic causes the decode unit to transition to a reduced-power state.
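The spill-and-refill behavior in this abstract can be sketched as a two-level lookup. The class name `VirtualizedUopCache`, the LRU replacement policy, the dictionary standing in for the conventional cache subsystem, and the placeholder `_decode` output are assumptions for illustration. A lookup whose source is anything other than `"decode"` corresponds to a case where the decode unit would not be needed.

```python
from collections import OrderedDict


class VirtualizedUopCache:
    """Toy micro-op cache that spills evictions to a backing store (illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.uop_cache = OrderedDict()  # pc -> micro-ops, kept in LRU order (assumed policy)
        self.backing = {}               # stands in for the conventional cache subsystem
        self.decodes = 0                # number of times the decoder was used

    def _decode(self, pc):
        """Placeholder decoder: returns made-up micro-op names."""
        self.decodes += 1
        return [f"uop{pc}.0", f"uop{pc}.1"]

    def _install(self, pc, uops):
        if len(self.uop_cache) >= self.capacity:
            # Evicted micro-ops are kept in the backing store, not discarded.
            old_pc, old_uops = self.uop_cache.popitem(last=False)
            self.backing[old_pc] = old_uops
        self.uop_cache[pc] = uops

    def fetch(self, pc):
        """Return (micro-ops, source) where source names where they came from."""
        if pc in self.uop_cache:
            self.uop_cache.move_to_end(pc)
            return self.uop_cache[pc], "uop_cache"
        if pc in self.backing:
            uops, source = self.backing.pop(pc), "backing"
        else:
            uops, source = self._decode(pc), "decode"
        self._install(pc, uops)
        return uops, source


cache = VirtualizedUopCache(capacity=2)
sources = [cache.fetch(pc)[1] for pc in (1, 2, 3, 1, 2)]
```

In this run the two refetches of evicted instructions are served from the backing store, so the decoder is used three times instead of five.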