-
公开(公告)号:US20250006232A1
公开(公告)日:2025-01-02
申请号:US18346110
申请日:2023-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Ioannis Papadopoulos , Vignesh Adhinarayanan , Ashwin Aji , Jagadish B. Kotra
Abstract: An apparatus and method for creating less computationally intensive nodes for a neural network. An integrated circuit includes a host processor and multiple memory channels, each with multiple memory array banks. Each of the memory array banks includes components of a processing-in-memory (PIM) accelerator and a scatter and gather circuit used to dynamically perform quantization operations and dequantization operations that offload these operations from the host processor. The host processor executes a data model that represents a neural network. The memory array banks store a single copy of a particular data value in a single precision. Therefore, the memory array banks avoid storing replications of the same data value with different precisions to be used by a neural network node. The memory array banks dynamically perform quantization operations and dequantization operations on one or more of the weight values, input data values, and activation output values of the neural network.
-
2.
公开(公告)号:US20240202116A1
公开(公告)日:2024-06-20
申请号:US18068930
申请日:2022-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos , Paul James Moyer , Nicholas Dean Lance , Sriram Srinivasan , Patrick James Shyvers , William Louie Walker
IPC: G06F12/0802
CPC classification number: G06F12/0802 , G06F2212/1016 , G06F2212/1028 , G06F2212/1044
Abstract: An entry of a last level cache shadow tag array to track pending last level cache misses to private data in a previous level cache (e.g., an L2 cache), that also are misses to an exclusive last level cache (e.g., an L3 cache) and to the last level cache shadow tag array. Accordingly, last level cache miss status holding registers need not be expended to track cache misses to private data that are already being tracked by a previous level cache miss status holding register. Additionally or alternatively, up to a threshold number of last level cache pending misses to the same shared data from different processor cores are tracked in the last level cache shadow tag array, and any additional last level cache pending misses are tracked in a last level cache miss status holding register.
-
公开(公告)号:US20240193097A1
公开(公告)日:2024-06-13
申请号:US18064155
申请日:2022-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos
IPC: G06F12/1045 , G06F12/0897
CPC classification number: G06F12/1045 , G06F12/0897
Abstract: Address translation is performed to translate a virtual address targeted by a memory request (e.g., a load or memory request for data or an instruction) to a physical address. This translation is performed using an address translation buffer, e.g., a translation lookaside buffer (TLB). One or more actions are taken to reduce data access latencies for memory requests in the event of a TLB miss where the virtual address to physical address translation is not in the TLB. Examples of actions that are performed in various implementations in response to a TLB miss include bypassing level 1 (L1) and level 2 (L2) caches in the memory system, and speculatively sending the memory request to the L2 cache while checking whether the memory request is satisfied by the L1 cache.
-
公开(公告)号:US11586441B2
公开(公告)日:2023-02-21
申请号:US17125730
申请日:2020-12-17
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Jagadish B. Kotra
IPC: G06F9/38 , G06F12/0897 , G06F12/0875 , G06F9/30
Abstract: Systems, apparatuses, and methods for virtualizing a micro-operation cache are disclosed. A processor includes at least a micro-operation cache, a conventional cache subsystem, a decode unit, and control logic. The decode unit decodes instructions into micro-operations which are then stored in the micro-operation cache. The micro-operation cache has limited capacity for storing micro-operations. When new micro-operations are decoded from pending instructions, existing micro-operations are evicted from the micro-operation cache to make room for the new micro-operations. Rather than being discarded, micro-operations evicted from the micro-operation cache are stored in the conventional cache subsystem. This prevents the original instruction from having to be decoded again on subsequent executions. When the control logic determines that micro-operations for one or more fetched instructions are stored in either the micro-operation cache or the conventional cache subsystem, the control logic causes the decode unit to transition to a reduced-power state.
-
公开(公告)号:US20220188208A1
公开(公告)日:2022-06-16
申请号:US17118404
申请日:2020-12-10
Applicant: Advanced Micro Devices, Inc.
Inventor: Anthony Gutierrez , Yasuko Eckert , Sergey Blagodurov , Jagadish B. Kotra
IPC: G06F11/30 , G06F1/20 , G06F12/0815 , G11C11/406 , G06F9/48 , G06F9/30
Abstract: A method may include, in response to a change in an operating parameter of a processing unit, modifying a signal pathway to a processing circuit component of the processing unit, and communicating with the processing circuit component via the signal pathway.
-
公开(公告)号:US12189953B2
公开(公告)日:2025-01-07
申请号:US17956417
申请日:2022-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos
Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. It is predicted, based on a history of cache misses at a private cache, that a cache lookup for the information will miss a shared victim cache. A speculative memory request is enabled based on the prediction that the cache lookup for the information will miss the shared victim cache. The information is fetched based on the enabled speculative memory request.
-
公开(公告)号:US12099723B2
公开(公告)日:2024-09-24
申请号:US17956614
申请日:2022-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , Marko Scrbak
IPC: G06F3/06
CPC classification number: G06F3/0613 , G06F3/0659 , G06F3/0679
Abstract: A method for operating a memory having a plurality of banks accessible in parallel, each bank including a plurality of grains accessible in parallel is provided. The method includes: based on a memory access request that specifies a memory address, identifying a set that stores data for the memory access request, wherein the set is spread across multiple grains of the plurality of grains; and performing operations to satisfy the memory access request, using entries of the set stored across the multiple grains of the plurality of grains.
-
公开(公告)号:US12019566B2
公开(公告)日:2024-06-25
申请号:US16938364
申请日:2020-07-24
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Sergey Blagodurov , Johnathan Alsop , Jagadish B. Kotra , Marko Scrbak , Ganesh Dasika
IPC: G06F13/16 , G06F9/30 , H04L45/122
CPC classification number: G06F13/1642 , G06F9/3004 , G06F9/30098 , G06F13/1663 , H04L45/122
Abstract: Arbitrating atomic memory operations, including: receiving, by a media controller, a plurality of atomic memory operations; determining, by an atomics controller associated with the media controller, based on one or more arbitration rules, an ordering for issuing the plurality of atomic memory operations; and issuing the plurality of atomic memory operations to a memory module according to the ordering.
-
公开(公告)号:US20240193292A1
公开(公告)日:2024-06-13
申请号:US18212858
申请日:2023-06-22
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Jagadish B. Kotra , David Kaplan , Kishore Punniyamurthy , Alexander Toufic Freij
IPC: G06F21/62
CPC classification number: G06F21/6218 , G06F2221/2113 , G06F2221/2141
Abstract: A processing system receives graph object data and graph object metadata. The processing system stores the graph object metadata inline with the graph object data. The graph object metadata indicates access permissions for corresponding graph objects. Because the graph object metadata is stored inline with the graph object data, the graph object metadata is more easily retrieved and fewer system resources are consumed to determine access permissions of a requester as compared to a system where graph object metadata is stored separately from the graph object data.
-
公开(公告)号:US20240111420A1
公开(公告)日:2024-04-04
申请号:US17956417
申请日:2022-09-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra , John Kalamatianos
IPC: G06F3/06
CPC classification number: G06F3/0611 , G06F3/0653 , G06F3/0673
Abstract: Methods, devices, and systems for retrieving information based on cache miss prediction. It is predicted, based on a history of cache misses at a private cache, that a cache lookup for the information will miss a shared victim cache. A speculative memory request is enabled based on the prediction that the cache lookup for the information will miss the shared victim cache. The information is fetched based on the enabled speculative memory request.
-
-
-
-
-
-
-
-
-