-
公开(公告)号:US20240103763A1
公开(公告)日:2024-03-28
申请号:US17953723
申请日:2022-09-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam , Shaizeen Dilawarhusen Aga , Johnathan Robert Alsop , MOHAMED ASSEM ABD ELMOHSEN IBRAHIM , Nuwan S Jayasena
IPC: G06F3/06
CPC classification number: G06F3/0659 , G06F3/0611 , G06F3/0673
Abstract: In accordance with the described techniques for bank-level parallelism for processing in memory, a plurality of commands are received for execution by a processing in memory component embedded in a memory. The memory includes a first bank and a second bank. The plurality of commands include a first stream of commands which cause the processing in memory component to perform operations that access the first bank and a second stream of commands which cause the processing in memory component to perform operations that access the second bank. A next row of the first bank that is to be accessed by the processing in memory component is identified. Further, a precharge command is scheduled to close a first row of the first bank and an activate command is scheduled to open the next row of the first bank in parallel with execution of the second stream of commands.
-
公开(公告)号:US12236134B2
公开(公告)日:2025-02-25
申请号:US17953723
申请日:2022-09-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam , Shaizeen Dilawarhusen Aga , Johnathan Robert Alsop , Mohamed Assem Abd ElMohsen Ibrahim , Nuwan S Jayasena
IPC: G06F3/06
Abstract: In accordance with the described techniques for bank-level parallelism for processing in memory, a plurality of commands are received for execution by a processing in memory component embedded in a memory. The memory includes a first bank and a second bank. The plurality of commands include a first stream of commands which cause the processing in memory component to perform operations that access the first bank and a second stream of commands which cause the processing in memory component to perform operations that access the second bank. A next row of the first bank that is to be accessed by the processing in memory component is identified. Further, a precharge command is scheduled to close a first row of the first bank and an activate command is scheduled to open the next row of the first bank in parallel with execution of the second stream of commands.
-
3.
公开(公告)号:US11487447B2
公开(公告)日:2022-11-01
申请号:US17006646
申请日:2020-08-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam , Shaizeen Aga , Nuwan Jayasena , Jagadish B. Kotra
Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.
-
4.
公开(公告)号:US11797201B2
公开(公告)日:2023-10-24
申请号:US17745278
申请日:2022-05-16
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam , Shaizeen Aga , Nuwan Jayasena , Jagadish B. Kotra
CPC classification number: G06F3/0631 , G06F3/0604 , G06F3/0673 , G06F12/0207 , G06F12/0223 , G06F12/0607
Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.
-
5.
公开(公告)号:US20220066662A1
公开(公告)日:2022-03-03
申请号:US17006646
申请日:2020-08-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Mahzabeen Islam , Shaizeen Aga , Nuwan Jayasena , Jagadish B. Kotra
IPC: G06F3/06
Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.
-
公开(公告)号:US20240411462A1
公开(公告)日:2024-12-12
申请号:US18207314
申请日:2023-06-08
Applicant: Advanced Micro Devices, Inc.
IPC: G06F3/06
Abstract: Local and dynamic triggering of operations executed by a processing-in-memory component is described. In accordance with the described techniques, a processing-in-memory component receives a command from a host for execution by the processing-in-memory component. The processing-in-memory component references a tracking table that includes at least one entry associated with an operation performed as part of executing the command and identifies at least one additional command to be triggered locally after executing the command received from the host. Responsive to identifying that conditions associated with the at least one additional command are satisfied, the processing-in-memory component executes the at least one additional command, independent of instructions from the host.
-
公开(公告)号:US11977782B2
公开(公告)日:2024-05-07
申请号:US17855442
申请日:2022-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Abd ElMohsen Ibrahim , Meysam Taassori , Mahzabeen Islam , Shaizeen Aga
IPC: G06F3/06
CPC classification number: G06F3/0659 , G06F3/0613 , G06F3/0673
Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.
-
8.
公开(公告)号:US20240004585A1
公开(公告)日:2024-01-04
申请号:US17855442
申请日:2022-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Abd ElMohsen Ibrahim , Meysam Taassori , Mahzabeen Islam , Shaizeen Aga
IPC: G06F3/06
CPC classification number: G06F3/0659 , G06F3/0613 , G06F3/0673
Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.
-
公开(公告)号:US20240103745A1
公开(公告)日:2024-03-28
申请号:US17954784
申请日:2022-09-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Niti Madan , Johnathan Robert Alsop , Alexandru Dutu , Mahzabeen Islam , Yasuko Eckert , Nuwan S Jayasena
IPC: G06F3/06
CPC classification number: G06F3/064 , G06F3/0659 , G06F3/0604 , G06F3/0679
Abstract: A memory controller coupled to a memory module receives both processing-in-memory (PIM) requests and memory requests from a host (e.g., a host processor). The memory controller issues PIM requests to one group of memory banks and concurrently issues memory requests to one or more other groups of memory banks. Accordingly, memory requests are performed on groups of memory banks that would otherwise be idle while PIM requests are performed on the one group of memory banks. Optionally, the memory controller coupled to the memory module also takes various actions when switching between operating in a PIM mode and a non-processing-in-memory mode to reduce or hide overhead when switching between the two modes.
-
公开(公告)号:US11726783B2
公开(公告)日:2023-08-15
申请号:US16856832
申请日:2020-04-23
Applicant: Advanced Micro Devices, Inc.
Inventor: Marko Scrbak , Mahzabeen Islam , John Kalamatianos , Jagadish B. Kotra
IPC: G06F9/38 , G06F9/26 , G06F16/901 , G06F12/0893
CPC classification number: G06F9/264 , G06F9/262 , G06F9/3808 , G06F9/3887 , G06F12/0893 , G06F16/9017
Abstract: A processor includes a micro-operation cache having a plurality of micro-operation cache entries for storing micro-operations decoded from instruction groups and a micro-operation filter having a plurality of micro-operation filter table entries for storing identifiers of instruction groups for which the micro-operations are predicted dead on fill if stored in the micro-operation cache. The micro-operation filter receives an identifier for an instruction group. The micro-operation filter then prevents a copy of the micro-operations from the first instruction group from being stored in the micro-operation cache when a micro-operation filter table entry includes an identifier that matches the first identifier.
-
-
-
-
-
-
-
-
-