Patent search: ap:("ADVANCED MICRO DEVICES, INC.") AND inv:"Shaizeen Aga"

1.

Granted invention patent
Device and method for accelerating matrix multiply operations (in force)

Publication (announcement) number: US12124531B2

Publication (announcement) date: 2024-10-22

Application number: US18297230

Application date: 2023-04-07

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski

IPC: G06F17/16, G06F7/53, G06F15/80

CPC classification number: G06F17/16, G06F7/5324, G06F15/8007

Abstract: A processing device including a plurality of clusters of processor cores and a method for use in the processing device is disclosed. Each processor core in a cluster of processor cores is in communication with the other processor cores in the cluster and at least one processor core of each cluster is in communication with at least a processor core of a different cluster of processor cores. Each processor core is configured to store a product of a portion of a first matrix and a first portion of a second matrix in the memory, and store a product of the portion of the first matrix and a second portion of the second matrix in the memory, where the second portion of the second matrix is received from a processor core in the cluster of processor cores.
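The data movement described in this abstract can be pictured with a minimal sketch. It simulates a single cluster only and assumes a ring topology, a row-wise split of the first matrix, and a column-wise split of the second; none of these choices is stated in the abstract, and the names `split_ranges` and `ring_matmul` are hypothetical.

```python
def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) ranges of near-equal size."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def ring_matmul(a, b, num_cores):
    """Simulate num_cores cores computing a @ b (lists of lists).

    Assumption: each core keeps one row block of `a` for the whole run and
    starts with one column block of `b`; after every step it hands its
    column block to the next core in a ring.
    """
    n_cols = len(b[0])
    b_cols = list(zip(*b))                      # columns of the second matrix
    row_ranges = split_ranges(len(a), num_cores)
    col_ranges = split_ranges(n_cols, num_cores)
    local_a = [a[r0:r1] for r0, r1 in row_ranges]                 # per-core rows of a
    local_c = [[[0] * n_cols for _ in rows] for rows in local_a]  # per-core output rows
    held = list(range(num_cores))               # held[core] = column block it holds now
    for _ in range(num_cores):
        for core in range(num_cores):
            c0, c1 = col_ranges[held[core]]
            for i, row in enumerate(local_a[core]):
                for j in range(c0, c1):
                    # store the product of the local rows and the held portion of b
                    local_c[core][i][j] = sum(x * y for x, y in zip(row, b_cols[j]))
        # every core receives the block previously held by its ring neighbour
        held = [held[(core - 1) % num_cores] for core in range(num_cores)]
    return [row for block in local_c for row in block]


a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 0, 2], [0, 1, 3]]
print(ring_matmul(a, b, 2))  # [[1, 2, 8], [3, 4, 18], [5, 6, 28]]
```

After `num_cores` steps every core has seen every portion of the second matrix, so no core ever needs the whole matrix in its local memory at once.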

2.

Granted invention patent
Dynamically coalescing atomic memory operations for memory-local computing (in force)

Publication (announcement) number: US11726918B2

Publication (announcement) date: 2023-08-15

Application number: US17361145

Application date: 2021-06-28

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Johnathan Alsop, Alexandru Dutu, Shaizeen Aga, Nuwan Jayasena

IPC: G06F12/0871, G06F12/02, G06F12/084, G06F12/0846

CPC classification number: G06F12/0871, G06F12/0238, G06F12/084, G06F12/0846

Abstract: Dynamically coalescing atomic memory operations for memory-local computing is disclosed. In an embodiment, it is determined whether a first atomic memory access and a second atomic memory access are candidates for coalescing. In response to a triggering event, the atomic memory accesses that are candidates for coalescing are coalesced in a cache prior to requesting memory-local processing by a memory-local compute unit. The atomic memory accesses may be coalesced in the same cache line or atomic memory accesses in different cache lines may be coalesced using a multicast memory-local processing command.
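As a rough illustration of same-cache-line coalescing, the sketch below buffers atomic adds and, when flushed, emits one command per cache line. It assumes a 64-byte line, atomics that return no value, and that adds to the same address may simply be summed; the abstract specifies none of these, and `coalesce_atomic_adds` is a hypothetical name.

```python
from collections import defaultdict

LINE_BYTES = 64  # assumed cache-line size, not taken from the abstract


def coalesce_atomic_adds(ops):
    """Group pending atomic adds by cache line.

    ops: iterable of (address, value) pairs.
    Returns a list of (line_base, {offset: summed_value}) commands, one per
    cache line, standing in for what would be sent to a memory-local
    compute unit on a triggering event.
    """
    lines = defaultdict(lambda: defaultdict(int))
    for addr, value in ops:
        base = addr - addr % LINE_BYTES
        lines[base][addr - base] += value
    return [(base, dict(offsets)) for base, offsets in sorted(lines.items())]


pending = [(0, 1), (8, 2), (0, 3), (64, 5)]
print(coalesce_atomic_adds(pending))  # [(0, {0: 4, 8: 2}), (64, {0: 5})]
```

Four atomic accesses become two memory-local commands here. The multicast case for different cache lines mentioned in the abstract is not modelled.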

3.

Granted invention patent
Device and method for accelerating matrix multiply operations (in force)

Publication (announcement) number: US11640444B2

Publication (announcement) date: 2023-05-02

Application number: US17208526

Application date: 2021-03-22

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Nuwan Jayasena, Allen H. Rush, Michael Ignatowski

IPC: G06F17/16, G06F7/53, G06F15/80

Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

4.

Invention patent application
HARDWARE-SOFTWARE COLLABORATIVE ADDRESS MAPPING SCHEME FOR EFFICIENT PROCESSING-IN-MEMORY SYSTEMS (in force)

Publication (announcement) number: US20220276795A1

Publication (announcement) date: 2022-09-01

Application number: US17745278

Application date: 2022-05-16

Applicant: Advanced Micro Devices, Inc.

Inventor: Mahzabeen Islam, Shaizeen Aga, Nuwan Jayasena, Jagadish B. Kotra

IPC: G06F3/06, G06F12/02

Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.
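The intuition behind striping within a bank can be sketched with a toy layout. The sketch assumes several equal-length arrays that are accessed element by element together, all placed in one bank, with a row size that divides evenly among the arrays. The exact IBFS and ICFS mappings are not given in the abstract, so `contiguous_row` and `striped_location` are hypothetical stand-ins rather than the mapping the application defines.

```python
def contiguous_row(array_id, index, array_len, row_elems):
    """Baseline layout: arrays stored back to back; return the row holding the element."""
    return (array_id * array_len + index) // row_elems


def striped_location(array_id, index, num_arrays, row_elems):
    """Striped layout: every row holds an equal-sized chunk of each array.

    Returns (row, column). Assumes row_elems is divisible by num_arrays.
    """
    chunk = row_elems // num_arrays
    row = index // chunk
    column = array_id * chunk + index % chunk
    return row, column


num_arrays, array_len, row_elems, i = 3, 1024, 96, 40
baseline_rows = {contiguous_row(a, i, array_len, row_elems) for a in range(num_arrays)}
striped_rows = {striped_location(a, i, num_arrays, row_elems)[0] for a in range(num_arrays)}
print(len(baseline_rows), len(striped_rows))  # 3 1
```

In the baseline, fetching element 40 of all three arrays opens three different rows of the bank; in the striped layout the three elements share one row, which is the kind of row-conflict reduction the abstract describes.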

5.

Granted invention patent
Detecting execution hazards in offloaded operations (in force)

Publication (announcement) number: US11188406B1

Publication (announcement) date: 2021-11-30

Application number: US17218506

Application date: 2021-03-31

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Johnathan Alsop, Shaizeen Aga

IPC: G06F11/07, G06F11/24, G06F11/14

Abstract: Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.

6.

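Granted invention patent
Near-memory data reduction (in force)

Publication (announcement) number: US11099788B2

Publication (announcement) date: 2021-08-24

Application number: US16658733

Application date: 2019-10-21

Applicant: Advanced Micro Devices, Inc.

Inventor: Nuwan Jayasena, Shaizeen Aga

IPC: G06F3/06, H03M7/30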

Abstract: An approach is provided for implementing near-memory data reduction during store operations to off-chip or off-die memory. A Near-Memory Reduction (NMR) unit provides near-memory data reduction during write operations to a specified address range. The NMR unit is configured with a range of addresses to be reduced and when a store operation specifies an address within the range of addresses, the NMR unit performs data reduction by adding the data value specified by the store operation to an accumulated reduction result. According to an embodiment, the NMR unit maintains a count of the number of updates to the accumulated reduction result that are used to determine when data reduction has been completed.
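The behaviour in this abstract maps onto a small software model. The sketch below is only an illustration: the class name, the half-open address range, and the `expected_updates` threshold are assumptions, since the abstract says the update count is used to detect completion but not how.

```python
class NearMemoryReducer:
    """Toy model of a reduction unit sitting in front of memory."""

    def __init__(self, start, end, expected_updates):
        self.start, self.end = start, end          # reduced range [start, end)
        self.expected_updates = expected_updates   # assumed completion threshold
        self.accumulated = 0                       # accumulated reduction result
        self.count = 0                             # number of updates so far
        self.memory = {}                           # ordinary stores land here

    def store(self, address, value):
        """Reduce stores inside the configured range; pass others through."""
        if self.start <= address < self.end:
            self.accumulated += value
            self.count += 1
        else:
            self.memory[address] = value

    @property
    def done(self):
        """True once the assumed number of updates has been observed."""
        return self.count >= self.expected_updates


nmr = NearMemoryReducer(start=0x1000, end=0x1100, expected_updates=3)
for addr, val in [(0x1000, 5), (0x2000, 9), (0x1008, 7), (0x10F8, 1)]:
    nmr.store(addr, val)
print(nmr.accumulated, nmr.done, nmr.memory)  # 13 True {8192: 9}
```

Only the single accumulated value needs to be kept for the reduced range, rather than every stored operand.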

7.

Granted invention patent
Approach for enabling concurrent execution of host memory commands and near-memory processing commands (in force)

Publication (announcement) number: US11977782B2

Publication (announcement) date: 2024-05-07

Application number: US17855442

Application date: 2022-06-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Mohamed Assem Abd ElMohsen Ibrahim, Meysam Taassori, Mahzabeen Islam, Shaizeen Aga

IPC: G06F3/06

CPC classification number: G06F3/0659, G06F3/0613, G06F3/0673

Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.
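The reordering step mentioned at the end of this abstract can be sketched as a simple list scheduler. The model below is an assumption for illustration only: commands are described by the registers they read and write plus a flag for whether they reference memory, and dependencies are limited to register read-after-write, write-after-read, and write-after-write hazards. `PimCommand` and `hoist_register_only` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PimCommand:
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()
    touches_memory: bool = False   # False means "register-only"


def depends(later, earlier):
    """True if `later` must stay after `earlier` (RAW, WAR, or WAW on a register)."""
    return bool(
        earlier.writes & later.reads
        or earlier.reads & later.writes
        or earlier.writes & later.writes
    )


def hoist_register_only(commands):
    """Move register-only commands as early as the dependencies allow."""
    remaining = list(range(len(commands)))
    order = []
    while remaining:
        def ready(i):
            return not any(depends(commands[i], commands[j]) for j in remaining if j < i)
        # Prefer the first ready register-only command; otherwise take the
        # oldest unscheduled command, which is always ready.
        pick = next(
            (i for i in remaining if not commands[i].touches_memory and ready(i)),
            remaining[0],
        )
        remaining.remove(pick)
        order.append(commands[pick])
    return order


program = [
    PimCommand("load", writes=frozenset({"r0"}), touches_memory=True),
    PimCommand("add", reads=frozenset({"r1", "r2"}), writes=frozenset({"r1"})),
    PimCommand("mul", reads=frozenset({"r0", "r1"}), writes=frozenset({"r3"})),
    PimCommand("store", reads=frozenset({"r3"}), touches_memory=True),
]
print([c.name for c in hoist_register_only(program)])  # ['add', 'load', 'mul', 'store']
```

Here `add` moves ahead of `load` because it touches neither `r0` nor memory, while `mul` stays behind `load` because it reads `r0`. Shadow row buffers and the interleaving with host memory commands are not modelled.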

8.

Published invention application
APPROACH FOR ENABLING CONCURRENT EXECUTION OF HOST MEMORY COMMANDS AND NEAR-MEMORY PROCESSING COMMANDS (under examination, published)

Publication (announcement) number: US20240004585A1

Publication (announcement) date: 2024-01-04

Application number: US17855442

Application date: 2022-06-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Mohamed Assem Abd ElMohsen Ibrahim, Meysam Taassori, Mahzabeen Islam, Shaizeen Aga

IPC: G06F3/06

CPC classification number: G06F3/0659, G06F3/0613, G06F3/0673

Abstract: An approach allows concurrent execution of near-memory processing commands, referred to herein as “PIM commands,” and host memory commands. A memory controller determines and issues a plurality of register-only PIM commands that do not reference memory with host memory commands to allow concurrent execution of the register-only PIM commands and the host memory commands. The approach allows concurrent execution of register-only PIM commands and host memory commands without interference, even when the register-only PIM commands and the host memory commands are interleaved, and even for the same memory module, which improves resource utilization and performance. Further improvement of resource utilization and performance is achieved by extending a register-only phase by reordering register-only PIM commands before non-register-only PIM commands, subject to dependency constraints, and using shadow row buffers to provide local working copies of data from memory to near-memory compute elements.

9.

Published invention application
APPROACH FOR PROCESSING NEAR-MEMORY PROCESSING COMMANDS USING NEAR-MEMORY REGISTER DEFINITION DATA (under examination, published)

Publication (announcement) number: US20230409238A1

Publication (announcement) date: 2023-12-21

Application number: US17845263

Application date: 2022-06-21

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Nuwan Jayasena

IPC: G06F3/06

CPC classification number: G06F3/0659, G06F3/0673, G06F3/0604

Abstract: An approach is provided for processing near-memory processing commands, e.g., PIM commands, using PIM register definition data that defines multiple combinations of source and/or destination registers to be used to process PIM commands. A particular combination of source and/or destination registers to be used to process a PIM command is specified by the PIM command or determined by a near-memory processing element processing the PIM command. According to another implementation, the PIM register definition data specifies an initial combination of source and/or destination registers and one or more update functions for each PIM command. A near-memory processing element processes a PIM command using the initial combination of source and/or destination registers and uses the one or more update functions to update the combination of source and/or destination registers to be used the next time the PIM command is processed.
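The second implementation described in this abstract (an initial register combination plus an update function) can be modelled in a few lines. The sketch assumes one source and one destination register per command and a rotate-by-one update over four registers; both are illustrative choices, and `PimRegisterDefinition` and `rotate` are hypothetical names.

```python
class PimRegisterDefinition:
    """Register definition data for one PIM command: initial combination plus update."""

    def __init__(self, initial, update):
        self.current = initial   # (source_register, destination_register)
        self.update = update     # function producing the next combination

    def next_registers(self):
        """Return the combination for this execution, then advance it."""
        regs = self.current
        self.current = self.update(regs)
        return regs


def rotate(regs, num_registers=4):
    """Assumed update function: step both registers by one, wrapping around."""
    src, dst = regs
    return ((src + 1) % num_registers, (dst + 1) % num_registers)


definition = PimRegisterDefinition(initial=(0, 1), update=rotate)
print([definition.next_registers() for _ in range(5)])
# [(0, 1), (1, 2), (2, 3), (3, 0), (0, 1)]
```

Because the near-memory element derives each combination itself, the command stream does not have to spell out the registers on every issue of the same command.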

10.

Granted invention patent
Memory operations using compound memory commands (in force)

Publication (announcement) number: US11669271B2

Publication (announcement) date: 2023-06-06

Application number: US16848920

Application date: 2020-04-15

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Anirban Nag, Nuwan Jayasena, Shaizeen Aga

IPC: G06F3/06, G06F9/448, G06F9/48

CPC classification number: G06F3/0659, G06F3/0611, G06F3/0614, G06F3/0673, G06F9/4498, G06F9/4881

Abstract: Memory operations using compound memory commands, including: receiving, by a memory module, a compound memory command indicating one or more operations to be applied to each portion of a plurality of portions of contiguous memory in the memory module; generating, based on the compound memory command, a plurality of memory commands to apply the one or more operations to each portion of the plurality of portions of contiguous memory; and executing the plurality of memory commands.
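The expansion step can be sketched as follows. The command fields (base address, portion size, portion count, operation names) and the function name `expand_compound` are assumptions made for illustration; the abstract does not define an encoding.

```python
def expand_compound(base, portion_bytes, num_portions, operations):
    """Expand one compound command into per-portion memory commands.

    Returns a list of (operation, address, length) tuples: every operation
    is applied to every portion of the contiguous region starting at `base`.
    """
    return [
        (op, base + i * portion_bytes, portion_bytes)
        for i in range(num_portions)
        for op in operations
    ]


for command in expand_compound(0x1000, 64, 2, ["read", "add"]):
    print(command)
# ('read', 4096, 64)
# ('add', 4096, 64)
# ('read', 4160, 64)
# ('add', 4160, 64)
```

A single compound command from the host thus stands in for `num_portions * len(operations)` individual commands generated inside the memory module.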
