Patent search ap:("ADVANCED MICRO DEVICES Page INC.") AND inv:"Nuwan Jayasena"

1.

发明授权
Approach for processing near-memory processing commands using near-memory register definition data 有权

公开(公告)号：US12265735B2

公开(公告)日：2025-04-01

申请号：US17845263

申请日：2022-06-21

Applicant: Advanced Micro Devices, Inc.

Inventor： Shaizeen Aga , Nuwan Jayasena

IPC: G06F3/06

Abstract: An approach is provided for processing near-memory processing commands, e.g., PIM commands, using PIM register definition data that defines multiple combinations of source and/or destination registers to be used to process PIM commands. A particular combination of source and/or destination registers to be used to process a PIM command is specified by the PIM command or determined by a near-memory processing element processing the PIM command. According to another implementation, the PIM register definition data specifies an initial combination of source and/or destination registers and one or more update functions for each PIM command. A near-memory processing element processes a PIM command using the initial combination of source and/or destination registers and uses the one or more update functions to update the combination of source and/or destination registers to be used the next time the PIM command is processed.

2.

发明授权
Dynamic hardware selection for experts in mixture-of-experts model 有权

公开(公告)号：US11893502B2

公开(公告)日：2024-02-06

申请号：US15849633

申请日：2017-12-20

Applicant: Advanced Micro Devices, Inc.

Inventor： Nicholas Malaya , Nuwan Jayasena

IPC: G06N5/022 , G06N20/00 , G06F7/02

CPC classification number: G06N5/022 , G06N20/00 , G06F7/02

Abstract: A system assigns experts of a mixture-of-experts artificial intelligence model to processing devices in an automated manner. The system includes an orchestrator component that maintains priority data that stores, for each of a set of experts, and for each of a set of execution parameters, ranking information that ranks different processing devices for the particular execution parameter. In one example, for the execution parameter of execution speed, and for a first expert, the priority data indicates that a central processing unit (“CPU”) executes the first expert faster than a graphics processing unit (“GPU”). In this example, for the execution parameter of power consumption, and for the first expert, the priority data indicates that a GPU uses less power than a CPU. The priority data stores such information for one or more processing devices, one or more experts, and one or more execution characteristics.

3.

发明公开
DATA ENCRYPTION SUITABLE FOR USE IN SYSTEMS WITH PROCESSING-IN-MEMORY 审中-公开

公开(公告)号：US20240004801A1

公开(公告)日：2024-01-04

申请号：US17853340

申请日：2022-06-29

Applicant: Advanced Micro Devices, Inc.

Inventor： Nuwan Jayasena , Shaizeen Dilawarhusen Aga , Vignesh Adhinarayanan

IPC: G06F12/14 , G06F9/30

CPC classification number: G06F12/1408 , G06F9/3004 , G06F9/30029

Abstract: An encryption circuit includes an iterative block cipher circuit. The iterative block cipher circuit has a counter input for a row index, a key input for receiving a secret key, and an output for providing an encrypted counter value in response to performing a block cipher process using the row index as a counter the secret key. The encryption circuit uses the iterative block cipher circuit during a row operation to a memory.

4.

发明授权
Approach for reducing side effects of computation offload to memory 有权

公开(公告)号：US11847055B2

公开(公告)日：2023-12-19

申请号：US17364854

申请日：2021-06-30

Applicant: Advanced Micro Devices, Inc.

Inventor： Shaizeen Aga , Nuwan Jayasena

IPC: G06F12/00 , G06F12/084 , G06F12/02 , G06F12/0862 , G06F12/0811

CPC classification number: G06F12/084 , G06F12/0238 , G06F12/0811 , G06F12/0862

Abstract: A technical solution to the technical problem of how to reduce the undesirable side effects of offloading computations to memory uses read hints to preload results of memory-side processing into a processor-side cache. A cache controller, in response to identifying a read hint in a memory-side processing instruction, causes results of the memory-side processing to be preloaded into a processor-side cache. Implementations include, without limitation, enabling or disabling the preloading based upon cache thrashing levels, preloading results, or portions of results, of memory-side processing to particular destination caches, preloading results based upon priority and/or degree of confidence, and/or during periods of low data bus and/or command bus utilization, last stores considerations, and enforcing an ordering constraint to ensure that preloading occurs after memory-side processing results are complete.

5.

发明授权
System and method for coalesced multicast data transfers over memory interfaces 有权

公开(公告)号：US11803311B2

公开(公告)日：2023-10-31

申请号：US17218700

申请日：2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor： Johnathan Alsop , Nuwan Jayasena , Shaizeen Aga , Andrew McCrabb

IPC: G06F3/06

CPC classification number: G06F3/064 , G06F3/0604 , G06F3/0644 , G06F3/0659 , G06F3/0679

Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.

6.

发明公开
DEVICE AND METHOD FOR ACCELERATING MATRIX MULTIPLY OPERATIONS 审中-公开

公开(公告)号：US20230244751A1

公开(公告)日：2023-08-03

申请号：US18297230

申请日：2023-04-07

Applicant: Advanced Micro Devices, Inc.

Inventor： Shaizeen Aga , Nuwan Jayasena , Allen H. Rush , Michael Ignatowski

IPC: G06F17/16 , G06F7/53 , G06F15/80

CPC classification number: G06F17/16 , G06F7/5324 , G06F15/8007

Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

7.

发明公开
APPROACH FOR PERFORMING EFFICIENT MEMORY OPERATIONS USING NEAR-MEMORY COMPUTE ELEMENTS 审中-公开

公开(公告)号：US20230195618A1

公开(公告)日：2023-06-22

申请号：US17557568

申请日：2021-12-21

Applicant: Advanced Micro Devices, Inc.

Inventor： Shaizeen Aga , Johnathan Alsop , Nuwan Jayasena

IPC: G06F12/06

CPC classification number: G06F12/06

Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.

8.

发明授权
Method and apparatus to support instruction replay for executing idempotent code in dependent processing in memory devices 有权

公开(公告)号：US11656945B2

公开(公告)日：2023-05-23

申请号：US17133843

申请日：2020-12-24

Applicant: Advanced Micro Devices, Inc.

Inventor： John Kalamatianos , Nuwan Jayasena , Sudhanva Gurumurthi , Shaizeen Aga , Shrikanth Ganapathy

IPC: G06F11/14 , G06F9/38

CPC classification number: G06F11/141 , G06F9/3877

Abstract: Methods and processing devices are provided for error protection to support instruction replay for executing idempotent instructions at a processing in memory PIM device. The processing apparatus includes a PIM device configured to execute an idempotent instruction. The processing apparatus also includes a processor, in communication with the PIM device, configured to issue the idempotent instruction to the PIM device for execution at the PIM device and reissue the idempotent instruction to the PIM device when one of execution of the idempotent instruction at the PIM device results in an error and a predetermined latency period expires from when the idempotent instruction is issued.

9.

发明授权
Hardware-software collaborative address mapping scheme for efficient processing-in-memory systems 有权

公开(公告)号：US11487447B2

公开(公告)日：2022-11-01

申请号：US17006646

申请日：2020-08-28

Applicant: Advanced Micro Devices, Inc.

Inventor： Mahzabeen Islam , Shaizeen Aga , Nuwan Jayasena , Jagadish B. Kotra

IPC: G06F12/00 , G06F3/06 , G06F12/02

Abstract: Approaches are provided for implementing hardware-software collaborative address mapping schemes that enable mapping data elements which are accessed together in the same row of one bank or over the same rows of different banks to achieve higher performance by reducing row conflicts. Using an intra-bank frame striping policy (IBFS), corresponding subsets of data elements are interleaved into a single row of a bank. Using an intra-channel frame striping policy (ICFS), corresponding subsets of data elements are interleaved into a single channel row of a channel. A memory controller utilizes ICFS and/or IBFS to efficiently store and access data elements in memory, such as processing-in-memory (PIM) enabled memory.

10.

发明申请
APPROACH FOR ENFORCING ORDERING BETWEEN MEMORY-CENTRIC AND CORE-CENTRIC MEMORY OPERATIONS 有权

公开(公告)号：US20220317926A1

公开(公告)日：2022-10-06

申请号：US17219446

申请日：2021-03-31

Applicant: Advanced Micro Devices, Inc.

Inventor： Shaizeen Aga , Nuwan Jayasena , Johnathan Alsop

IPC: G06F3/06

Abstract: Ordering between memory-centric memory operations, referred to hereinafter as “MC-Mem-Ops,” and core-centric memory operations, referred to hereinafter as “CC-Mem-Ops,” is enforced using inter-centric fences, referred to hereinafter as an “IC-fences.” IC-fences are implemented by an ordering primitive or ordering instruction, that cause a memory controller, a cache controller, etc., to enforce ordering of MC-Mem-Ops and CC-Mem-Ops throughout the memory pipeline and at the memory controller by not reordering MC-Mem-Ops (or sometimes CC-Mem-Ops) that arrive before the IC-fence to after the IC-fence. Processing of an IC-fence also causes the memory controller to issue an ordering acknowledgment to the thread that issued the IC-fence instruction. IC-fences are tracked at the core and designated as complete when the ordering acknowledgment is received. Embodiments include a completion level-specific cache flush operation which, when used with an IC-fence, provides proper ordering between cached CC-Mem-Ops and MC-Mem-ops with reduced data transfer and completion times.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification