-
公开(公告)号:US20210374055A1
公开(公告)日:2021-12-02
申请号:US16887713
申请日:2020-05-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena , Johnathan Alsop
IPC: G06F12/06 , G11C11/408
Abstract: A memory controller may be configured with command logic that is capable of sending a memory access command having incomplete address information via a command/address bus that connects the memory controller to memory modules. The memory controller may send the memory access command via the bus for accessing data stored at memory locations of the memory modules. The memory locations may correspond to different near-memory generated reflecting that the data is not address aligned across the memory modules. Nonetheless, because of the near-memory address generation, the memory controller can send the memory access command having incomplete address information for accessing the data stored at the different addresses, as opposed to having to send multiple memory access commands specifying complete address information on the bus for accessing the data at the different addresses, thereby conserving usage of the available bus bandwidth, reducing power consumption, and increasing compute throughput.
-
公开(公告)号:US20210209192A1
公开(公告)日:2021-07-08
申请号:US17208526
申请日:2021-03-22
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena , Allen H. Rush , Michael Ignatowski
Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
-
公开(公告)号:US10782918B2
公开(公告)日:2020-09-22
申请号:US16123837
申请日:2018-09-06
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena
Abstract: Methods, systems, and devices for near-memory data-dependent gathering and packing of data stored in a memory. A processing device extracts a function, a memory source address, and a memory destination address from a near-memory data-dependent gathering and packing primitive. A signal to perform gathering and packing operations based on the primitive is sent to near-memory processing circuitry of a memory device. The near-memory processing circuitry receives the signal, gathers data from the memory device based on the function and the memory source address, and packs the gathered data into the memory device based on the memory destination address.
-
公开(公告)号:US11966328B2
公开(公告)日:2024-04-23
申请号:US17126977
申请日:2020-12-18
Applicant: Advanced Micro Devices, Inc.
Inventor: Onur Kayiran , Mohamed Assem Ibrahim , Shaizeen Aga
IPC: G06F12/06
CPC classification number: G06F12/06 , G06F2212/1041
Abstract: A memory module includes register selection logic to select alternate local source and/or destination registers to process PIM commands. The register selection logic uses an address-based register selection approach to select an alternate local source and/or destination register based upon address data specified by a PIM command and a split address maintained by a memory module. The register selection logic may alternatively use a register data-based approach to select an alternate local source and/or destination register based upon data stored in one or more local registers. A PIM-enabled memory module configured with the register selection logic described herein is capable of selecting an alternate local source and/or destination register to process PIM commands at or near the PIM execution unit where the PIM commands are executed.
-
公开(公告)号:US11847055B2
公开(公告)日:2023-12-19
申请号:US17364854
申请日:2021-06-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena
IPC: G06F12/00 , G06F12/084 , G06F12/02 , G06F12/0862 , G06F12/0811
CPC classification number: G06F12/084 , G06F12/0238 , G06F12/0811 , G06F12/0862
Abstract: A technical solution to the technical problem of how to reduce the undesirable side effects of offloading computations to memory uses read hints to preload results of memory-side processing into a processor-side cache. A cache controller, in response to identifying a read hint in a memory-side processing instruction, causes results of the memory-side processing to be preloaded into a processor-side cache. Implementations include, without limitation, enabling or disabling the preloading based upon cache thrashing levels, preloading results, or portions of results, of memory-side processing to particular destination caches, preloading results based upon priority and/or degree of confidence, and/or during periods of low data bus and/or command bus utilization, last stores considerations, and enforcing an ordering constraint to ensure that preloading occurs after memory-side processing results are complete.
-
公开(公告)号:US11803311B2
公开(公告)日:2023-10-31
申请号:US17218700
申请日:2021-03-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Alsop , Nuwan Jayasena , Shaizeen Aga , Andrew McCrabb
IPC: G06F3/06
CPC classification number: G06F3/064 , G06F3/0604 , G06F3/0644 , G06F3/0659 , G06F3/0679
Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
-
公开(公告)号:US20230244751A1
公开(公告)日:2023-08-03
申请号:US18297230
申请日:2023-04-07
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Nuwan Jayasena , Allen H. Rush , Michael Ignatowski
CPC classification number: G06F17/16 , G06F7/5324 , G06F15/8007
Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
-
公开(公告)号:US11693725B2
公开(公告)日:2023-07-04
申请号:US17536817
申请日:2021-11-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Johnathan Alsop , Shaizeen Aga
CPC classification number: G06F11/0772 , G06F11/141 , G06F11/1471 , G06F11/24
Abstract: Detecting execution hazards in offloaded operations is disclosed. A second offload operation is compared to a first offload operation that precedes the second offload operation. It is determined whether the second offload operation creates an execution hazard on an offload target device based on the comparison of the second offload operation to the first offload operation. If the execution hazard is detected, an error handling operation may be performed. In some examples, the offload operations are processing-in-memory operations.
-
29.
公开(公告)号:US20230195618A1
公开(公告)日:2023-06-22
申请号:US17557568
申请日:2021-12-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Johnathan Alsop , Nuwan Jayasena
IPC: G06F12/06
CPC classification number: G06F12/06
Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.
-
公开(公告)号:US11656945B2
公开(公告)日:2023-05-23
申请号:US17133843
申请日:2020-12-24
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos , Nuwan Jayasena , Sudhanva Gurumurthi , Shaizeen Aga , Shrikanth Ganapathy
CPC classification number: G06F11/141 , G06F9/3877
Abstract: Methods and processing devices are provided for error protection to support instruction replay for executing idempotent instructions at a processing in memory PIM device. The processing apparatus includes a PIM device configured to execute an idempotent instruction. The processing apparatus also includes a processor, in communication with the PIM device, configured to issue the idempotent instruction to the PIM device for execution at the PIM device and reissue the idempotent instruction to the PIM device when one of execution of the idempotent instruction at the PIM device results in an error and a predetermined latency period expires from when the idempotent instruction is issued.
-
-
-
-
-
-
-
-
-