MEMORY ACCESS COMMANDS WITH NEAR-MEMORY ADDRESS GENERATION

    公开(公告)号:US20210374055A1

    公开(公告)日:2021-12-02

    申请号:US16887713

    申请日:2020-05-29

    Abstract: A memory controller may be configured with command logic that is capable of sending a memory access command having incomplete address information via a command/address bus that connects the memory controller to memory modules. The memory controller may send the memory access command via the bus for accessing data stored at memory locations of the memory modules. The memory locations may correspond to different near-memory generated reflecting that the data is not address aligned across the memory modules. Nonetheless, because of the near-memory address generation, the memory controller can send the memory access command having incomplete address information for accessing the data stored at the different addresses, as opposed to having to send multiple memory access commands specifying complete address information on the bus for accessing the data at the different addresses, thereby conserving usage of the available bus bandwidth, reducing power consumption, and increasing compute throughput.

    DEVICE AND METHOD FOR ACCELERATING MATRIX MULTIPLY OPERATIONS

    公开(公告)号:US20210209192A1

    公开(公告)日:2021-07-08

    申请号:US17208526

    申请日:2021-03-22

    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

    Near-memory data-dependent gather and packing

    公开(公告)号:US10782918B2

    公开(公告)日:2020-09-22

    申请号:US16123837

    申请日:2018-09-06

    Abstract: Methods, systems, and devices for near-memory data-dependent gathering and packing of data stored in a memory. A processing device extracts a function, a memory source address, and a memory destination address from a near-memory data-dependent gathering and packing primitive. A signal to perform gathering and packing operations based on the primitive is sent to near-memory processing circuitry of a memory device. The near-memory processing circuitry receives the signal, gathers data from the memory device based on the function and the memory source address, and packs the gathered data into the memory device based on the memory destination address.

    Near-memory determination of registers

    公开(公告)号:US11966328B2

    公开(公告)日:2024-04-23

    申请号:US17126977

    申请日:2020-12-18

    CPC classification number: G06F12/06 G06F2212/1041

    Abstract: A memory module includes register selection logic to select alternate local source and/or destination registers to process PIM commands. The register selection logic uses an address-based register selection approach to select an alternate local source and/or destination register based upon address data specified by a PIM command and a split address maintained by a memory module. The register selection logic may alternatively use a register data-based approach to select an alternate local source and/or destination register based upon data stored in one or more local registers. A PIM-enabled memory module configured with the register selection logic described herein is capable of selecting an alternate local source and/or destination register to process PIM commands at or near the PIM execution unit where the PIM commands are executed.

    Approach for reducing side effects of computation offload to memory

    公开(公告)号:US11847055B2

    公开(公告)日:2023-12-19

    申请号:US17364854

    申请日:2021-06-30

    CPC classification number: G06F12/084 G06F12/0238 G06F12/0811 G06F12/0862

    Abstract: A technical solution to the technical problem of how to reduce the undesirable side effects of offloading computations to memory uses read hints to preload results of memory-side processing into a processor-side cache. A cache controller, in response to identifying a read hint in a memory-side processing instruction, causes results of the memory-side processing to be preloaded into a processor-side cache. Implementations include, without limitation, enabling or disabling the preloading based upon cache thrashing levels, preloading results, or portions of results, of memory-side processing to particular destination caches, preloading results based upon priority and/or degree of confidence, and/or during periods of low data bus and/or command bus utilization, last stores considerations, and enforcing an ordering constraint to ensure that preloading occurs after memory-side processing results are complete.

    DEVICE AND METHOD FOR ACCELERATING MATRIX MULTIPLY OPERATIONS

    公开(公告)号:US20230244751A1

    公开(公告)日:2023-08-03

    申请号:US18297230

    申请日:2023-04-07

    CPC classification number: G06F17/16 G06F7/5324 G06F15/8007

    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

    APPROACH FOR PERFORMING EFFICIENT MEMORY OPERATIONS USING NEAR-MEMORY COMPUTE ELEMENTS

    公开(公告)号:US20230195618A1

    公开(公告)日:2023-06-22

    申请号:US17557568

    申请日:2021-12-21

    CPC classification number: G06F12/06

    Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.

Patent Agency Ranking