APPROACH FOR SUPPORTING MEMORY-CENTRIC OPERATIONS ON CACHED DATA

    公开(公告)号:US20230021492A1

    公开(公告)日:2023-01-26

    申请号:US17385783

    申请日:2021-07-26

    Abstract: A technical solution to the technical problem of how to support memory-centric operations on cached data uses a novel memory-centric memory operation that invokes write back functionality on cache controllers and memory controllers. The write back functionality enforces selective flushing of dirty, i.e., modified, cached data that is needed for memory-centric memory operations from caches to the completion level of the memory-centric memory operations, and updates the coherence state appropriately at each cache level. The technical solution ensures that commands to implement the selective cache flushing are ordered before the memory-centric memory operation at the completion level of the memory-centric memory operation.

    METHOD AND APPARATUS FOR PROVIDING PERSISTENCE TO REMOTE NON-VOLATILE MEMORY

    公开(公告)号:US20220091974A1

    公开(公告)日:2022-03-24

    申请号:US17031518

    申请日:2020-09-24

    Abstract: A processing device and methods of controlling remote persistent writes are provided. Methods include receiving an instruction of a program to issue a persistent write to remote memory. The methods also include logging an entry in a local domain when the persistent write instruction is received and providing a first indication that the persistent write will be persisted to the remote memory. The methods also include executing the persistent write to the remote memory and providing a second indication that the persistent write to the remote memory is completed. The methods also include providing the first and second indications when it is determined not to execute the persistent write according to global ordering and providing the second indication without providing the first indication when it is determined to execute the persistent write to remote memory according to global ordering.

    Command throughput in PIM-enabled memory using available data bus bandwidth

    公开(公告)号:US11262949B2

    公开(公告)日:2022-03-01

    申请号:US16885677

    申请日:2020-05-28

    Abstract: An approach is provided for reducing command bus traffic between memory controllers and PIM-enabled memory modules using special PIM commands. The term “special PIM command” is used herein to describe embodiments and refers to a PIM command for which the corresponding module-specific command information is provided to memory modules via a non-command bus data path. A memory controller generates and issues a special PIM command to multiple PIM-enabled memory modules via a command bus and provides module-specific command information (e.g., address information) for the special PIM command to the PIM-enabled memory modules via the non-command bus data path that is shared by the PIM-enabled memory modules and the memory controller.

    NEAR-MEMORY DATA REDUCTION
    14.
    发明申请

    公开(公告)号:US20210117133A1

    公开(公告)日:2021-04-22

    申请号:US16658733

    申请日:2019-10-21

    Abstract: An approach is provided for implementing near-memory data reduction during store operations to off-chip or off-die memory. A Near-Memory Reduction (NMR) unit provides near-memory data reduction during write operations to a specified address range. The NMR unit is configured with a range of addresses to be reduced and when a store operation specifies an address within the range of addresses, the NRM unit performs data reduction by adding the data value specified by the store operation to an accumulated reduction result. According to an embodiment, the NRM unit maintains a count of the number of updates to the accumulated reduction result that are used to determine when data reduction has been completed.

    Address mapping-aware tasking mechanism

    公开(公告)号:US12099866B2

    公开(公告)日:2024-09-24

    申请号:US17135381

    申请日:2020-12-28

    Abstract: An Address Mapping-Aware Tasking (AMAT) mechanism manages compute task data and issues compute tasks on behalf of threads that created the compute task data. The AMAT mechanism stores compute task data generated by host threads in a set of partitions, where each partition is designated for a particular memory module. The AMAT mechanism maintains address mapping data that maps address information to partitions. Threads push compute task data to the AMAT mechanism instead of generating and issuing their own compute tasks. The AMAT mechanism uses address information included in the compute task data and the address mapping data to determine partitions in which to store the compute task data. The AMAT mechanism then issues compute tasks to be executed near the corresponding memory modules (i.e., in PIM execution units or NUMA compute nodes) based upon the compute task data stored in the partitions.

    DYNAMIC CONTROL OF WORK SCHEDULING
    16.
    发明公开

    公开(公告)号:US20240220315A1

    公开(公告)日:2024-07-04

    申请号:US18091443

    申请日:2022-12-30

    CPC classification number: G06F9/4881 G06F9/52

    Abstract: A processing system includes a scheduling mechanism for producing data for fine-grained reordering of workgroups of a kernel to produce blocks of data, such as for communication across devices to enable overlapping of a producer computation with an all-reduce communication across the network. This scheduling mechanism enables a first parallel processor to schedule and execute a set of workgroups of a producer operation to generate data for transmission to a second parallel processor in a desired traffic pattern. At the same time, the second parallel processor schedules and executes a different set of workgroups of the producer operation to generate data for transmission in a desired traffic pattern to a third parallel processor or back to the first parallel processor.

    NEAR-MEMORY DETERMINATION OF REGISTERS

    公开(公告)号:US20220197647A1

    公开(公告)日:2022-06-23

    申请号:US17126977

    申请日:2020-12-18

    Abstract: A memory module includes register selection logic to select alternate local source and/or destination registers to process PIM commands. The register selection logic uses an address-based register selection approach to select an alternate local source and/or destination register based upon address data specified by a PIM command and a split address maintained by a memory module. The register selection logic may alternatively use a register data-based approach to select an alternate local source and/or destination register based upon data stored in one or more local registers. A PIM-enabled memory module configured with the register selection logic described herein is capable of selecting an alternate local source and/or destination register to process PIM commands at or near the PIM execution unit where the PIM commands are executed.

    Memory access commands with near-memory address generation

    公开(公告)号:US11216373B2

    公开(公告)日:2022-01-04

    申请号:US16887713

    申请日:2020-05-29

    Abstract: A memory controller may be configured with command logic that is capable of sending a memory access command having incomplete address information via a command/address bus that connects the memory controller to memory modules. The memory controller may send the memory access command via the bus for accessing data stored at memory locations of the memory modules. The memory locations may correspond to different near-memory generated reflecting that the data is not address aligned across the memory modules. Nonetheless, because of the near-memory address generation, the memory controller can send the memory access command having incomplete address information for accessing the data stored at the different addresses, as opposed to having to send multiple memory access commands specifying complete address information on the bus for accessing the data at the different addresses, thereby conserving usage of the available bus bandwidth, reducing power consumption, and increasing compute throughput.

Patent Agency Ranking