Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host

    Publication No.: US11921634B2

    Publication Date: 2024-03-05

    Application No.: US17564155

    Application Date: 2021-12-28

    CPC classification number: G06F12/0811

    Abstract: Leveraging processing-in-memory (PIM) resources to expedite non-PIM instructions executed on a host is disclosed. In an implementation, a memory controller identifies a first write instruction to write first data to a first memory location, where the first write instruction is not a processing-in-memory (PIM) instruction. The memory controller then writes the first data to a first PIM register. Opportunistically, the memory controller moves the first data from the first PIM register to the first memory location. In another implementation, a memory controller identifies a first memory location associated with a first read instruction, where the first read instruction is not a processing-in-memory (PIM) instruction. The memory controller identifies that a PIM register is associated with the first memory location. The memory controller then reads, in response to the first read instruction, first data from the PIM register.
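The write path and read path described in this abstract can be sketched as a small model: a memory controller stages non-PIM writes in otherwise-idle PIM registers, drains them to memory opportunistically, and serves non-PIM reads from a PIM register when one currently holds the requested address. The class and method names below are illustrative assumptions, not terms from the patent.

```python
# Toy model of the staging behavior described in the abstract.
# All names (MemoryController, drain_one, etc.) are hypothetical.

class MemoryController:
    def __init__(self, num_pim_regs=4):
        self.memory = {}                 # address -> data (backing DRAM model)
        self.pim_regs = {}               # register id -> (address, data)
        self.addr_to_reg = {}            # address -> register id holding its data
        self.free_regs = list(range(num_pim_regs))

    def write(self, address, data):
        """Non-PIM write: stage the data in a free PIM register if possible."""
        if address in self.addr_to_reg:          # update the staged copy in place
            reg = self.addr_to_reg[address]
            self.pim_regs[reg] = (address, data)
        elif self.free_regs:
            reg = self.free_regs.pop()
            self.pim_regs[reg] = (address, data)
            self.addr_to_reg[address] = reg
        else:
            self.memory[address] = data          # no register free: write through

    def drain_one(self):
        """Opportunistically move one staged value from a PIM register to memory."""
        if self.addr_to_reg:
            address, reg = self.addr_to_reg.popitem()
            _, data = self.pim_regs.pop(reg)
            self.memory[address] = data
            self.free_regs.append(reg)

    def read(self, address):
        """Non-PIM read: serve from a PIM register when one maps to the address."""
        if address in self.addr_to_reg:
            return self.pim_regs[self.addr_to_reg[address]][1]
        return self.memory.get(address)
```

In this sketch, a read issued between the write and the drain is satisfied directly from the PIM register, which is the latency win the abstract describes.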

    DRAM Row Management for Processing in Memory
    Invention Publication

    Publication No.: US20240004584A1

    Publication Date: 2024-01-04

    Application No.: US17855109

    Application Date: 2022-06-30

    CPC classification number: G06F3/0659 G06F3/0653 G06F3/0679 G06F3/0604

    Abstract: In accordance with described techniques for DRAM row management for processing in memory, a plurality of instructions are obtained for execution by a processing in memory component embedded in a dynamic random access memory. An instruction is identified that last accesses a row of the dynamic random access memory, and a subsequent instruction is identified that first accesses an additional row of the dynamic random access memory. A first command is issued to close the row and a second command is issued to open the additional row after the row is last accessed by the instruction.
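The row-management policy in this abstract can be illustrated with a toy scheduler: given the DRAM row each instruction in the stream touches, it emits a precharge (close) once a row's last access has passed and an activate (open) for the next row. The sketch below detects this at the row transition, which is equivalent for an in-order trace; command and function names are assumptions for illustration.

```python
# Illustrative row open/close scheduling for an in-order access stream.
# "PRECHARGE" closes a DRAM row; "ACTIVATE" opens one.

def schedule_row_commands(accesses):
    """accesses: list of row ids in program order. Returns a command trace."""
    trace = []
    open_row = None
    for row in accesses:
        if row != open_row:
            if open_row is not None:
                trace.append(("PRECHARGE", open_row))   # close the finished row
            trace.append(("ACTIVATE", row))             # open the new row
            open_row = row
        trace.append(("ACCESS", row))
    return trace
```

For the stream [5, 5, 9], row 5 is closed immediately after its last access and row 9 is opened before its first, matching the first/last-access pairing the abstract describes.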

    MECHANISM FOR REDUCING COHERENCE DIRECTORY CONTROLLER OVERHEAD FOR NEAR-MEMORY COMPUTE ELEMENTS

    Publication No.: US20230244496A1

    Publication Date: 2023-08-03

    Application No.: US18132879

    Application Date: 2023-04-10

    Abstract: A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the coherence directory controller. The PimPF maintains a directory according to address signatures that define the memory addresses affected by a broadcast PIM command. Two implementations are described: a lightweight implementation that accelerates PIM loads into registers, and a heavyweight implementation that accelerates both PIM loads into registers and PIM stores into memory.
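The probe-filter idea above (maintained separately from the system-level directory, and shared with the granted patent listed next) can be sketched as a small directory keyed by an address signature covering all banks touched by a broadcast PIM command. The signature encoding (base address plus bank mask) and the API below are assumptions for illustration; the patent defines its own representation.

```python
# Hypothetical sketch of a signature-indexed probe filter for broadcast
# PIM commands. One directory entry covers every per-bank address implied
# by the broadcast, so a single lookup replaces many per-line probes.

class PimProbeFilter:
    def __init__(self):
        # signature -> set of sharer ids whose caches may hold the lines
        self.directory = {}

    def _signature(self, base_addr, bank_mask):
        return (base_addr, bank_mask)

    def record_cached(self, base_addr, bank_mask, sharer):
        """Note that a sharer's cache may hold lines covered by the signature."""
        sig = self._signature(base_addr, bank_mask)
        self.directory.setdefault(sig, set()).add(sharer)

    def sharers_to_probe(self, base_addr, bank_mask):
        """On a broadcast PIM load, return only the caches that must be probed;
        an empty set means the command can proceed without any probes."""
        return self.directory.get(self._signature(base_addr, bank_mask), set())

    def invalidate(self, base_addr, bank_mask):
        """Heavyweight path (PIM stores): drop the entry after invalidation."""
        self.directory.pop(self._signature(base_addr, bank_mask), None)
```

The lightweight/heavyweight split in the abstract maps roughly onto the read-only lookup versus the store-side invalidation in this sketch.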

    Mechanism for reducing coherence directory controller overhead for near-memory compute elements

    Publication No.: US11625251B1

    Publication Date: 2023-04-11

    Application No.: US17561112

    Application Date: 2021-12-23

    Abstract: A parallel processing (PP) level coherence directory, also referred to as a Processing In-Memory Probe Filter (PimPF), is added to a coherence directory controller. When the coherence directory controller receives a broadcast PIM command from a host, or a PIM command that is directed to multiple memory banks in parallel, the PimPF accelerates processing of the PIM command by maintaining a directory for cache coherence that is separate from existing system level directories in the coherence directory controller. The PimPF maintains a directory according to address signatures that define the memory addresses affected by a broadcast PIM command. Two implementations are described: a lightweight implementation that accelerates PIM loads into registers, and a heavyweight implementation that accelerates both PIM loads into registers and PIM stores into memory.

    DYNAMICALLY CONFIGURABLE OVERPROVISIONED MICROPROCESSOR

    Publication No.: US20220100563A1

    Publication Date: 2022-03-31

    Application No.: US17037727

    Application Date: 2020-09-30

    Abstract: A dynamically configurable overprovisioned microprocessor optimally supports a variety of different compute application workloads, with the capability to trade off among compute performance, energy consumption, and clock frequency on a per-compute-application basis, using general-purpose microprocessor designs. In some embodiments, the overprovisioned microprocessor comprises a physical compute resource and a dynamic configuration logic configured to: detect an activation-warranting operating condition; undarken the physical compute resource responsive to detecting the activation-warranting operating condition; detect a configuration-warranting operating condition; and configure the overprovisioned microprocessor to use the undarkened physical compute resource responsive to detecting the configuration-warranting operating condition.
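The detect/undarken/detect/configure sequence in the abstract is essentially a small state machine, which can be sketched as below. The state names and condition predicates are placeholders; the patent's actual operating conditions are not specified here.

```python
# Toy state machine for the activate-then-configure sequence: a dark
# (power-gated) resource is first undarkened, and only then routed work.

class DynamicConfigLogic:
    DARK, POWERED, IN_USE = "dark", "powered", "in_use"

    def __init__(self):
        self.state = self.DARK

    def step(self, activation_warranted, configuration_warranted):
        if self.state == self.DARK and activation_warranted:
            self.state = self.POWERED     # undarken the physical resource
        elif self.state == self.POWERED and configuration_warranted:
            self.state = self.IN_USE      # configure the processor to use it
        return self.state
```

Note the ordering the sketch enforces: a configuration-warranting condition has no effect while the resource is still dark, mirroring the two-step sequence in the abstract.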

    Interconnect architecture for three-dimensional processing systems

    Publication No.: US10984838B2

    Publication Date: 2021-04-20

    Application No.: US14944099

    Application Date: 2015-11-17

    Abstract: A processing system includes a plurality of processor cores formed in a first layer of an integrated circuit device and a plurality of partitions of memory formed in one or more second layers of the integrated circuit device. The one or more second layers are deployed in a stacked configuration with the first layer. Each of the partitions is associated with a subset of the processor cores that have overlapping footprints with the partitions. The processing system also includes first memory paths between the processor cores and their corresponding subsets of partitions. The processing system further includes second memory paths between the processor cores and the partitions.
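The two-tier path structure above can be modeled minimally: a core reaches the memory partition sharing its footprint over a short vertical (first) path, and any other partition over a longer (second) path. The grouping rule and latency values below are invented purely for illustration.

```python
# Minimal cost model of the first/second memory paths in a stacked design.
NEAR_LATENCY, FAR_LATENCY = 1, 4   # arbitrary relative costs, not from the patent

def access_latency(core, partition, cores_per_partition=2):
    """Each subset of cores shares the footprint of exactly one partition."""
    if core // cores_per_partition == partition:
        return NEAR_LATENCY    # first memory path: overlapping footprint
    return FAR_LATENCY         # second memory path: cross-stack access
```

A scheduler built on such a model would prefer placing a core's working set in the partition directly above it, which is the locality the stacked layout is designed to exploit.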

    Mechanism for reducing page migration overhead in memory systems

    Publication No.: US10339067B2

    Publication Date: 2019-07-02

    Application No.: US15626623

    Application Date: 2017-06-19

    Abstract: A technique for use in a memory system includes swapping a first plurality of pages of a first memory of the memory system with a second plurality of pages of a second memory of the memory system. The first memory has a first latency and the second memory has a second latency. The first latency is less than the second latency. The technique includes updating a page table and triggering a translation lookaside buffer shootdown to associate a virtual address of each of the first plurality of pages with a corresponding physical address in the second memory and to associate a virtual address for each of the second plurality of pages with a corresponding physical address in the first memory.
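The swap-and-shootdown sequence in this abstract can be sketched with toy data structures: exchange the physical frames behind two sets of virtual pages in the page table, then flush every TLB so no stale translation survives. The function signature and the dictionary-based page table are assumptions for illustration.

```python
# Sketch of the page swap described in the abstract: after the swap, each
# fast-memory page's data lives in the slow frame and vice versa, so the
# virtual-to-physical mappings are exchanged and all TLBs are flushed.

def swap_pages(page_table, fast_pages, slow_pages, tlbs):
    """fast_pages/slow_pages: lists of (virtual_addr, physical_addr) pairs.
    tlbs: per-core translation caches modeled as dicts."""
    for (va_fast, pa_fast), (va_slow, pa_slow) in zip(fast_pages, slow_pages):
        page_table[va_fast] = pa_slow    # hot data now in the fast frame's old slot
        page_table[va_slow] = pa_fast
    for tlb in tlbs:
        tlb.clear()                      # TLB shootdown: drop stale translations
    return page_table
```

The shootdown is the overhead the patent targets: every core's TLB must be invalidated before any core may use the new mappings, which is why batching many page swaps per shootdown pays off.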
