MEMORY CONTROLLER ARCHITECTURE
    Invention Publication

    Publication No.: US20240004799A1

    Publication Date: 2024-01-04

    Application No.: US18202802

    Filing Date: 2023-05-26

    CPC classification number: G06F12/0897 G06F11/1064 G06F2212/1032

    Abstract: An apparatus can include a plurality of memory devices and a memory controller coupled to the plurality of memory devices via a plurality of memory channels. The plurality of memory channels are organized as a plurality of channel groups. The memory controller comprises a plurality of memory access request/response buffer sets, and each memory access request/response buffer set of the plurality of memory access request/response buffer sets corresponds to a different one of the plurality of channel groups.
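
    As a rough illustration (not code from the patent), the C sketch below models one request/response buffer set per channel group, with a simple address-interleaved mapping from request address to group; NUM_CHANNEL_GROUPS, BUFFER_DEPTH, and the 64-byte interleave granularity are assumed values.

        #include <stdint.h>
        #include <stdio.h>

        #define NUM_CHANNEL_GROUPS 4
        #define BUFFER_DEPTH       8

        typedef struct {
            uint64_t addr;
            int      is_write;
        } mem_request_t;

        typedef struct {
            mem_request_t requests[BUFFER_DEPTH];  /* pending requests for this group  */
            uint64_t      responses[BUFFER_DEPTH]; /* read data returned by this group */
            int           req_count;
            int           resp_count;
        } buffer_set_t;

        /* One request/response buffer set per channel group. */
        static buffer_set_t buffer_sets[NUM_CHANNEL_GROUPS];

        /* Map an address to a channel group (assumed interleave on bits above a
         * 64-byte line) and enqueue the request in that group's buffer set. */
        static int enqueue_request(uint64_t addr, int is_write) {
            int group = (int)((addr >> 6) % NUM_CHANNEL_GROUPS);
            buffer_set_t *set = &buffer_sets[group];
            if (set->req_count == BUFFER_DEPTH)
                return -1; /* buffer set full; caller must retry */
            set->requests[set->req_count++] = (mem_request_t){ addr, is_write };
            return group;
        }

        int main(void) {
            for (uint64_t a = 0; a < 8 * 64; a += 64)
                printf("addr 0x%llx -> channel group %d\n",
                       (unsigned long long)a, enqueue_request(a, 0));
            return 0;
        }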

    Mechanism to trigger early termination of cooperating processes

    Publication No.: US11789790B2

    Publication Date: 2023-10-17

    Application No.: US17984817

    Filing Date: 2022-11-10

    CPC classification number: G06F9/542 G06F9/3009 G06F9/546

    Abstract: Devices and techniques for triggering early termination of cooperating processes in a processor are described herein. A system includes multiple memory-compute nodes, wherein a memory-compute node comprises: event manager circuitry configured to establish a broadcast channel to receive event messages; and thread manager circuitry configured to organize a plurality of threads to perform portions of a cooperative task, wherein the plurality of threads each monitor the broadcast channel to receive event messages on the broadcast channel, and wherein upon achieving a threshold operation, the thread manager circuitry is to use the event manager circuitry to broadcast, on the broadcast channel, an event message indicating that the cooperative task is complete, causing other threads, in response to receiving the event message, to terminate execution of their respective portions of the cooperative task.
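
    A minimal C sketch of the broadcast idea, using POSIX threads and a shared atomic flag as a stand-in for the event manager's broadcast channel (the flag, the thread count, and the search workload are illustrative assumptions, not the patented hardware): each worker covers part of a cooperative search, and the first to hit the target "broadcasts" completion so the others terminate early.

        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>

        #define NUM_THREADS 4
        #define N 1000000

        static int data[N];
        static atomic_int task_complete = 0; /* stand-in for the broadcast channel */

        typedef struct { int start, end, target, found_at; } work_t;

        static void *worker(void *arg) {
            work_t *w = (work_t *)arg;
            w->found_at = -1;
            for (int i = w->start; i < w->end; i++) {
                if (atomic_load(&task_complete))     /* another thread already finished */
                    return NULL;
                if (data[i] == w->target) {
                    w->found_at = i;
                    atomic_store(&task_complete, 1); /* broadcast: task is complete */
                    return NULL;
                }
            }
            return NULL;
        }

        int main(void) {
            for (int i = 0; i < N; i++) data[i] = i;
            pthread_t tid[NUM_THREADS];
            work_t w[NUM_THREADS];
            int chunk = N / NUM_THREADS;
            for (int t = 0; t < NUM_THREADS; t++) {
                w[t] = (work_t){ t * chunk, (t + 1) * chunk, 123456, -1 };
                pthread_create(&tid[t], NULL, worker, &w[t]);
            }
            for (int t = 0; t < NUM_THREADS; t++) {
                pthread_join(tid[t], NULL);
                if (w[t].found_at >= 0)
                    printf("thread %d found the target at index %d\n", t, w[t].found_at);
            }
            return 0;
        }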

    DEBUGGING DATAFLOW COMPUTER ARCHITECTURES

    Publication No.: US20230079727A1

    Publication Date: 2023-03-16

    Application No.: US17991390

    Filing Date: 2022-11-21

    Abstract: Disclosed in some examples are methods, systems, devices, and machine-readable mediums that use parallel hardware execution with software co-simulation to enable more advanced debugging operations on data flow architectures. Upon a halt to execution of a program thread, the state of the tiles that are executing the thread is saved and offloaded from the HTF to a host system. A developer may then examine this state on the host system to debug their program. Additionally, the state may be loaded into a software simulator that simulates the HTF hardware. This simulator allows the developer to step through the code and examine values to find bugs.

    Debugging dataflow computer architectures

    Publication No.: US11507493B1

    Publication Date: 2022-11-22

    Application No.: US17405211

    Filing Date: 2021-08-18

    Abstract: Disclosed in some examples are methods, systems, devices, and machine-readable mediums that use parallel hardware execution with software co-simulation to enable more advanced debugging operations on data flow architectures. Upon a halt to execution of a program thread, the state of the tiles that are executing the thread is saved and offloaded from the HTF to a host system. A developer may then examine this state on the host system to debug their program. Additionally, the state may be loaded into a software simulator that simulates the HTF hardware. This simulator allows the developer to step through the code and examine values to find bugs.
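
    A minimal C sketch of the save-and-offload step shared by the two filings above, assuming an invented tile_state_t layout (the HTF's actual per-tile state is not described here); it only shows capturing per-tile state on a halt into a file that a host-side debugger or software simulator could reload.

        #include <stdint.h>
        #include <stdio.h>

        #define NUM_TILES 8
        #define NUM_REGS  16

        typedef struct {
            uint32_t tile_id;
            uint32_t pc;             /* program counter at the halt point */
            uint64_t regs[NUM_REGS]; /* datapath register contents        */
        } tile_state_t;

        /* Dump the captured state of every tile running the halted thread so a
         * host-side debugger or software simulator can reload it. */
        static void offload_state(const tile_state_t *tiles, int count, FILE *out) {
            fwrite(tiles, sizeof(tile_state_t), (size_t)count, out);
        }

        int main(void) {
            tile_state_t tiles[NUM_TILES] = {0};
            for (uint32_t i = 0; i < NUM_TILES; i++) {
                tiles[i].tile_id = i;
                tiles[i].pc = 0x100 + 4 * i;
            }
            FILE *out = fopen("tile_state.bin", "wb");
            if (!out) return 1;
            offload_state(tiles, NUM_TILES, out);
            fclose(out);
            printf("saved state for %d tiles\n", NUM_TILES);
            return 0;
        }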

    TECHNIQUES FOR DATA TRANSFER BETWEEN TIERED MEMORY DEVICES

    Publication No.: US20250036284A1

    Publication Date: 2025-01-30

    Application No.: US18774784

    Filing Date: 2024-07-16

    Abstract: Methods, systems, and devices for techniques for data transfer between tiered memory devices are described. A memory system may include a data transfer engine to manage data transfers between different tiers of memory devices within the memory system. The data transfer engine may receive a command which includes a set of source addresses of each of a set of data sets and a set of destination addresses to which the data sets are to be transferred. The data transfer engine may schedule and perform a transfer operation to transfer each of the set of data sets from the respective source address to the respective destination address. The command may further include an indication of an interrupt policy of a set of interrupt policies supported by the data transfer engine. The set of interrupt policies may determine how the data transfer engine may handle interruptions to the data transfer operation.
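
    A minimal C sketch of such a transfer command, using invented names (transfer_cmd_t, interrupt_policy_t) and an assumed set of policies; a real data transfer engine would schedule the per-data-set copies across memory tiers and consult the policy when the operation is interrupted.

        #include <stdio.h>
        #include <string.h>

        typedef enum {
            INTR_ABORT,   /* drop the remaining data sets on interrupt */
            INTR_RESUME,  /* remember progress and resume later        */
            INTR_RESTART  /* restart the whole command from scratch    */
        } interrupt_policy_t;

        typedef struct {
            const void        **src;    /* source address of each data set      */
            void              **dst;    /* destination address of each data set */
            size_t             *len;    /* length of each data set              */
            int                 count;  /* number of data sets in the command   */
            interrupt_policy_t  policy; /* how the engine handles interruptions */
        } transfer_cmd_t;

        /* Perform each data-set transfer in turn; an engine would also check for
         * interrupts between data sets and apply the command's policy. */
        static void run_transfer(const transfer_cmd_t *cmd) {
            for (int i = 0; i < cmd->count; i++)
                memcpy(cmd->dst[i], cmd->src[i], cmd->len[i]);
        }

        int main(void) {
            char fast_tier[16] = "hot data";
            char slow_tier[16] = {0};
            const void *src[] = { fast_tier };
            void       *dst[] = { slow_tier };
            size_t      len[] = { sizeof fast_tier };
            transfer_cmd_t cmd = { src, dst, len, 1, INTR_RESUME };
            run_transfer(&cmd);
            printf("moved: %s\n", slow_tier);
            return 0;
        }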

    SYSTEMS AND METHODS FOR PARALLELIZING LOOPS THAT HAVE LOOP-DEPENDENT VARIABLES

    Publication No.: US20250021317A1

    Publication Date: 2025-01-16

    Application No.: US18768477

    Filing Date: 2024-07-10

    Abstract: Devices and techniques for parallelizing loops that have loop-dependent variables are described herein. A system includes a processing device; and a memory device configured to store instructions, which when executed by the processing device, cause the processing device to perform operations comprising: accessing, by a compiler executing on a processing device, a computer code listing; determining that the computer code listing includes a loop with a loop-carried dependency variable; optimizing the loop for parallel execution by removing the loop-carried dependency variable; and compiling the computer code listing into executable software code with the loop executable in parallel in hardware.
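
    A minimal C sketch of one common form of this transformation (not necessarily the exact transform the compiler described here applies): a running offset carried from one iteration to the next is rewritten as a closed-form function of the loop index, so every iteration becomes independent and can execute in parallel.

        #include <stdio.h>

        #define N 8

        int main(void) {
            int out_serial[N], out_parallel[N];

            /* Original loop: "offset" is a loop-carried dependency, so the
             * iterations cannot be reordered or run in parallel. */
            int offset = 10;
            for (int i = 0; i < N; i++) {
                out_serial[i] = offset;
                offset += 3;
            }

            /* Transformed loop: offset is computed from i alone, so every
             * iteration is independent and could execute in parallel. */
            for (int i = 0; i < N; i++)
                out_parallel[i] = 10 + 3 * i;

            for (int i = 0; i < N; i++)
                printf("%d %d\n", out_serial[i], out_parallel[i]);
            return 0;
        }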

    Chained resource locking
    Invention Grant

    Publication No.: US12182635B2

    Publication Date: 2024-12-31

    Application No.: US17405457

    Filing Date: 2021-08-18

    Abstract: Devices and techniques for chained resource locking are described herein. Threads form a last-in-first-out (LIFO) queue on a resource lock to create a chained lock on the resource. A data store representing the lock for the resource holds the previous thread's identifier, enabling a subsequent thread to wake the previous thread using the identifier when the subsequent thread releases the lock. Generally, the thread releasing the lock need not interact with the data store, reducing contention for the data store among many threads.
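
    The C sketch below illustrates only the chaining idea with POSIX threads and semaphores, not the patented lock protocol: each thread atomically exchanges its identifier into the shared lock word and remembers the previous identifier locally, so wakeups can later follow the locally stored identifiers in LIFO order without touching the shared word again. The thread IDs, semaphores, and start-up ordering are assumptions made for the demo.

        #include <pthread.h>
        #include <semaphore.h>
        #include <stdatomic.h>
        #include <stdio.h>

        #define NUM_THREADS 4
        #define LOCK_FREE   (-1)

        static atomic_int lock_word = LOCK_FREE; /* data store: most recent thread ID */
        static sem_t wake[NUM_THREADS];          /* per-thread wakeup signal          */
        static sem_t registered;                 /* counts threads that have chained  */

        static void *worker(void *arg) {
            int my_id = (int)(long)arg;

            /* Push onto the chain: record the previous thread's ID locally. */
            int prev = atomic_exchange(&lock_word, my_id);
            sem_post(&registered);

            /* Wait to be woken in LIFO order. */
            sem_wait(&wake[my_id]);
            printf("thread %d entered the critical section\n", my_id);

            /* Release: wake the previously chained thread using the locally
             * stored ID, without touching the shared lock word. */
            if (prev != LOCK_FREE)
                sem_post(&wake[prev]);
            return NULL;
        }

        int main(void) {
            pthread_t tid[NUM_THREADS];
            sem_init(&registered, 0, 0);
            for (long i = 0; i < NUM_THREADS; i++) {
                sem_init(&wake[i], 0, 0);
                pthread_create(&tid[i], NULL, worker, (void *)i);
            }
            /* Wait until every thread has chained itself, then start the newest. */
            for (int i = 0; i < NUM_THREADS; i++) sem_wait(&registered);
            sem_post(&wake[atomic_load(&lock_word)]);
            for (int i = 0; i < NUM_THREADS; i++) pthread_join(tid[i], NULL);
            return 0;
        }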

    Global virtual address space across operating system domains

    Publication No.: US12141055B2

    Publication Date: 2024-11-12

    Application No.: US17900400

    Filing Date: 2022-08-31

    Abstract: Disclosed in some examples are methods, systems, devices, and machine-readable mediums that solve the above problems using a global shared region of memory that combines memory segments from multiple CXL devices. Each memory segment is the same size and naturally aligned in its own physical address space. The global shared region is contiguous and naturally aligned in the virtual address space. Because the global shared region is organized in this manner, a series of three tables may be used to quickly translate a virtual address in the global shared region to a physical address. This prevents TLB thrashing and improves performance of the computing system.
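
    A minimal C sketch of the fast translation idea, with an assumed segment size, region base, and table contents (the patent's three tables are not spelled out in this abstract): because all segments are the same size and the shared region is contiguous and naturally aligned, the segment index falls out of a shift, and small per-segment tables supply the owning device and physical base.

        #include <stdint.h>
        #include <stdio.h>

        #define SEG_SHIFT    30                 /* 1 GiB segments             */
        #define SEG_SIZE     (1ULL << SEG_SHIFT)
        #define NUM_SEGMENTS 4
        #define REGION_BASE  0x400000000000ULL  /* start of the shared region */

        /* Segment index -> owning CXL device (assumed contents). */
        static const int      seg_to_device[NUM_SEGMENTS] = { 0, 0, 1, 1 };
        /* Segment index -> physical base of that segment on its device. */
        static const uint64_t seg_phys_base[NUM_SEGMENTS] = {
            0x000000000ULL, 0x040000000ULL, 0x000000000ULL, 0x040000000ULL
        };

        /* Translate a virtual address in the global shared region to a device
         * plus physical address; the offset needs no lookup because every
         * segment is the same size and naturally aligned. */
        static int translate(uint64_t va, int *device, uint64_t *pa) {
            if (va < REGION_BASE) return -1;
            uint64_t seg = (va - REGION_BASE) >> SEG_SHIFT;
            if (seg >= NUM_SEGMENTS) return -1;
            *device = seg_to_device[seg];
            *pa     = seg_phys_base[seg] + ((va - REGION_BASE) & (SEG_SIZE - 1));
            return 0;
        }

        int main(void) {
            uint64_t va = REGION_BASE + 2 * SEG_SIZE + 0x1234;
            int dev; uint64_t pa;
            if (translate(va, &dev, &pa) == 0)
                printf("va 0x%llx -> device %d, pa 0x%llx\n",
                       (unsigned long long)va, dev, (unsigned long long)pa);
            return 0;
        }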
