Multi-kernel wavefront scheduler
    71.
    Granted invention patent

    Publication number: US12099867B2

    Publication date: 2024-09-24

    Application number: US15993061

    Filing date: 2018-05-30

    CPC classification number: G06F9/4881

    Abstract: Systems, apparatuses, and methods for implementing a multi-kernel wavefront scheduler are disclosed. A system includes at least a parallel processor coupled to one or more memories, wherein the parallel processor includes a command processor and a plurality of compute units. The command processor launches multiple kernels for execution on the compute units. Each compute unit includes a multi-level scheduler for scheduling wavefronts from multiple kernels for execution on its execution units. A first level scheduler creates scheduling groups by grouping together wavefronts based on the priority of their kernels. Accordingly, wavefronts from kernels with the same priority are grouped together in the same scheduling group by the first level scheduler. Next, the first level scheduler selects, from a plurality of scheduling groups, the highest priority scheduling group for execution. Then, a second level scheduler schedules wavefronts for execution from the scheduling group selected by the first level scheduler.
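
The two scheduling levels described in this abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the `Wavefront` fields, the convention that a larger number means higher priority, and the oldest-first policy at the second level are all assumptions, since the abstract does not specify them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Wavefront:
    wave_id: int
    kernel: str
    kernel_priority: int  # assumption: a larger number means higher priority


def build_scheduling_groups(wavefronts):
    """First level, step 1: group wavefronts by the priority of their kernels."""
    groups = defaultdict(list)
    for wf in wavefronts:
        groups[wf.kernel_priority].append(wf)
    return dict(groups)


def select_highest_priority_group(groups):
    """First level, step 2: pick the highest-priority scheduling group."""
    top = max(groups)
    return top, groups[top]


def second_level_schedule(group, slots):
    """Second level: choose wavefronts from the selected group.

    Oldest-first (lowest wave_id) is an assumed policy; the abstract does
    not say how the second level orders wavefronts within a group.
    """
    return sorted(group, key=lambda wf: wf.wave_id)[:slots]


def schedule(wavefronts, slots):
    """Run both levels and return the wavefronts chosen for execution."""
    if not wavefronts:
        return []
    groups = build_scheduling_groups(wavefronts)
    _, group = select_highest_priority_group(groups)
    return second_level_schedule(group, slots)


waves = [
    Wavefront(0, "kernel_a", 1),
    Wavefront(1, "kernel_b", 3),
    Wavefront(2, "kernel_c", 3),
    Wavefront(3, "kernel_a", 1),
]
picked = schedule(waves, slots=2)
print([wf.wave_id for wf in picked])
```

Here wavefronts from `kernel_b` and `kernel_c` share a priority, so they land in the same scheduling group even though they come from different kernels, which is the grouping behavior the abstract describes.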

Software-defined compute unit resource allocation mode

    Publication number: US20240311199A1

    Publication date: 2024-09-19

    Application number: US18120646

    Filing date: 2023-03-13

    CPC classification number: G06F9/505 G06F9/522

    Abstract: A program code executing on a processing system includes one or more instructions each identifying a workgroup that includes a plurality of waves and each identifying resource allocations for the plurality of waves of the workgroup. In response to receiving an instruction identifying a workgroup and resource allocations for the plurality of waves of the workgroup, a processor allocates a first set of processing resources to a compute unit of the processor based on the resource allocations for the plurality of waves. The compute unit then performs operations for the workgroup using the allocated set of processing resources.
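
As a rough illustration of the flow in this abstract, the sketch below models an instruction that names a workgroup and carries a resource request for each of its waves, and a compute unit that reserves the total before running the workgroup. The resource kinds (`vgprs`, `lds_bytes`), the class names, and the all-or-nothing admission rule are hypothetical; the abstract does not name specific resources or say what happens when a request cannot be met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaveAllocation:
    vgprs: int      # hypothetical: vector registers requested for one wave
    lds_bytes: int  # hypothetical: local data share bytes requested


@dataclass
class LaunchInstruction:
    workgroup_id: int
    wave_allocations: list  # one WaveAllocation per wave of the workgroup


@dataclass
class ComputeUnit:
    free_vgprs: int
    free_lds_bytes: int

    def allocate(self, instr):
        """Reserve resources for every wave named in the instruction.

        Returns True and updates the free counters if the whole workgroup
        fits; otherwise returns False and leaves the counters unchanged
        (an assumed policy).
        """
        need_vgprs = sum(w.vgprs for w in instr.wave_allocations)
        need_lds = sum(w.lds_bytes for w in instr.wave_allocations)
        if need_vgprs > self.free_vgprs or need_lds > self.free_lds_bytes:
            return False
        self.free_vgprs -= need_vgprs
        self.free_lds_bytes -= need_lds
        return True


cu = ComputeUnit(free_vgprs=512, free_lds_bytes=65536)
instr = LaunchInstruction(
    workgroup_id=0,
    wave_allocations=[WaveAllocation(128, 4096), WaveAllocation(64, 0)],
)
ok = cu.allocate(instr)
print(ok, cu.free_vgprs, cu.free_lds_bytes)
```

The two waves in the example request different amounts, which shows how a per-wave allocation carried by the instruction can drive what the compute unit reserves.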

    Glass core package substrates
    78.
    Granted invention patent

    Publication number: US12080632B2

    Publication date: 2024-09-03

    Application number: US17489182

    Filing date: 2021-09-29

    CPC classification number: H01L23/4951 H01L23/145 H10B12/50

    Abstract: Apparatuses, systems and methods for efficiently generating a package substrate. A semiconductor fabrication process (or process) fabricates each of a first glass package substrate and a second glass package substrate with a redistribution layer on a single side of a respective glass wafer. The process flips the second glass package substrate upside down and connects the glass wafers of the first and second glass package substrates together using a wafer bonding technique. In some implementations, the process uses copper-based wafer bonding. The resulting bonding between the two glass wafers contains no air gap, no underfill, and no solder bumps. Afterward, the side of the first glass package substrate opposite the glass wafer is connected to at least one integrated circuit. Additionally, the side of the second glass package substrate opposite the glass wafer is connected to a component on the motherboard through pads on the motherboard.

    Dual read port latch array bitcell
    79.
    Granted invention patent

    Publication number: US12073919B2

    Publication date: 2024-08-27

    Application number: US17359445

    Filing date: 2021-06-25

    CPC classification number: G11C8/16 G06F30/392 G11C11/418 G11C11/419

    Abstract: An apparatus and method for providing efficient floor planning, power, and performance tradeoffs of memory accesses. A dual read port and single write port memory bit cell uses two asymmetrical read access circuits for conveying stored data on two read bit lines. The two read bit lines are pre-charged to different voltage reference levels. The layout of the memory bit cell places the two read bit lines on an opposed edge from the single write bit line. The layout uses a dummy gate placed over both p-type diffusion and n-type diffusion between the edges. The layout has a same number of p-type transistors as n-type transistors despite using asymmetrical read access circuits. The layout also has a contacted gate pitch that is one more than the number of p-type transistors.

    Stacked command queue
    80.
    Granted invention patent

    Publication number: US12073114B2

    Publication date: 2024-08-27

    Application number: US17491058

    Filing date: 2021-09-30

    Abstract: A memory controller includes a command queue with multiple entry stacks, each with a plurality of entries holding memory access commands, one or more parameter indicators each holding a respective characteristic common to the plurality of entries, and a head indicator designating a current entry for arbitration. An arbiter has a single command input for each entry stack. A command queue loader circuit receives incoming memory access commands and loads entries of respective entry stacks with memory access commands having the respective characteristic of each of the one or more parameter indicators in common.
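
The queue organization in this abstract can be modeled as a small data structure. The sketch below is a software analogy under stated assumptions, not the claimed circuit: it assumes the shared characteristic held by the parameter indicator is the (bank, row) pair of a command, and the names `Command`, `EntryStack`, and `CommandQueue` are hypothetical. Each stack exposes only its head entry, so the arbiter sees one command input per stack.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Command:
    op: str   # "RD" or "WR"
    bank: int
    row: int
    col: int


@dataclass
class EntryStack:
    capacity: int
    params: tuple = None  # parameter indicator: characteristic shared by all entries
    entries: list = field(default_factory=list)
    head: int = 0         # head indicator: index of the entry offered for arbitration

    def head_command(self):
        """Return the single command this stack presents to the arbiter."""
        return self.entries[self.head] if self.head < len(self.entries) else None


class CommandQueue:
    def __init__(self, num_stacks, stack_depth):
        self.stacks = [EntryStack(stack_depth) for _ in range(num_stacks)]

    def load(self, cmd):
        """Loader: place cmd in a stack whose parameter indicator matches it."""
        key = (cmd.bank, cmd.row)  # assumed shared characteristic
        for stack in self.stacks:
            if stack.params == key and len(stack.entries) < stack.capacity:
                stack.entries.append(cmd)
                return True
        for stack in self.stacks:
            if not stack.entries:  # claim an empty stack for a new characteristic
                stack.params = key
                stack.head = 0
                stack.entries.append(cmd)
                return True
        return False  # no matching stack with room and no empty stack

    def arbiter_inputs(self):
        """One command input per entry stack, as seen by the arbiter."""
        return [stack.head_command() for stack in self.stacks]


queue = CommandQueue(num_stacks=2, stack_depth=2)
first = Command("RD", bank=0, row=5, col=0)
second = Command("RD", bank=1, row=7, col=0)
third = Command("RD", bank=0, row=5, col=8)
for cmd in (first, second, third):
    queue.load(cmd)
print(queue.arbiter_inputs())
```

Because `first` and `third` target the same bank and row, they share a stack, and the arbiter only ever compares two candidates rather than three.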
