DECOMPOSING MATRICES FOR PROCESSING AT A PROCESSOR-IN-MEMORY

    Publication Number: US20230102296A1

    Publication Date: 2023-03-30

    Application Number: US17490037

    Filing Date: 2021-09-30

    Abstract: A processing unit decomposes a matrix for partial processing at a processor-in-memory (PIM) device. The processing unit receives a matrix to be used as an operand in an arithmetic operation (e.g., a matrix multiplication operation). In response, the processing unit decomposes the matrix into two component matrices: a sparse component matrix and a dense component matrix. The processing unit itself performs the arithmetic operation with the dense component matrix, but sends the sparse component matrix to the PIM device for execution of the arithmetic operation. The processing unit thereby offloads at least some of the processing overhead to the PIM device, improving overall efficiency of the processing system.
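
    A minimal sketch of the decomposition idea in this abstract, assuming the operand is split by a simple magnitude threshold so that the larger entries form the sparse component and the remainder forms the dense component; the threshold, the SciPy representation, and the pim_multiply callback are illustrative stand-ins rather than the patented method:

        import numpy as np
        from scipy.sparse import csr_matrix

        def decompose(matrix, threshold=1.0):
            # Entries at or above the (assumed) threshold form the sparse
            # component; the rest stay in the dense component.
            # By construction, matrix == dense + sparse.toarray().
            large = np.abs(matrix) >= threshold
            sparse = csr_matrix(np.where(large, matrix, 0.0))
            dense = np.where(large, 0.0, matrix)
            return dense, sparse

        def multiply_with_pim_offload(a, b, pim_multiply):
            dense, sparse = decompose(a)
            local_part = dense @ b                     # handled by the processing unit
            offloaded_part = pim_multiply(sparse, b)   # sent to the PIM device
            return local_part + offloaded_part

        # Example with an in-host stand-in for the PIM device:
        rng = np.random.default_rng(0)
        a = rng.normal(size=(4, 4))
        b = rng.normal(size=(4, 3))
        result = multiply_with_pim_offload(a, b, pim_multiply=lambda s, m: s @ m)
        assert np.allclose(result, a @ b)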

    GLASS CORE PACKAGE SUBSTRATES
    Invention Application

    Publication Number: US20230102183A1

    Publication Date: 2023-03-30

    Application Number: US17489182

    Filing Date: 2021-09-29

    Abstract: Apparatuses, systems and methods for efficiently generating a package substrate. A semiconductor fabrication process (or process) fabricates each of a first glass package substrate and a second glass package substrate with a redistribution layer on a single side of a respective glass wafer. The process flips the second glass package substrate upside down and connects the glass wafers of the first and second glass package substrates together using a wafer bonding technique. In some implementations, the process uses copper-based wafer bonding. The resulting bonding between the two glass wafers contains no air gap, no underfill, and no solder bumps. Afterward, the side of the first glass package substrate opposite the glass wafer is connected to at least one integrated circuit. Additionally, the side of the second glass package substrate opposite the glass wafer is connected to a component on the motherboard through pads on the motherboard.

    DETERMINISTIC MIXED LATENCY CACHE
    Invention Application

    Publication Number: US20230101038A1

    Publication Date: 2023-03-30

    Application Number: US17489741

    Filing Date: 2021-09-29

    Abstract: A method and processing device for accessing data is provided. The processing device comprises a cache and a processor. The cache comprises a first data section having a first cache hit latency and a second data section having a second cache hit latency that is different from the first cache hit latency of the first data section. The processor is configured to request access to data in memory, the data corresponding to a memory address which includes an identifier that identifies the first data section of the cache. The processor is also configured to load the requested data, determined to be located in the first data section of the cache, according to the first cache hit latency of the first data section of the cache.
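
    A minimal sketch of the address-based section selection described in this abstract; the bit position used as the section identifier and the latency values are assumptions for illustration, not values from the application:

        SECTION_BIT = 12             # assumed position of the section-identifier bit
        HIT_LATENCY = {0: 2, 1: 6}   # cycles per data section (illustrative values)

        class MixedLatencyCache:
            def __init__(self):
                # One dictionary per data section: address -> cached data.
                self.sections = {0: {}, 1: {}}

            def section_of(self, address):
                # The memory address carries an identifier naming the section.
                return (address >> SECTION_BIT) & 1

            def load(self, address):
                # A hit completes in the deterministic latency of its section.
                section = self.section_of(address)
                data = self.sections[section].get(address)   # None models a miss
                return data, HIT_LATENCY[section]

        cache = MixedLatencyCache()
        cache.sections[1][0x1000] = b"payload"
        print(cache.load(0x1000))   # (b'payload', 6): address bit 12 selects section 1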

    SOCKET ACTUATION MECHANISM FOR PACKAGE INSERTION AND PACKAGE-SOCKET ALIGNMENT

    Publication Number: US20230100491A1

    Publication Date: 2023-03-30

    Application Number: US17487929

    Filing Date: 2021-09-28

    Abstract: A socket actuation mechanism for package insertion and package-socket alignment, including: a socket frame comprising a plurality of first hinge portions; a carrier frame comprising: a center portion comprising one or more package interlocks; and a tab extending from a first end of the carrier frame, the tab comprising a second hinge portion couplable with the plurality of first hinge portions to form a hinge coupling the carrier frame to the socket frame.

    USING REQUEST CLASS AND REUSE RECORDING IN ONE CACHE FOR INSERTION POLICIES OF ANOTHER CACHE

    Publication Number: US20230100230A1

    Publication Date: 2023-03-30

    Application Number: US17488206

    Filing Date: 2021-09-28

    Inventor: Paul J. Moyer

    Abstract: Systems and methods are disclosed for maintaining insertion policies of a lower-level cache. Techniques are described for selecting one of the insertion policies based on metadata of an evicted data block received from an upper-level cache, and then determining, based on the selected insertion policy, whether to insert the data block into the lower-level cache. If insertion is warranted, the data block is inserted into the lower-level cache according to the selected insertion policy. Techniques for dynamically updating the insertion policies of the lower-level cache are also disclosed.
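
    A minimal sketch of selecting a lower-level-cache insertion policy from the metadata of a block evicted by the upper-level cache; the metadata fields (request class and reuse bit), the policy table, and the insert() interface are assumptions for illustration:

        from dataclasses import dataclass

        @dataclass
        class EvictedBlock:
            address: int
            data: bytes
            request_class: str   # e.g. "demand" or "prefetch" (assumed classes)
            was_reused: bool     # reuse recorded while in the upper-level cache

        # (request class, reuse bit) -> insertion policy; None means bypass.
        INSERTION_POLICIES = {
            ("demand", True): "insert_mru",
            ("demand", False): "insert_lru",
            ("prefetch", True): "insert_mru",
            ("prefetch", False): None,
        }

        def handle_eviction(lower_cache, block):
            policy = INSERTION_POLICIES.get((block.request_class, block.was_reused))
            if policy is None:
                return   # bypass: do not insert into the lower-level cache
            lower_cache.insert(block.address, block.data, position=policy)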

    SEPARATE CLOCKING FOR COMPONENTS OF A GRAPHICS PROCESSING UNIT

    Publication Number: US20230096002A1

    Publication Date: 2023-03-30

    Application Number: US17890520

    Filing Date: 2022-08-18

    Abstract: Systems and methods related to controlling clock signals for clocking shader engine modules (SEs) and non-shader-engine modules (nSEs) of a graphics processing unit (GPU) are provided. One or more dividers receive a clock signal CLK, output a clock signal CLKA to the SEs, and output a clock signal CLKB to the nSEs. The frequencies of CLKA and CLKB are independently selected based on sets of performance counter data monitored at the SEs and nSEs, respectively. The clock signal frequency for either the SEs or the nSEs is reduced when the corresponding set of performance counter data indicates a comparatively lower processing workload for the SEs or for the nSEs.
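
    A minimal sketch of independently selecting CLKA (for the SEs) and CLKB (for the nSEs) from monitored performance counter data; the divider settings, the busy-ratio metric, and the thresholds are assumptions for illustration:

        BASE_CLK_MHZ = 2000
        DIVIDERS = (1, 2, 4)       # assumed available divider settings
        LOW_UTILIZATION = 0.25     # assumed threshold for a comparatively low workload

        def pick_divider(busy_ratio):
            # Pick a larger divider (lower clock) when the monitored workload is low.
            if busy_ratio < LOW_UTILIZATION:
                return DIVIDERS[2]
            if busy_ratio < 2 * LOW_UTILIZATION:
                return DIVIDERS[1]
            return DIVIDERS[0]

        def select_clocks(se_counters, nse_counters):
            # Each counter set is reduced to a busy ratio in [0, 1] (assumed metric).
            clk_a = BASE_CLK_MHZ // pick_divider(se_counters["busy_ratio"])
            clk_b = BASE_CLK_MHZ // pick_divider(nse_counters["busy_ratio"])
            return clk_a, clk_b

        # Example: busy SEs, mostly idle nSEs -> CLKB is reduced while CLKA stays high.
        print(select_clocks({"busy_ratio": 0.9}, {"busy_ratio": 0.1}))   # (2000, 500)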

    DYNAMIC ALLOCATION OF PLATFORM RESOURCES

    Publication Number: US20230094384A1

    Publication Date: 2023-03-30

    Application Number: US17487103

    Filing Date: 2021-09-28

    Abstract: A dynamic allocator for providing platform resource candidates is disclosed. In an implementation, a platform resource allocator receives a request for a platform resource recommendation from a workload initiator, such as an application. The platform resource allocator analyzes performance capabilities and utilization metrics for each of a plurality of platform resources. The plurality of platform resources includes one or more graphics processing units (GPUs) and one or more accelerated processing units (APUs). The platform resource allocator dynamically provides the platform resource recommendation to the workload initiator to select one or more of the plurality of platform resources to execute a workload based on the performance capabilities and utilization metrics.
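
    A minimal sketch of the recommendation step described in this abstract: among resources whose capabilities meet the request, prefer the least-utilized one. The resource fields and the scoring rule are assumptions for illustration:

        from dataclasses import dataclass

        @dataclass
        class PlatformResource:
            name: str             # e.g. "dGPU0", "APU0" (hypothetical names)
            tflops: float         # performance capability (assumed metric)
            utilization: float    # current utilization in [0, 1]

        def recommend(resources, required_tflops):
            # Keep only resources capable of satisfying the request ...
            candidates = [r for r in resources if r.tflops >= required_tflops]
            if not candidates:
                return None
            # ... then prefer the least-utilized candidate.
            return min(candidates, key=lambda r: r.utilization)

        resources = [
            PlatformResource("dGPU0", tflops=20.0, utilization=0.8),
            PlatformResource("APU0", tflops=8.0, utilization=0.2),
        ]
        print(recommend(resources, required_tflops=5.0).name)   # APU0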

    STANDARD CELL DESIGN ARCHITECTURE FOR REDUCED VOLTAGE DROOP UTILIZING REDUCED CONTACTED GATE POLY PITCH AND DUAL HEIGHT CELLS

    Publication Number: US20230092184A1

    Publication Date: 2023-03-23

    Application Number: US17483672

    Filing Date: 2021-09-23

    Abstract: A system and method for creating a chip layout are described. In various implementations, a standard cell uses unidirectional tracks for power connections and signal routing. A single track of the metal one layer that uses a minimum width of the metal one layer is placed within a pitch of a single metal gate. The single track of the metal one layer provides a power supply reference voltage level or a ground reference voltage level. This placement of the single track provides a metal one power post contacted gate pitch (CPP) of 1 CPP. To further reduce voltage droop, a standard cell uses dual height and half the width of a single-height cell, along with power posts placed at 1 CPP. The placement of the multiple power rails of the dual-height cell allows alignment of the power rails with the power rails of other standard cells.
