LATENCY HIDING FOR CACHES
    Invention Application

    Publication Number: US20210141740A1

    Publication Date: 2021-05-13

    Application Number: US16683142

    Application Date: 2019-11-13

    Abstract: A technique for accessing a memory having a high latency portion and a low latency portion is provided. The technique includes detecting a promotion trigger to promote data from the high latency portion to the low latency portion; in response to the promotion trigger, copying cache lines associated with the promotion trigger from the high latency portion to the low latency portion; and, in response to a read request, providing data from either or both of the high latency portion and the low latency portion, based on a state associated with the data in the high latency portion and the low latency portion.
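
    A minimal Python sketch of the promotion flow described above, assuming a simple dictionary-backed model; the class and method names (HybridMemory, promote, read) and the per-line state values are illustrative assumptions, not taken from the patent:

        class HybridMemory:
            def __init__(self):
                self.high_latency = {}   # backing store: line address -> data
                self.low_latency = {}    # small fast store: line address -> data
                self.state = {}          # per-line state: "copying" or "ready"

            def promote(self, line_addrs):
                """Promotion trigger: copy the associated cache lines from the
                high-latency portion to the low-latency portion."""
                for addr in line_addrs:
                    self.state[addr] = "copying"
                    self.low_latency[addr] = self.high_latency[addr]
                    self.state[addr] = "ready"

            def read(self, addr):
                """Serve a read from the low-latency copy when its state says it
                is ready, otherwise fall back to the high-latency portion."""
                if self.state.get(addr) == "ready":
                    return self.low_latency[addr]
                return self.high_latency[addr]

        mem = HybridMemory()
        mem.high_latency[0x40] = b"line-data"
        mem.promote([0x40])          # promotion trigger for one cache line
        print(mem.read(0x40))        # now served from the low-latency portion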

    DYNAMIC BANKING AND BIT SEPARATION IN MEMORIES

    Publication Number: US20210141733A1

    Publication Date: 2021-05-13

    Application Number: US16680491

    Application Date: 2019-11-11

    Abstract: Memories that are configurable to operate in either a banked mode or a bit-separated mode. The memories include a plurality of memory banks, multiplexing circuitry, input circuitry, and output circuitry. The input circuitry inputs at least a portion of a memory address and configuration information to the multiplexing circuitry. If the configuration information indicates a bit-separated mode, the multiplexing circuitry generates read data by combining a selected subset of data corresponding to the address from each of the plurality of memory banks, the subset selected based on the configuration information. If the configuration information indicates a banked mode, the multiplexing circuitry generates the read data from the data corresponding to the address in one of the memory banks, that bank selected based on the configuration information. The output circuitry outputs the generated read data from the memory.
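
    A rough Python model of the two read modes, assuming four banks and an 8-bit slice per bank; the bank count, slice width, and names are illustrative assumptions only:

        NUM_BANKS = 4
        BITS_PER_BANK = 8       # slice each bank contributes in bit-separated mode

        banks = [dict() for _ in range(NUM_BANKS)]   # each bank: address -> word

        def read(address, config):
            if config["mode"] == "bit_separated":
                # Combine a selected slice from every bank into one read word.
                word = 0
                for i, bank in enumerate(banks):
                    slice_bits = bank[address] & ((1 << BITS_PER_BANK) - 1)
                    word |= slice_bits << (i * BITS_PER_BANK)
                return word
            else:
                # Banked mode: the configuration selects a single bank.
                return banks[config["bank_select"]][address]

        for i in range(NUM_BANKS):
            banks[i][0x10] = 0xA0 + i
        print(hex(read(0x10, {"mode": "bit_separated"})))          # 0xa3a2a1a0
        print(hex(read(0x10, {"mode": "banked", "bank_select": 2})))  # 0xa2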

    Semiconductor chip with stacked conductor lines and air gaps

    Publication Number: US11004791B2

    Publication Date: 2021-05-11

    Application Number: US16382774

    Application Date: 2019-04-12

    Inventor: Richard Schultz

    Abstract: Various semiconductor chip metallization layers and methods of manufacturing the same are disclosed. In one aspect, a semiconductor chip is provided that includes a substrate, plural metallization layers on the substrate, a first conductor line in one of the metallization layers, and a second conductor line in the one of the metallization layers in spaced apart relation to the first conductor line. Each of the first conductor line and the second conductor line has a first line portion and a second line portion stacked on the first line portion. A dielectric layer has a portion positioned between the first conductor line and the second conductor line, and that portion has an air gap.

    SHADOW LATCHES IN A SHADOW-LATCH CONFIGURED REGISTER FILE FOR THREAD STORAGE

    Publication Number: US20210132985A1

    Publication Date: 2021-05-06

    Application Number: US16668469

    Application Date: 2019-10-30

    Abstract: A processing system includes a processor core and a scheduler coupled to the processor core. The processing system executes a first active thread and a second active thread in the processor core and detects a swap event for the first active thread or the second active thread. Based on the swap event, the processing system, using a shadow-latch configured fixed mapping system, replaces either the first active thread or the second active thread with a shadow-based thread, the shadow-based thread being stored in a shadow-latch configured register file.
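
    A toy Python model of the swap event, assuming two active thread slots and a single shadow copy with a fixed mapping; the names (ShadowRegisterFile, swap_in) and register count are illustrative assumptions:

        NUM_REGS = 8

        class ShadowRegisterFile:
            def __init__(self):
                self.active = {0: [0] * NUM_REGS, 1: [0] * NUM_REGS}  # two active threads
                self.shadow = [0] * NUM_REGS                          # shadow-latch storage

            def swap_in(self, slot):
                """On a swap event, exchange the active thread in `slot` with the
                thread held in the shadow latches (fixed one-to-one mapping)."""
                self.active[slot], self.shadow = self.shadow, self.active[slot]

        rf = ShadowRegisterFile()
        rf.active[0][3] = 42       # thread in slot 0 writes a register
        rf.swap_in(0)              # swap event: shadow-based thread becomes active
        print(rf.active[0][3], rf.shadow[3])   # prints: 0 42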

    Low latency FIFO with auto sync
    Invention Grant

    Publication Number: US10990120B2

    Publication Date: 2021-04-27

    Application Number: US16452869

    Application Date: 2019-06-26

    Abstract: A method operates a first-in-first-out (FIFO) buffer with a first clock, and operates one of a read pointer or a write pointer of the FIFO buffer with the first clock while operating the other one of the read pointer or write pointer with a second clock. One of a serializer fed from the FIFO buffer output, or a de-serializer feeding the FIFO buffer input, is operated with the second clock. Timing pulses indicate that the pointer operating with the second clock has reached a predetermined point in its cycle. The phase of the second clock is adjusted based on a relationship between the timing pulses and an advance period of the pointer operating with the first clock. The pointer operating with the first clock is reset to achieve a desired value for the relationship. A skew created from adjusting the phase of the second clock is corrected.
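
    A heavily simplified Python sketch of the pointer/timing-pulse relationship, assuming equal clock rates, a wrap-to-zero pulse point, and a fixed desired pointer separation; the depth, offset, and function names are illustrative assumptions:

        DEPTH = 8
        DESIRED_OFFSET = 4            # desired read/write pointer separation

        write_ptr = 0                 # pointer operated with the first clock
        read_ptr = 0                  # pointer operated with the second clock

        def on_second_clock():
            """Advance the second-clock pointer and emit a timing pulse when it
            reaches a predetermined point in its cycle (here, wrapping to 0)."""
            global read_ptr
            read_ptr = (read_ptr + 1) % DEPTH
            return read_ptr == 0

        def on_timing_pulse():
            """Compare the pulse against the first-clock pointer's position and
            reset that pointer to achieve the desired relationship; a real design
            would also nudge the second clock's phase by the returned error."""
            global write_ptr
            error = (write_ptr - DESIRED_OFFSET) % DEPTH
            write_ptr = DESIRED_OFFSET
            return error

        for cycle in range(24):
            write_ptr = (write_ptr + 1) % DEPTH   # first clock edge
            if on_second_clock():                 # second clock edge
                print("pulse at cycle", cycle, "-> phase error", on_timing_pulse())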

    COMPOSABLE NEURAL NETWORK KERNELS
    Invention Application

    Publication Number: US20210117806A1

    Publication Date: 2021-04-22

    Application Number: US17138709

    Application Date: 2020-12-30

    Abstract: A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor; responsive to the first request, performing the first operation on the generic tensor descriptor; receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor; and, responsive to the second request, performing the second operation on the generic tensor raw data. Performing the second operation includes mapping a tensor coordinate specified by the second request to a memory address, the mapping including evaluating a delta function to determine an address delta value to add to a previously determined address for a previously processed tensor coordinate.
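
    A minimal Python sketch of delta-based coordinate-to-address mapping, assuming a simple strided descriptor; the descriptor layout and names (TensorDescriptor, address_of, delta) are illustrative assumptions, not the patent's actual descriptor format:

        class TensorDescriptor:
            def __init__(self, strides):
                self.strides = strides           # stride (in elements) per dimension

            def address_of(self, coord):
                return sum(c * s for c, s in zip(coord, self.strides))

            def delta(self, prev_coord, coord):
                """Delta function: address increment between consecutive coordinates,
                added to the previously computed address instead of recomputing it."""
                return sum((c - p) * s
                           for p, c, s in zip(prev_coord, coord, self.strides))

        desc = TensorDescriptor(strides=[12, 4, 1])    # e.g. a 3-D row-major layout
        prev = (0, 0, 0)
        addr = desc.address_of(prev)
        for coord in [(0, 0, 1), (0, 1, 0), (1, 0, 0)]:
            addr += desc.delta(prev, coord)            # cheap incremental update
            prev = coord
            assert addr == desc.address_of(coord)
        print("final address:", addr)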

    REGISTER RENAMING AFTER A NON-PICKABLE SCHEDULER QUEUE

    Publication Number: US20210117196A1

    Publication Date: 2021-04-22

    Application Number: US16660495

    Application Date: 2019-10-22

    Abstract: A floating point unit includes a non-pickable scheduler queue (NSQ) that offers a load operation concurrently with a load store unit retrieving load data for an operand that is to be loaded by the load operation. The floating point unit also includes a renamer that renames architectural registers used by the load operation and allocates physical register numbers to the load operation in response to receiving the load operation from the NSQ. The floating point unit further includes a set of pickable scheduler queues that receive the load operation from the renamer and store the load operation prior to execution. A physical register file is implemented in the floating point unit and a free list is used to store physical register numbers of entries in the physical register file that are available for allocation.
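
    A small Python sketch of destination renaming against a free list, as described above; the register counts and names are illustrative assumptions, and freeing the old mapping immediately is a simplification (a real core frees it at retirement):

        from collections import deque

        NUM_PHYS_REGS = 16

        free_list = deque(range(NUM_PHYS_REGS))   # available physical register numbers
        rename_map = {}                           # architectural reg -> physical reg

        def rename_dest(arch_reg):
            """Allocate a physical register number for the destination of a load
            operation arriving from the non-pickable scheduler queue (NSQ)."""
            phys = free_list.popleft()
            old = rename_map.get(arch_reg)
            rename_map[arch_reg] = phys
            if old is not None:
                free_list.append(old)             # simplified: freed immediately
            return phys

        print("f3 ->", rename_dest("f3"))   # load writes architectural register f3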

    Method and system for opportunistic load balancing in neural networks using metadata

    Publication Number: US10970120B2

    Publication Date: 2021-04-06

    Application Number: US16019374

    Application Date: 2018-06-26

    Abstract: Methods and systems for opportunistic load balancing in deep neural networks (DNNs) using metadata. Representative computational costs are captured, obtained or determined for a given architectural, functional or computational aspect of a DNN system. The representative computational costs are implemented as metadata for the given architectural, functional or computational aspect of the DNN system. In an implementation, the computed computational cost is implemented as the metadata. A scheduler detects whether there are neurons in subsequent layers that are ready to execute. The scheduler uses the metadata and neuron availability to schedule and load balance across compute resources and available resources.
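
    An illustrative Python sketch of cost-metadata-driven scheduling over ready neurons; the data shapes and the greedy least-loaded policy are assumptions for illustration, not taken from the patent:

        ready_neurons = [
            {"id": "n1", "cost": 5},   # "cost" is the representative computational
            {"id": "n2", "cost": 2},   # cost carried as metadata for each neuron
            {"id": "n3", "cost": 9},
        ]
        compute_units = {"cu0": 0, "cu1": 0}   # accumulated work per compute resource

        def schedule(neurons, units):
            """Assign each ready neuron to the currently least-loaded compute unit,
            using the cost metadata to balance work opportunistically."""
            for n in sorted(neurons, key=lambda n: n["cost"], reverse=True):
                target = min(units, key=units.get)
                units[target] += n["cost"]
                print(f'{n["id"]} (cost {n["cost"]}) -> {target}')

        schedule(ready_neurons, compute_units)
        print(compute_units)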

    FABRICATING ACTIVE-BRIDGE-COUPLED GPU CHIPLETS

    Publication Number: US20210098419A1

    Publication Date: 2021-04-01

    Application Number: US16585480

    Application Date: 2019-09-27

    Abstract: Various multi-die arrangements and methods of manufacturing the same are disclosed. In some embodiments, a method of manufacture includes a face-to-face process in which a first GPU chiplet and a second GPU chiplet are bonded to a temporary carrier wafer. A face surface of an active bridge chiplet is bonded to a face surface of the first and second GPU chiplets before mounting the GPU chiplets to a carrier substrate. In other embodiments, a method of manufacture includes a face-to-back process in which a face surface of an active bridge chiplet is bonded to a back surface of the first and second GPU chiplets.
