Flexible dictionary sharing for compressed caches

    Publication No.: US11586555B2

    Publication date: 2023-02-21

    Application No.: US17231957

    Filing date: 2021-04-15

    Abstract: Systems, apparatuses, and methods for implementing flexible dictionary sharing techniques for caches are disclosed. A set-associative cache includes a dictionary for each data array set. When a cache line is to be allocated in the cache, a cache controller determines to which set a base index of the cache line address maps. Then, a selector unit determines which dictionary of a group of dictionaries stored by those sets neighboring this set would achieve the most compression for the cache line. This dictionary is then selected to compress the cache line. An offset is added to the base index of the cache line to generate a full index in order to map the cache line to the set corresponding to this chosen dictionary. The compressed cache line is stored in this set with the chosen dictionary, and the offset is stored in the corresponding tag array entry.
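
The selection step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the cost model in `compressed_size`, the `group_size` parameter, and the convention that the base index points at the first set of a group of neighboring sets are all hypothetical.

```python
def compressed_size(line_words, dictionary):
    """Hypothetical cost model: a word found in the dictionary costs 1 byte
    (a pointer); any other word is stored as a 4-byte literal."""
    return sum(1 if word in dictionary else 4 for word in line_words)


def choose_set(base_index, line_words, dictionaries, group_size=4):
    """Pick the neighboring set whose dictionary compresses the line best.

    Assumption: base_index is the index of the first set in a group of
    `group_size` neighboring sets, and dictionaries[i] belongs to set i.
    Returns (full_index, offset); the offset is what would be kept in the
    tag array entry so the line can be found again.
    """
    best_offset = min(
        range(group_size),
        key=lambda off: compressed_size(line_words, dictionaries[base_index + off]),
    )
    full_index = base_index + best_offset  # offset added to the base index
    return full_index, best_offset


# Example: four neighboring sets, each with its own small dictionary.
dictionaries = [{1, 2}, {3, 4}, {5, 6, 7}, {9}]
full_index, offset = choose_set(0, [5, 6, 7, 8], dictionaries)
```

In this example the third dictionary covers three of the four words, so the line would be placed in set 2 with a stored offset of 2.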

    NOISE MITIGATION IN SINGLE ENDED LINKS

    Publication No.: US20230046477A1

    Publication date: 2023-02-16

    Application No.: US17545108

    Filing date: 2021-12-08

    Abstract: A data transmission system includes a first circuit, a second circuit, and a reference voltage generation circuit. The first circuit includes a transmitter powered by a first power supply voltage and having an input for receiving a data output signal, and an output. The second circuit includes a receiver powered by a second power supply voltage and having a first input coupled to the output of the transmitter, a second input for receiving a reference voltage, and an output for providing a data input signal. The reference voltage generation circuit forms the reference voltage by mixing a first signal generated by the first circuit based on the first power supply voltage and a second signal generated by the second circuit based on the second power supply voltage.
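
As a rough numerical illustration of the mixing idea, the sketch below models each circuit's signal as half of its own supply voltage and mixes the two with a weighted average. The abstract does not specify the divider ratio or the mixing weights, so `weight_tx` and the factor of one half are assumptions chosen only to make the arithmetic concrete.

```python
def mixed_reference(vdd_tx, vdd_rx, weight_tx=0.5):
    """Form a reference voltage from two supply-derived signals.

    Assumptions (not stated in the abstract): each side contributes a
    signal at half of its supply voltage, and the mix is a weighted
    average with weight `weight_tx` on the transmitter-side signal.
    """
    signal_tx = vdd_tx / 2.0  # signal based on the first supply voltage
    signal_rx = vdd_rx / 2.0  # signal based on the second supply voltage
    return weight_tx * signal_tx + (1.0 - weight_tx) * signal_rx


# A 100 mV disturbance on the transmitter supply moves the reference
# by 25 mV under these assumed ratios.
nominal = mixed_reference(1.0, 1.2)
disturbed = mixed_reference(1.1, 1.2)
shift = disturbed - nominal
```

Under these assumptions the reference tracks part of the noise on each supply, which is the property the mixing is meant to provide.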

    Fused convolution and batch normalization for neural networks

    Publication No.: US11573765B2

    Publication date: 2023-02-07

    Application No.: US16219154

    Filing date: 2018-12-13

    Abstract: A processing unit implements a convolutional neural network (CNN) by fusing at least a portion of a convolution phase of the CNN with at least a portion of a batch normalization phase. The processing unit convolves two input matrices representing inputs and weights of a portion of the CNN to generate an output matrix. The processing unit performs the convolution via a series of multiplication operations, with each multiplication operation generating a corresponding submatrix (or “tile”) of the output matrix at an output register of the processing unit. While an output submatrix is stored at the output register, the processing unit performs a reduction phase and an update phase of the batch normalization phase for the CNN. The processing unit thus fuses at least a portion of the batch normalization phase of the CNN with a portion of the convolution.
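
The fusion described above can be sketched in plain Python. This is a simplified model, not the patented design: it assumes the convolution has already been lowered to a matrix product, that each tile spans every row of the output (so per-channel statistics are complete within one tile), and that batch normalization has no learned scale or shift. The names `tile_cols` and `eps` are illustrative.

```python
import math


def fused_conv_batchnorm(inputs, weights, tile_cols=2, eps=1e-5):
    """Multiply inputs by weights tile by tile, normalizing each tile
    while it is still held in a local buffer (standing in for the
    output register).

    Assumption: columns of the output are channels, rows are samples.
    """
    rows, inner, cols = len(inputs), len(weights), len(weights[0])
    out = [[0.0] * cols for _ in range(rows)]
    for c0 in range(0, cols, tile_cols):
        c1 = min(c0 + tile_cols, cols)
        # Multiplication: produce one output tile (all rows, a few columns).
        tile = [
            [sum(inputs[r][k] * weights[k][c] for k in range(inner))
             for c in range(c0, c1)]
            for r in range(rows)
        ]
        for j in range(c1 - c0):
            column = [tile[r][j] for r in range(rows)]
            # Reduction phase: per-channel mean and variance.
            mean = sum(column) / rows
            var = sum((v - mean) ** 2 for v in column) / rows
            # Update phase: normalize the tile values before writing them out.
            inv_std = 1.0 / math.sqrt(var + eps)
            for r in range(rows):
                out[r][c0 + j] = (tile[r][j] - mean) * inv_std
    return out


normalized = fused_conv_batchnorm([[1, 0], [0, 1]], [[1, 2, 3], [3, 2, 5]])
```

The point of the sketch is the loop structure: the unnormalized product is never written to `out`, so the intermediate matrix does not need a separate pass over memory.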

    DISPATCH BANDWIDTH OF MEMORY-CENTRIC REQUESTS BY BYPASSING STORAGE ARRAY ADDRESS CHECKING

    Publication No.: US20230030679A1

    Publication date: 2023-02-02

    Application No.: US17386115

    Filing date: 2021-07-27

    Abstract: Dispatch throughput for memory-centric commands is improved by bypassing address checking for certain memory-centric commands. Implementations include using an Address Check Bypass (ACB) bit to specify whether address checking should be performed for a memory-centric command. ACB bit values are specified in memory-centric instructions, automatically specified by a process, such as a compiler, or by host hardware, such as dispatch hardware, based upon whether a memory-centric command explicitly references memory. Implementations include bypassing, i.e., not performing, address checking for memory-centric commands that do not access memory and also for memory-centric commands that do access memory, but that have the same physical address as a prior memory-centric command that explicitly accessed memory to ensure that any data in caches was flushed to memory and/or invalidated.
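
The ACB decision can be illustrated with a small Python model. Everything here is hypothetical scaffolding: the `Command` class, `assign_acb`, and `dispatch` are invented names, and the sketch ignores when a previously checked address would need to be checked again (for example, after the host caches that data once more).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Command:
    name: str
    address: Optional[int] = None  # None: the command does not access memory
    acb: bool = False              # Address Check Bypass bit


def assign_acb(commands: List[Command]) -> List[Command]:
    """Set the ACB bit the way a compiler or dispatch hardware might.

    The check is bypassed when a command does not access memory, or when
    it targets a physical address that an earlier command already had
    checked. Simplification: checked addresses are never invalidated.
    """
    checked = set()
    for cmd in commands:
        if cmd.address is None or cmd.address in checked:
            cmd.acb = True
        else:
            cmd.acb = False
            checked.add(cmd.address)
    return commands


def dispatch(commands: List[Command], check_address: Callable[[int], None]) -> int:
    """Dispatch commands, running the address check only when ACB is clear.
    Returns the number of address checks performed."""
    checks = 0
    for cmd in commands:
        if not cmd.acb:
            check_address(cmd.address)
            checks += 1
    return checks


program = assign_acb([
    Command("load", 0x100),
    Command("add"),
    Command("store", 0x100),
    Command("load", 0x200),
])
checks_performed = dispatch(program, check_address=lambda addr: None)
```

In this example only the first access to each address is checked, so four commands incur two address checks.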
