DATA CACHE REGION PREFETCHER
    Patent application

    Publication No.: US20220283955A1

    Publication date: 2022-09-08

    Application No.: US17752244

    Filing date: 2022-05-24

    Abstract: A method, system, and processing system for pre-fetching data are disclosed. The method, system, and processing system include data cache region prefetch circuitry for detecting a first access by a first instruction at a first instruction address to a first memory portion, detecting a first non-sequential access pattern to a set of addresses in the first memory portion, and, in response to a miss by a second instruction at the first instruction address and to the first non-sequential access pattern occurring, pre-fetching data according to the first non-sequential access pattern.
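The behaviour described in this abstract can be illustrated with a small software model. The sketch below is not the patented circuitry: the region size, line size, table layout, and the names `RegionPrefetcher`, `access`, and `is_non_sequential` are all assumptions made for illustration. It treats the first instruction to touch a region as that region's trigger, records the line offsets of later accesses to the region, and replays them when the same instruction address misses in a new region.

```python
REGION_SIZE = 1024  # bytes per tracked memory region (assumed value)
LINE_SIZE = 64      # bytes per cache line (assumed value)


def is_non_sequential(deltas):
    """True when the learned line offsets are not simply +1, +2, +3, ..."""
    return bool(deltas) and deltas != list(range(1, len(deltas) + 1))


class RegionPrefetcher:
    """Toy model (hypothetical): learn per-instruction offset patterns."""

    def __init__(self):
        self.active = {}    # region number -> (trigger PC, trigger line number)
        self.patterns = {}  # trigger PC -> line offsets seen after the trigger

    def access(self, pc, addr, hit):
        """Observe one data access; return the addresses to prefetch."""
        line, region = addr // LINE_SIZE, addr // REGION_SIZE
        if region in self.active:
            # Training: record this access relative to the region's trigger.
            trigger_pc, trigger_line = self.active[region]
            deltas = self.patterns.setdefault(trigger_pc, [])
            delta = line - trigger_line
            if delta != 0 and delta not in deltas:
                deltas.append(delta)
            return []
        # First access to this region: this PC becomes the region's trigger.
        self.active[region] = (pc, line)
        learned = self.patterns.get(pc, [])
        if hit or not is_non_sequential(learned):
            return []
        # Replay the learned pattern, staying inside the new region.
        targets = [(line + d) * LINE_SIZE for d in learned]
        return [t for t in targets if t // REGION_SIZE == region]
```

In this model, an instruction at `0x400` that first touches `0x1000`, followed by accesses to `0x10C0` and `0x1040` in the same region, leaves the offsets `[3, 1]` behind. A later miss by `0x400` at `0x2000` then returns `0x20C0` and `0x2040` as prefetch candidates.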

    Offset-aligned three-dimensional integrated circuit

    Publication No.: US11437359B2

    Publication date: 2022-09-06

    Application No.: US16799243

    Filing date: 2020-02-24

    Abstract: A method for manufacturing a three-dimensional integrated circuit includes attaching a first side of a first die to a first carrier wafer. The method includes preparing a second side of the first die to generate a prepared second side of the first die. The method includes attaching the prepared second side of the first die to a second carrier wafer. The method includes removing the first carrier wafer from the first side of the first die to form a transitional three-dimensional integrated circuit. The method includes attaching a third carrier wafer to a first side of the transitional three-dimensional integrated circuit. The method includes attaching a first side of the second die to a second side of the transitional three-dimensional integrated circuit.

    Proactive management of inter-GPU network links

    Publication No.: US11436060B2

    Publication date: 2022-09-06

    Application No.: US16552065

    Filing date: 2019-08-27

    Abstract: Systems, apparatuses, and methods for proactively managing inter-processor network links are disclosed. A computing system includes at least a control unit and a plurality of processing units. Each processing unit of the plurality of processing units includes a compute module and a configurable link interface. The control unit dynamically adjusts a clock frequency and a link width of the configurable link interface of each processing unit based on a data transfer size and layer computation time of a plurality of layers of a neural network so as to reduce execution time of each layer. By adjusting the clock frequency and the link width of the link interface on a per-layer basis, the overlapping of communication and computation phases is closely matched, allowing layers to complete more quickly.
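One way to picture the per-layer adjustment is as a search over link configurations. The sketch below is an assumption-laden illustration, not the control unit from the patent: the function name `pick_link_config`, the bandwidth model (one bit per lane per clock cycle), and the selection rule (the lowest-bandwidth setting whose transfer still fits within the layer's computation time, otherwise the fastest setting) are all hypothetical.

```python
from itertools import product


def pick_link_config(transfer_bytes, compute_time_s, freqs_hz, widths_lanes):
    """Return a (frequency, width) pair for one neural-network layer.

    Assumed policy: choose the slowest configuration that still hides the
    transfer behind the layer's computation; if none can, use the fastest.
    """
    # Order candidate settings from lowest to highest raw bandwidth.
    configs = sorted(product(freqs_hz, widths_lanes), key=lambda c: c[0] * c[1])
    for freq, width in configs:
        bytes_per_second = freq * width / 8  # assumes 1 bit per lane per cycle
        if transfer_bytes / bytes_per_second <= compute_time_s:
            return freq, width
    return configs[-1]
```

Calling this once per layer with that layer's transfer size and computation time gives a per-layer schedule of link settings, which is the granularity the abstract describes.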

    Techniques for improving operand caching

    Publication No.: US11436016B2

    Publication date: 2022-09-06

    Application No.: US16703833

    Filing date: 2019-12-04

    Abstract: A technique for determining whether a register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache is provided. The technique includes executing an instruction that accesses an operand that comprises the register value, performing one or both of a lookahead technique and a prediction technique to determine whether the register value should be written to an operand cache or whether the register value should remain in and not be evicted from the operand cache, and based on the determining, updating the operand cache.
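The lookahead half of this technique can be sketched in a few lines. This is a simplified illustration under stated assumptions: instructions are modelled as dictionaries with hypothetical `dest` and `srcs` fields, the window length is arbitrary, and only the "should the value be written to the operand cache" decision is covered, not the prediction technique or the eviction decision.

```python
def should_cache(instrs, idx, window=4):
    """Decide whether the result of instrs[idx] is worth caching.

    Assumed rule: cache the value if one of the next `window` instructions
    reads the destination register before another instruction overwrites it.
    """
    dest = instrs[idx]["dest"]
    for later in instrs[idx + 1: idx + 1 + window]:
        if dest in later["srcs"]:
            return True   # the value is reused soon, so caching pays off
        if later["dest"] == dest:
            return False  # the value is overwritten before any reuse
    return False          # no reuse seen inside the lookahead window
```

For a sequence in which `r1` is written and then read by the very next instruction, the function returns `True` for the writer; if `r1` is overwritten first, it returns `False`.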

    Scheduling memory requests for a ganged memory device

    Publication No.: US11422707B2

    Publication date: 2022-08-23

    Application No.: US15851479

    Filing date: 2017-12-21

    Abstract: Systems, apparatuses, and methods for performing efficient memory accesses for a computing system are disclosed. A computing system includes one or more clients for processing applications. A memory controller transfers traffic between the memory controller and two channels, each connected to a memory device. A client sends a 64-byte memory request with an indication specifying that there are two 32-byte requests targeting non-contiguous data within a same page. The memory controller generates two addresses, and sends a single command and the two addresses to two channels to simultaneously access non-contiguous data in a same page.
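The request-splitting step can be sketched as a small function. Everything here beyond the 32-byte and 64-byte sizes in the abstract is assumed for illustration: the page size, the name `split_ganged_request`, the mapping of the first address to channel 0 and the second to channel 1, and the returned tuple layout.

```python
PAGE_SIZE = 2048   # bytes per memory page (assumed value)
HALF_REQUEST = 32  # bytes served by each of the two channels


def split_ganged_request(addr_a, addr_b, command="READ"):
    """Turn one 64-byte request into one command plus two channel addresses."""
    for addr in (addr_a, addr_b):
        if addr % HALF_REQUEST:
            raise ValueError("each address must be 32-byte aligned")
    if addr_a // PAGE_SIZE != addr_b // PAGE_SIZE:
        raise ValueError("both 32-byte requests must fall in the same page")
    # A single command is shared; each channel receives its own address.
    return command, {0: addr_a, 1: addr_b}
```

The same-page check reflects the condition in the abstract that the two non-contiguous 32-byte pieces lie within one page, which is what allows a single command to serve both channels.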

    Selectively performing ahead branch prediction based on types of branch instructions

    Publication No.: US11416256B2

    Publication date: 2022-08-16

    Application No.: US16945275

    Filing date: 2020-07-31

    Abstract: A set of entries in a branch prediction structure for a set of second blocks are accessed based on a first address of a first block. The set of second blocks correspond to outcomes of one or more first branch instructions in the first block. Speculative prediction of outcomes of second branch instructions in the second blocks is initiated based on the entries in the branch prediction structure. State associated with the speculative prediction is selectively flushed based on types of the branch instructions. In some cases, the branch predictor can be accessed using an address of a previous block or a current block. State associated with the speculative prediction is selectively flushed from the ahead branch prediction, and prediction of outcomes of branch instructions in one of the second blocks is selectively initiated using non-ahead accessing, based on the types of the one or more branch instructions.
