TECHNIQUES FOR DECOUPLED ACCESS-EXECUTE NEAR-MEMORY PROCESSING

    Publication No.: US20200026513A1

    Publication date: 2020-01-23

    Application No.: US16585521

    Filing date: 2019-09-27

    Abstract: Techniques for decoupled access-execute near-memory processing include examples of first or second circuitry of a near-memory processor receiving instructions that cause the first circuitry to implement system memory access operations to access one or more data chunks and the second circuitry to implement compute operations using the one or more data chunks.
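The decoupling described in this abstract can be pictured with a small software sketch. It is an illustration only, not the patented circuitry: the names `access_stage`, `run_decoupled`, `chunk_size`, and `queue_depth` are hypothetical, the bounded queue between the two sides is an assumption, and the compute operation is an arbitrary stand-in (a sum).

```python
from collections import deque


def access_stage(memory, chunk_size):
    """Access side: yield consecutive data chunks read from `memory`."""
    for start in range(0, len(memory), chunk_size):
        yield memory[start:start + chunk_size]


def run_decoupled(memory, chunk_size=4, queue_depth=2):
    """Interleave an access side and an execute side through a bounded queue.

    The access side runs ahead and fills the queue (assumed depth
    `queue_depth`); the execute side consumes one chunk per step.
    """
    queue = deque()
    source = access_stage(memory, chunk_size)
    exhausted = False
    total = 0
    while not exhausted or queue:
        # Access side: fetch chunks until the queue is full or memory is done.
        while not exhausted and len(queue) < queue_depth:
            chunk = next(source, None)
            if chunk is None:
                exhausted = True
            else:
                queue.append(chunk)
        # Execute side: compute on the oldest fetched chunk (stand-in: sum).
        if queue:
            total += sum(queue.popleft())
    return total
```

Calling `run_decoupled(list(range(10)), chunk_size=4, queue_depth=2)` fetches three chunks and accumulates their sums. The queue is what lets the access side stay ahead of the execute side in this sketch.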

    METHOD, SYSTEM, AND DEVICE FOR NEAR-MEMORY PROCESSING WITH CORES OF A PLURALITY OF SIZES

    Publication No.: US20190041952A1

    Publication date: 2019-02-07

    Application No.: US16107215

    Filing date: 2018-08-21

    Abstract: A device is configured to be in communication with one or more host cores via a first communication path. A first set of processing-in-memory (PIM) cores and a second set of PIM cores are configured to be in communication with a memory included in the device over a second communication path, wherein the first set of PIM cores has greater processing power than the second set of PIM cores, and wherein the second communication path has a greater bandwidth for data transfer than the first communication path. Code offloaded by the one or more host cores is executed in the first set of PIM cores and the second set of PIM cores.

    APPARATUS AND METHOD FOR A TENSOR PERMUTATION ENGINE

    Publication No.: US20210182059A1

    Publication date: 2021-06-17

    Application No.: US17131424

    Filing date: 2020-12-22

    Inventor: Berkin AKIN

    Abstract: An apparatus and method for a tensor permutation engine (TPE). The TPE may include a read address generation unit (AGU) to generate a plurality of read addresses for the plurality of tensor data elements in a first storage and a write AGU to generate a plurality of write addresses for the plurality of tensor data elements in the first storage. The TPE may include a shuffle register bank comprising a register to read tensor data elements from the plurality of read addresses generated by the read AGU, a first register bank to receive the tensor data elements, and a shift register to receive a lowest tensor data element from each bank in the first register bank, each tensor data element in the shift register to be written to a write address from the plurality of write addresses generated by the write AGU.
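The roles of the read and write address generators can be sketched in software. This is a simplified illustration under stated assumptions, not the claimed hardware: it assumes row-major storage, writes to a separate output buffer rather than back into the same storage, and leaves out the shuffle register bank and shift register entirely. The names `strides`, `read_agu`, `write_agu`, and `permute` are hypothetical.

```python
from itertools import product


def strides(shape):
    """Row-major strides (in elements) for a tensor of the given shape."""
    out, step = [], 1
    for dim in reversed(shape):
        out.append(step)
        step *= dim
    return out[::-1]


def read_agu(shape):
    """Yield (index, read address) for every element in row-major order."""
    src = strides(shape)
    for idx in product(*(range(d) for d in shape)):
        yield idx, sum(i * s for i, s in zip(idx, src))


def write_agu(idx, shape, perm):
    """Return the write address of source index `idx` after permuting axes.

    Output axis k takes its extent and its index from source axis perm[k].
    """
    out_shape = [shape[p] for p in perm]
    dst = strides(out_shape)
    return sum(idx[p] * s for p, s in zip(perm, dst))


def permute(data, shape, perm):
    """Permute a flat row-major tensor by pairing read and write addresses."""
    out = [None] * len(data)
    for idx, read_addr in read_agu(shape):
        out[write_agu(idx, shape, perm)] = data[read_addr]
    return out
```

For a 2x3 tensor stored as `[0, 1, 2, 3, 4, 5]`, `permute(data, (2, 3), (1, 0))` produces the flat layout of its 3x2 transpose. Each element's source and destination address is computed independently, which is the only part of the abstract this sketch tries to mirror.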
