Computing accelerator using a lookup table

    公开(公告)号:US10732929B2

    公开(公告)日:2020-08-04

    申请号:US15916196

    申请日:2018-03-08

    Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.

    HBM BASED MEMORY LOOKUP ENGINE FOR DEEP LEARNING ACCELERATOR

    公开(公告)号:US20190187898A1

    公开(公告)日:2019-06-20

    申请号:US15916228

    申请日:2018-03-08

    CPC classification number: G06F3/064 G06F3/0604 G06F3/0673 G06N3/08

    Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.

    HBM BASED MEMORY LOOKUP ENGINE FOR DEEP LEARNING ACCELERATOR

    公开(公告)号:US20230289081A1

    公开(公告)日:2023-09-14

    申请号:US18315821

    申请日:2023-05-11

    CPC classification number: G06F3/064 G06N3/08 G06F3/0673 G06F3/0604

    Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.

    HBM BASED MEMORY LOOKUP ENGINE FOR DEEP LEARNING ACCELERATOR

    公开(公告)号:US20210405877A1

    公开(公告)日:2021-12-30

    申请号:US17473532

    申请日:2021-09-13

    Abstract: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.

    Dataflow accelerator architecture for general matrix-matrix multiplication and tensor computation in deep learning

    公开(公告)号:US11100193B2

    公开(公告)日:2021-08-24

    申请号:US16388860

    申请日:2019-04-18

    Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum the first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.

    Dataflow accelerator architecture for general matrix-matrix multiplication and tensor computation in deep learning

    公开(公告)号:US12164593B2

    公开(公告)日:2024-12-10

    申请号:US17374988

    申请日:2021-07-13

    Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum the first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.

    COMPUTING ACCELERATOR USING A LOOKUP TABLE
    20.
    发明申请

    公开(公告)号:US20190212980A1

    公开(公告)日:2019-07-11

    申请号:US15916196

    申请日:2018-03-08

    Abstract: A computing accelerator using a lookup table. The accelerator may accelerate floating point multiplications by retrieving the fraction portion of the product of two floating-point operands from a lookup table, or by retrieving the product of two floating-point operands of two floating-point operands from a lookup table, or it may retrieve dot products of floating point vectors from a lookup table. The accelerator may be implemented in a three-dimensional memory assembly. It may use approximation, the symmetry of a multiplication lookup table, and zero-skipping to improve performance.

Patent Agency Ranking