Patent search ap:("Samsung Electronics Co., Ltd.") AND inv:"Peng GU"

1.

Invention application
DATAFLOW ACCELERATOR ARCHITECTURE FOR GENERAL MATRIX-MATRIX MULTIPLICATION AND TENSOR COMPUTATION IN DEEP LEARNING
Legal status: Granted

Publication number: US20210374210A1

Publication date: 2021-12-02

Application number: US17374988

Filing date: 2021-07-13

Applicant: Samsung Electronics Co., Ltd.

Inventor: Peng GU, Krishna MALLADI, Hongzhong ZHENG, Dimin NIU

IPC: G06F17/16, G06F12/0877, G06F12/0802, G06N3/063, G06N3/00, G06N3/04

Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum the first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
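Illustrative sketch (not taken from the patent): the lookup idea in the abstract above can be modeled in software as a precomputed product table addressed by one operand as the row and the other as the column, with adders accumulating the fetched entries. The 4-bit operand width and the names LUT, lut_dot, and lut_gemm are assumptions made for this example only, and the hierarchical lookup and systolic propagation are not modeled.

```python
# Hypothetical operand width; the abstract does not state one.
BITS = 4
SIZE = 1 << BITS

# Product table filled once, ahead of time. Entry [a][b] holds a * b,
# so the compute path below only needs lookups and additions.
LUT = [[a * b for b in range(SIZE)] for a in range(SIZE)]


def lut_dot(first_vector, second_vector):
    """Dot product using table lookups and additions, no runtime multiply."""
    if len(first_vector) != len(second_vector):
        raise ValueError("vectors must have the same length")
    total = 0
    for row, col in zip(first_vector, second_vector):
        # Element of the first vector is the row address,
        # element of the second vector is the column address.
        total += LUT[row][col]
    return total


def lut_gemm(a, b):
    """Matrix product built from lut_dot; a is m x k, b is k x n."""
    columns = list(zip(*b))
    return [[lut_dot(row, col) for col in columns] for row in a]


if __name__ == "__main__":
    print(lut_dot([1, 2, 3], [4, 5, 6]))
    print(lut_gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

All operands must fit in the assumed width (0 to 15 here); a wider width grows the table quadratically, which is one reason a hardware design might split lookups hierarchically.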

2.

Invention application
HBM SILICON PHOTONIC TSV ARCHITECTURE FOR LOOKUP COMPUTING AI ACCELERATOR
Legal status: Pending (published)

Publication number: US20190214365A1

Publication date: 2019-07-11

Application number: US15911063

Filing date: 2018-03-02

Applicant: Samsung Electronics Co., Ltd.

Inventor: Peng GU, Krishna MALLADI, Hongzhong ZHENG

IPC: H01L25/065, H01L31/12, H01L31/02, H01L31/0232, H01L25/18, G02F1/01, H04B10/80, H04Q11/00

Abstract: According to one general aspect, an apparatus may include a memory circuit die configured to store a lookup table that converts first data to second data. The apparatus may also include a logic circuit die comprising combinatorial logic circuits configured to receive the second data. The apparatus may further include an optical via coupled between the memory circuit die and the logical circuit die and configured to transfer second data between the memory circuit die and the logic circuit die.

3.

Invention application
DATAFLOW ACCELERATOR ARCHITECTURE FOR GENERAL MATRIX-MATRIX MULTIPLICATION AND TENSOR COMPUTATION IN DEEP LEARNING
Legal status: Pending (published)

Publication number: US20200183837A1

Publication date: 2020-06-11

Application number: US16388863

Filing date: 2019-04-18

Applicant: Samsung Electronics Co., Ltd.

Inventor: Peng GU, Krishna MALLADI, Hongzhong ZHENG, Dimin NIU

IPC: G06F12/0802, G06F17/16

Abstract: A tensor computation dataflow accelerator semiconductor circuit is disclosed. The data flow accelerator includes a DRAM bank and a peripheral array of multiply-and-add units disposed adjacent to the DRAM bank. The peripheral array of multiply-and-add units are configured to form a pipelined dataflow chain in which partial output data from one multiply-and-add unit from among the array of multiply-and-add units is fed into another multiply-and-add unit from among the array of multiply-and-add units for data accumulation. Near-DRAM-processing dataflow (NDP-DF) accelerator unit dies may be stacked atop a base die. The base die may be disposed on a passive silicon interposer adjacent to a processor or a controller. The NDP-DF accelerator units may process partial matrix output data in parallel. The partial matrix output data may be propagated in a forward or backward direction. The tensor computation dataflow accelerator may perform a partial matrix transposition.
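Illustrative sketch (not taken from the patent): the pipelined dataflow chain in the abstract above can be pictured as a row of multiply-and-add units, each adding its own product to the partial output handed over by the previous unit. The class MacUnit and the function run_chain are hypothetical names for this example, and the model is purely sequential, ignoring timing, DRAM access, and die stacking.

```python
class MacUnit:
    """One multiply-and-add unit holding a single stationary weight."""

    def __init__(self, weight):
        self.weight = weight

    def step(self, partial_in, x):
        # Add this unit's product to the partial output received from
        # the previous unit and pass the result on.
        return partial_in + self.weight * x


def run_chain(units, inputs):
    """Feed the partial output of each unit into the next one in the chain."""
    if len(units) != len(inputs):
        raise ValueError("need exactly one input per unit")
    partial = 0
    for unit, x in zip(units, inputs):
        partial = unit.step(partial, x)
    return partial


if __name__ == "__main__":
    chain = [MacUnit(w) for w in [1, 2, 3]]
    print(run_chain(chain, [4, 5, 6]))
```

Reversing the order of the units and inputs gives the same sum, which loosely mirrors the abstract's remark that partial output data may be propagated in a forward or backward direction.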

4.

Invention application
HBM SILICON PHOTONIC TSV ARCHITECTURE FOR LOOKUP COMPUTING AI ACCELERATOR
Legal status: Granted

Publication number: US20220367412A1

Publication date: 2022-11-17

Application number: US17873120

Filing date: 2022-07-25

Applicant: Samsung Electronics Co., Ltd.

Inventor: Peng GU, Krishna MALLADI, Hongzhong ZHENG

IPC: H01L25/065, H01L31/12, H01L31/02, H01L31/0232, H01L25/18, H04B10/80, H04Q11/00, G02F1/01

Abstract: According to one general aspect, an apparatus may include a memory circuit die configured to store a lookup table that converts first data to second data. The apparatus may also include a logic circuit die comprising combinatorial logic circuits configured to receive the second data. The apparatus may further include an optical via coupled between the memory circuit die and the logical circuit die and configured to transfer second data between the memory circuit die and the logic circuit die.

5.

Invention application
DATAFLOW ACCELERATOR ARCHITECTURE FOR GENERAL MATRIX-MATRIX MULTIPLICATION AND TENSOR COMPUTATION IN DEEP LEARNING
Legal status: Pending (published)

Publication number: US20200184001A1

Publication date: 2020-06-11

Application number: US16388860

Filing date: 2019-04-18

Applicant: Samsung Electronics Co., Ltd.

Inventor: Peng GU, Krishna MALLADI, Hongzhong ZHENG, Dimin NIU

IPC: G06F17/16, G06F12/0877

Abstract: A general matrix-matrix multiplication (GEMM) dataflow accelerator circuit is disclosed that includes a smart 3D stacking DRAM architecture. The accelerator circuit includes a memory bank, a peripheral lookup table stored in the memory bank, and a first vector buffer to store a first vector that is used as a row address into the lookup table. The circuit includes a second vector buffer to store a second vector that is used as a column address into the lookup table, and lookup table buffers to receive and store lookup table entries from the lookup table. The circuit further includes adders to sum the first product and a second product, and an output buffer to store the sum. The lookup table buffers determine a product of the first vector and the second vector without performing a multiply operation. The embodiments include a hierarchical lookup architecture to reduce latency. Accumulation results are propagated in a systolic manner.
