-
Publication No.: US20220036243A1
Publication date: 2022-02-03
Application No.: US17147858
Filing date: 2021-01-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das, Sabitha Kusuma, Arnab Roy, Ankur Deshwal, Kiran Kolar Chandrasekharan, Sehwan Lee
Abstract: An apparatus includes a global memory and a systolic array. The global memory is configured to store and provide an input feature map (IFM) vector stream from an IFM tensor and a kernel vector stream from a kernel tensor. The systolic array is configured to receive the IFM vector stream and the kernel vector stream from the global memory. The systolic array is on-chip together with the global memory. The systolic array includes a plurality of processing elements (PEs) each having a plurality of vector units, each of the plurality of vector units being configured to perform a dot-product operation on at least one IFM vector of the IFM vector stream and at least one kernel vector of the kernel vector stream per unit clock cycle to generate a plurality of output feature maps (OFMs).
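Illustration (not part of the patent text): the dot-product scheme in the abstract above can be sketched in a few lines of Python. This is a minimal functional model under stated assumptions, not the claimed hardware. The names `ProcessingElement` and `run_array`, the mapping of one PE per IFM vector and one vector unit per kernel vector, and the single modeled clock cycle are all hypothetical. The sketch does not model the systolic movement of data between PEs or the global memory.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vector lengths differ")
    return sum(x * y for x, y in zip(a, b))


class ProcessingElement:
    """Hypothetical PE: several vector units, one accumulator per unit."""

    def __init__(self, num_vector_units):
        self.acc = [0] * num_vector_units

    def step(self, ifm_vec, kernel_vecs):
        # One modeled clock cycle: each vector unit performs one dot product
        # of the shared IFM vector with its own kernel vector.
        for unit, kernel_vec in enumerate(kernel_vecs):
            self.acc[unit] += dot(ifm_vec, kernel_vec)


def run_array(ifm_stream, kernel_stream):
    """Assumed mapping: one PE per IFM vector, one vector unit per kernel vector."""
    pes = [ProcessingElement(len(kernel_stream)) for _ in ifm_stream]
    for pe, ifm_vec in zip(pes, ifm_stream):
        pe.step(ifm_vec, kernel_stream)
    # Row p, column k holds the OFM value for IFM vector p and kernel vector k.
    return [pe.acc for pe in pes]


ifm_stream = [[1, 2], [3, 4]]             # two IFM vectors (e.g. two pixels)
kernel_stream = [[1, 0], [0, 1], [1, 1]]  # three kernel vectors
ofm = run_array(ifm_stream, kernel_stream)
```

With these inputs each PE yields three OFM values, one per kernel vector, which corresponds to a 1x1 convolution over the channel dimension.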
-
Publication No.: US11915118B2
Publication date: 2024-02-27
Application No.: US18107210
Filing date: 2023-02-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das, Sabitha Kusuma, Sehwan Lee, Ankur Deshwal, Kiran Kolar Chandrasekharan
Abstract: A method and an apparatus for processing layers in a neural network fetch Input Feature Map (IFM) tiles of an IFM tensor and kernel tiles of a kernel tensor, perform a convolutional operation on the IFM tiles and the kernel tiles by exploiting IFM sparsity and kernel sparsity, and generate a plurality of OFM tiles corresponding to the IFM tiles.
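Illustration (not part of the patent text): one simple way to exploit both IFM sparsity and kernel sparsity in a tile-level convolution is to skip every multiply-accumulate in which either operand is zero. The sketch below assumes single-channel 2D tiles, stride 1, and no padding. The function name `sparse_conv2d_tile` and the MAC counter are hypothetical, and the patent's actual tile format and scheduling are not reproduced here.

```python
def sparse_conv2d_tile(ifm_tile, kernel_tile):
    """Valid 2D convolution (cross-correlation form) of one IFM tile with one
    kernel tile, skipping zero operands. Returns (ofm_tile, macs_performed)."""
    kh, kw = len(kernel_tile), len(kernel_tile[0])
    oh = len(ifm_tile) - kh + 1
    ow = len(ifm_tile[0]) - kw + 1

    # Kernel sparsity: keep only the non-zero weights and their positions.
    nonzero_weights = [
        (r, c, w)
        for r, row in enumerate(kernel_tile)
        for c, w in enumerate(row)
        if w != 0
    ]

    ofm_tile = [[0] * ow for _ in range(oh)]
    macs = 0
    for i in range(oh):
        for j in range(ow):
            for r, c, w in nonzero_weights:
                x = ifm_tile[i + r][j + c]
                if x == 0:
                    continue  # IFM sparsity: skip zero activations
                ofm_tile[i][j] += x * w
                macs += 1
    return ofm_tile, macs


ifm_tile = [[1, 0, 2],
            [0, 3, 0],
            [4, 0, 5]]
kernel_tile = [[1, 0],
               [0, 2]]
ofm_tile, macs = sparse_conv2d_tile(ifm_tile, kernel_tile)
```

For this example a dense implementation would perform 16 multiply-accumulates (4 outputs times 4 weights), while the sketch performs 4.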
-
Publication No.: US11604958B2
Publication date: 2023-03-14
Application No.: US16816861
Filing date: 2020-03-12
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das, Sabitha Kusuma, Sehwan Lee, Ankur Deshwal, Kiran Kolar Chandrasekharan
Abstract: A method and an apparatus for processing layers in a neural network fetch Input Feature Map (IFM) tiles of an IFM tensor and kernel tiles of a kernel tensor, perform a convolutional operation on the IFM tiles and the kernel tiles by exploiting IFM sparsity and kernel sparsity, and generate a plurality of OFM tiles corresponding to the IFM tiles.
-