-
Publication No.: US20220036243A1
Publication Date: 2022-02-03
Application No.: US17147858
Filing Date: 2021-01-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Arnab Roy , Ankur Deshwal , Kiran Kolar Chandrasekharan , Sehwan Lee
Abstract: An apparatus includes a global memory and a systolic array. The global memory is configured to store and provide an input feature map (IFM) vector stream from an IFM tensor and a kernel vector stream from a kernel tensor. The systolic array is configured to receive the IFM vector stream and the kernel vector stream from the global memory. The systolic array is on-chip together with the global memory. The systolic array includes a plurality of processing elements (PEs) each having a plurality of vector units, each of the plurality of vector units being configured to perform a dot-product operation on at least one IFM vector of the IFM vector stream and at least one kernel vector of the kernel vector stream per unit clock cycle to generate a plurality of output feature maps (OFMs).
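As an illustrative sketch (not the patented hardware), the per-cycle behavior described above can be modeled in software: each vector unit consumes one IFM vector and one kernel vector per "cycle" and accumulates their dot product toward an OFM element. The names `VECTOR_LEN` and `vector_unit_step` are assumptions for the sketch.

```python
import numpy as np

VECTOR_LEN = 16  # assumed vector-unit width; illustrative only

def vector_unit_step(ifm_vec, kernel_vec, acc):
    # one dot-product operation per unit clock cycle
    return acc + np.dot(ifm_vec, kernel_vec)

# an IFM tensor slice and kernel slice, split into streams of vectors
ifm_stream = np.random.rand(4, VECTOR_LEN)
kern_stream = np.random.rand(4, VECTOR_LEN)

acc = 0.0
for ifm_vec, k_vec in zip(ifm_stream, kern_stream):
    acc = vector_unit_step(ifm_vec, k_vec, acc)
# acc now holds one OFM element: the full inner product over the stream
```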
-
Publication No.: US11423251B2
Publication Date: 2022-08-23
Application No.: US16733314
Filing Date: 2020-01-03
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dinesh Kumar Yadav , Ankur Deshwal , Saptarsi Das , Junwoo Jang , Sehwan Lee
Abstract: A method of performing convolution in a neural network with a variable dilation rate is provided. The method includes receiving a size of a first kernel and a dilation rate, determining at least one size of one or more disintegrated kernels based on the size of the first kernel, a baseline architecture of a memory, and the dilation rate, and determining an address of one or more blocks of an input image based on the dilation rate and one or more parameters associated with a size of the input image and the memory. Thereafter, the one or more blocks of the input image and the one or more disintegrated kernels are fetched from the memory, and an output image is obtained based on convolution of each of the one or more disintegrated kernels with the one or more blocks of the input image.
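A minimal sketch of the underlying equivalence (not the patented addressing or kernel-disintegration scheme): a 1-D convolution with dilation rate r can be computed by expanding the kernel with r−1 zeros between taps. The helper names are illustrative.

```python
import numpy as np

def dilate_kernel(k, rate):
    # insert (rate - 1) zeros between kernel taps
    out = np.zeros((len(k) - 1) * rate + 1)
    out[::rate] = k
    return out

def conv1d_valid(x, k):
    # plain "valid" correlation-style convolution
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

x = np.arange(10, dtype=float)
k = np.array([1.0, 2.0, 3.0])
rate = 2

# dilated convolution realized via the zero-expanded kernel
y = conv1d_valid(x, dilate_kernel(k, rate))
```

Each output element y[i] equals x[i] + 2·x[i+2] + 3·x[i+4], i.e. the kernel taps applied at a stride of the dilation rate.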
-
Publication No.: US20240311009A1
Publication Date: 2024-09-19
Application No.: US18439092
Filing Date: 2024-02-12
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Arnab Roy , Saptarsi Das , Kiran Kolar Chandrasekharan , Yeongon CHO
IPC: G06F3/06
CPC classification number: G06F3/061 , G06F3/0629 , G06F3/0673
Abstract: Disclosed are a method of accessing a memory and an electronic device for performing the method. The electronic device includes a processor, and a memory electrically connected to the processor, wherein the processor may be configured to select a rank including bank groups of the memory, select a bank corresponding to a memory address to be accessed from among banks included in the selected rank, select a row and one or more columns from rows and columns of the selected bank corresponding to the memory address, and generate the memory address to access the memory based on an address mapping scheme according to the selected rank, the selected bank, the selected row, and the selected one or more columns.
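As a hedged illustration of such an address-mapping scheme (field widths and bit ordering are assumptions, not taken from the patent), the selected rank, bank, row, and columns can be packed into a single physical address by bit-field concatenation:

```python
# Assumed field widths; a real device's geometry would differ.
RANK_BITS, BANK_BITS, ROW_BITS, COL_BITS = 1, 4, 15, 10

def map_address(rank, bank, row, col):
    # assumed scheme: | rank | row | bank | column |
    # (bank bits placed low so consecutive accesses spread across banks)
    addr = rank
    addr = (addr << ROW_BITS) | row
    addr = (addr << BANK_BITS) | bank
    addr = (addr << COL_BITS) | col
    return addr

def unmap_address(addr):
    # recover the fields by peeling them off in reverse order
    col = addr & ((1 << COL_BITS) - 1); addr >>= COL_BITS
    bank = addr & ((1 << BANK_BITS) - 1); addr >>= BANK_BITS
    row = addr & ((1 << ROW_BITS) - 1); addr >>= ROW_BITS
    return addr, bank, row, col

addr = map_address(1, 5, 1234, 77)
```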
-
Publication No.: US12039430B2
Publication Date: 2024-07-16
Application No.: US17098589
Filing Date: 2020-11-16
Applicant: Samsung Electronics Co., Ltd.
Inventor: Arnab Roy , Saptarsi Das , Ankur Deshwal , Kiran Kolar Chandrasekharan , Sehwan Lee
CPC classification number: G06N3/045 , G06F7/5443
Abstract: A method for computing an inner product on binary data, ternary data, non-binary data, and non-ternary data using an electronic device. The method includes calculating the inner product on ternary data, designing a fused bitwise data path to support the inner product calculation on the binary data and the ternary data, designing an FPL data path to calculate an inner product between one of the non-binary data and the non-ternary data and one of the binary data and the ternary data, and distributing the inner product calculations between the fused bitwise data path and the FPL data path.
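A sketch of the bitwise idea behind a ternary inner product (the encoding and helper names are illustrative, not the patent's exact data path): each ternary vector in {−1, 0, +1}ⁿ is encoded as two bitmasks (positions of +1 and of −1), after which the dot product reduces to AND plus popcount.

```python
def encode_ternary(vec):
    # two bitplanes: positions of +1 and positions of -1
    pos = neg = 0
    for i, v in enumerate(vec):
        if v == 1:
            pos |= 1 << i
        elif v == -1:
            neg |= 1 << i
    return pos, neg

def ternary_dot(a, b):
    ap, an = a
    bp, bn = b
    # matching signs contribute +1, opposite signs contribute -1
    return (bin(ap & bp).count("1") + bin(an & bn).count("1")
            - bin(ap & bn).count("1") - bin(an & bp).count("1"))

u = [1, -1, 0, 1, -1]
v = [1, 1, -1, 0, -1]
d = ternary_dot(encode_ternary(u), encode_ternary(v))
```

For binary data in {0, 1} the same machinery degenerates to a single AND-and-popcount, which is why a fused bitwise data path can serve both cases.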
-
Publication No.: US11915118B2
Publication Date: 2024-02-27
Application No.: US18107210
Filing Date: 2023-02-08
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Sehwan Lee , Ankur Deshwal , Kiran Kolar Chandrasekharan
Abstract: A method and an apparatus for processing layers in a neural network fetch Input Feature Map (IFM) tiles of an IFM tensor and kernel tiles of a kernel tensor, perform a convolutional operation on the IFM tiles and the kernel tiles by exploiting IFM sparsity and kernel sparsity, and generate a plurality of OFM tiles corresponding to the IFM tiles.
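A minimal sketch of sparsity-aware multiply-accumulate (tile shapes and names are illustrative, not the patented scheme): pair each tile with a nonzero mask and only perform multiplies where both the IFM and kernel entries are nonzero.

```python
import numpy as np

def sparse_mac(ifm_tile, kernel_tile):
    acc = 0.0
    # exploit both IFM sparsity and kernel sparsity
    nz = (ifm_tile != 0) & (kernel_tile != 0)
    for idx in np.argwhere(nz):   # iterate only over nonzero products
        i, j = idx
        acc += ifm_tile[i, j] * kernel_tile[i, j]
    return acc

ifm = np.array([[0.0, 2.0], [3.0, 0.0]])
ker = np.array([[1.0, 0.0], [4.0, 5.0]])
out = sparse_mac(ifm, ker)   # only position (1, 0) contributes: 3 * 4
```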
-
Publication No.: US11854174B2
Publication Date: 2023-12-26
Application No.: US17851704
Filing Date: 2022-06-28
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dinesh Kumar Yadav , Ankur Deshwal , Saptarsi Das , Junwoo Jang , Sehwan Lee
IPC: G06T5/20 , G06N3/08 , G06F18/2111 , G06T1/00
CPC classification number: G06T5/20 , G06F18/2111 , G06N3/08 , G06T1/0007
Abstract: A method of performing convolution in a neural network with a variable dilation rate is provided. The method includes receiving a size of a first kernel and a dilation rate, determining at least one size of one or more disintegrated kernels based on the size of the first kernel, a baseline architecture of a memory, and the dilation rate, and determining an address of one or more blocks of an input image based on the dilation rate and one or more parameters associated with a size of the input image and the memory. Thereafter, the one or more blocks of the input image and the one or more disintegrated kernels are fetched from the memory, and an output image is obtained based on convolution of each of the one or more disintegrated kernels with the one or more blocks of the input image.
-
Publication No.: US11604958B2
Publication Date: 2023-03-14
Application No.: US16816861
Filing Date: 2020-03-12
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Sehwan Lee , Ankur Deshwal , Kiran Kolar Chandrasekharan
Abstract: A method and an apparatus for processing layers in a neural network fetch Input Feature Map (IFM) tiles of an IFM tensor and kernel tiles of a kernel tensor, perform a convolutional operation on the IFM tiles and the kernel tiles by exploiting IFM sparsity and kernel sparsity, and generate a plurality of OFM tiles corresponding to the IFM tiles.
-