-
11.
Publication No.: US11854174B2
Publication date: 2023-12-26
Application No.: US17851704
Application date: 2022-06-28
Applicant: Samsung Electronics Co., Ltd.
Inventor: Dinesh Kumar Yadav , Ankur Deshwal , Saptarsi Das , Junwoo Jang , Sehwan Lee
IPC: G06T5/20 , G06N3/08 , G06F18/2111 , G06T1/00
CPC classification number: G06T5/20 , G06F18/2111 , G06N3/08 , G06T1/0007
Abstract: A method of performing convolution in a neural network with variable dilation rate is provided. The method includes receiving a size of a first kernel and a dilation rate, determining at least one of size of one or more disintegrated kernels based on the size of the first kernel, a baseline architecture of a memory and the dilation rate, determining an address of one or more blocks of an input image based on the dilation rate, and one or more parameters associated with a size of the input image and the memory. Thereafter, the one or more blocks of the input image and the one or more disintegrated kernels are fetched from the memory, and an output image is obtained based on convolution of each of the one or more disintegrated kernels and the one or more blocks of the input image.
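Illustrative sketch (not part of the patent record): the abstract ties both the kernel handling and the input block addresses to the dilation rate. The Python below shows only that dependency, in one dimension. It assumes a well-known phase decomposition, in which a dilated convolution equals a dense convolution applied to each strided slice of the input. It does not reproduce the claimed kernel disintegration or memory addressing, and the function names are hypothetical.

```python
def dilated_conv1d(x, w, d):
    """Direct 1-D dilated convolution (valid padding, no kernel flip)."""
    span = (len(w) - 1) * d + 1          # input elements covered by one output
    return [sum(w[k] * x[i + k * d] for k in range(len(w)))
            for i in range(len(x) - span + 1)]


def dilated_conv1d_by_phase(x, w, d):
    """Same result, computed by running a dense kernel over d strided phases."""
    span = (len(w) - 1) * d + 1
    out = [0] * max(len(x) - span + 1, 0)
    for p in range(d):
        phase = x[p::d]                  # every d-th element, starting at p
        dense = [sum(w[k] * phase[j + k] for k in range(len(w)))
                 for j in range(len(phase) - len(w) + 1)]
        for j, v in enumerate(dense):
            out[p + j * d] = v           # interleave phase results
    return out


x = list(range(10))
w = [1, 0, -1]
print(dilated_conv1d(x, w, 2))
print(dilated_conv1d_by_phase(x, w, 2))
```

Both calls produce the same list, because output position `p + j*d` only ever reads input positions that share the residue `p` modulo `d`.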
-
Publication No.: US20230351151A1
Publication date: 2023-11-02
Application No.: US18219904
Application date: 2023-07-10
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
CPC classification number: G06N3/04 , G06F17/153 , G06F17/16 , G06N3/08 , G06T9/002 , G06F9/3001
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
13.
Publication No.: US11604958B2
Publication date: 2023-03-14
Application No.: US16816861
Application date: 2020-03-12
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Sehwan Lee , Ankur Deshwal , Kiran Kolar Chandrasekharan
Abstract: A method and an apparatus for processing layers in a neural network fetch Input Feature Map (IFM) tiles of an IFM tensor and kernel tiles of a kernel tensor, perform a convolutional operation on the IFM tiles and the kernel tiles by exploiting IFM sparsity and kernel sparsity, and generate a plurality of OFM tiles corresponding to the IFM tiles.
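Illustrative sketch (not part of the patent record): the abstract mentions exploiting both IFM sparsity and kernel sparsity. One simple reading, assumed here, is that a multiply-accumulate is skipped whenever either operand is zero. The function name and the flat tile layout are hypothetical, and the patent's tiling and hardware scheduling are not modelled.

```python
def sparse_dot(ifm_tile, kernel_tile):
    """Dot product that skips zero weights and zero activations.

    Returns (accumulated value, number of multiplies actually performed).
    """
    acc, macs = 0, 0
    for a, w in zip(ifm_tile, kernel_tile):
        if w == 0 or a == 0:   # skip work that cannot change the sum
            continue
        acc += a * w
        macs += 1
    return acc, macs


value, macs = sparse_dot([0, 2, 0, 3], [1, 0, 0, 4])
print(value, macs)
```

Here only one of the four element pairs has two non-zero operands, so a single multiply is performed.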
-
Publication No.: US11521039B2
Publication date: 2022-12-06
Application No.: US16168418
Application date: 2018-10-23
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Hyunsun Park , Wonjo Lee , Sehwan Lee , Seungwon Lee
IPC: G06N3/04 , G06N3/08 , G06F17/15 , G06N5/04 , G06F16/901
Abstract: A processor-implemented neural network method includes obtaining a plurality of kernels and an input feature map; determining a pruning index indicating a weight location where pruning is to be performed commonly within the plurality of kernels; and performing a Winograd-based convolution operation by pruning a weight corresponding to the determined pruning index with respect to each of the plurality of kernels.
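Illustrative sketch (not part of the patent record): the Python below uses the standard 1-D Winograd F(2,3) transform to show what a pruning index shared by several kernels could look like. Two points are assumptions, not taken from the abstract: pruning is applied to the transformed (Winograd-domain) weights, and the index is chosen as the position with the smallest total magnitude across kernels. All function names are hypothetical.

```python
from fractions import Fraction


def transform_kernel(g):
    """Winograd F(2,3) kernel transform: 3 weights -> 4 transformed weights."""
    g0, g1, g2 = (Fraction(v) for v in g)
    return [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]


def transform_input(d):
    """Winograd F(2,3) input transform for a 4-element input tile."""
    d0, d1, d2, d3 = d
    return [d0 - d2, d1 + d2, d2 - d1, d1 - d3]


def common_pruning_index(transformed_kernels):
    """Hypothetical criterion: position with the smallest summed magnitude."""
    totals = [sum(abs(u[i]) for u in transformed_kernels) for i in range(4)]
    return totals.index(min(totals))


def winograd_f23(d, u, pruned=None):
    """Two outputs from a 4-element tile; the pruned position is never multiplied."""
    v = transform_input(d)
    m = [0 if i == pruned else u[i] * v[i] for i in range(4)]
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


kernels = [transform_kernel(g) for g in ([1, 2, 1], [2, 4, 2])]
idx = common_pruning_index(kernels)
print(idx, [winograd_f23([1, 2, 3, 4], u, pruned=idx) for u in kernels])
```

Because the same index is pruned in every kernel, the element-wise multiply at that position can be dropped for the whole kernel set rather than tracked per kernel.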
-
Publication No.: US20190121567A1
Publication date: 2019-04-25
Application No.: US15984611
Application date: 2018-05-21
Applicant: Samsung Electronics Co., Ltd.
Inventor: Jong Hwa KIM , Sehwan Lee
IPC: G06F3/06
CPC classification number: G06F3/0647 , G06F3/0604 , G06F3/0616 , G06F3/0625 , G06F3/064 , G06F3/0644 , G06F3/0656 , G06F3/0679 , G06F3/0688
Abstract: A data storage device may be provided that includes a buffer, a non-volatile memory, and a controller. The buffer is configured to receive and store first information, including first data and a first stream class number identifying characteristics of the first data, and second information, including second data and a second stream class number identifying characteristics of the second data, the second stream class number being different from the first stream class number. The non-volatile memory includes a shared memory area and a dedicated memory area different from the shared memory area, and is configured to store the first and second data stored in the buffer. The controller is configured to control the buffer and the non-volatile memory, to store the first and second data in the shared memory area, and then to migrate the first data stored in the shared memory area to the dedicated memory area.
-
Publication No.: US12086700B2
Publication date: 2024-09-10
Application No.: US16552619
Application date: 2019-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
CPC classification number: G06N3/04 , G06F17/153 , G06F17/16 , G06N3/08 , G06T9/002 , G06F9/3001
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
Publication No.: US11971823B2
Publication date: 2024-04-30
Application No.: US17317339
Application date: 2021-05-11
Applicant: Samsung Electronics Co., Ltd.
Inventor: Yoojin Kim , Channoh Kim , Hyun Sun Park , Sehwan Lee , Jun-Woo Jang
IPC: G06F12/0862 , G06F1/04 , G06F12/0804 , G06F17/15 , G06F17/16 , G06N3/04 , G06N3/045 , G06N3/063 , G06N3/08
CPC classification number: G06F12/0862 , G06F12/0804 , G06F1/04 , G06F2212/1021 , G06N3/04
Abstract: A computing method and device with data sharing are provided. The method includes loading, by a loader, input data of an input feature map stored in a memory in loading units according to a loading order, storing, by a buffer controller, the loaded input data in a reuse buffer of an address rotationally allocated according to the loading order, and transmitting, by each of a plurality of senders, to an executer respective input data corresponding to each output data of respective convolution operations among the input data stored in the reuse buffer, wherein portions of the transmitted respective input data overlap each other.
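Illustrative sketch (not part of the patent record): the Python below models the rotational address allocation as a plain ring buffer and lets several senders read overlapping windows from it. The class and function names, the one-unit offset between senders, and the window width are all assumptions made for illustration. The abstract does not specify them.

```python
class ReuseBuffer:
    """Ring buffer: each loaded unit goes to the next address, wrapping around."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.loaded = 0                        # units loaded so far

    def store(self, unit):
        addr = self.loaded % len(self.slots)   # rotationally allocated address
        self.slots[addr] = unit
        self.loaded += 1
        return addr

    def read(self, index):
        """Read a unit by its position in the loading order."""
        oldest = max(self.loaded - len(self.slots), 0)
        if not oldest <= index < self.loaded:
            raise IndexError("unit not loaded yet or already overwritten")
        return self.slots[index % len(self.slots)]


def send_windows(buf, first, width, senders):
    """Sender s receives units first+s .. first+s+width-1, so windows overlap."""
    return [[buf.read(first + s + j) for j in range(width)]
            for s in range(senders)]


buf = ReuseBuffer(capacity=4)
addresses = [buf.store(v) for v in (10, 20, 30, 40, 50)]
print(addresses)
print(send_windows(buf, first=1, width=3, senders=2))
```

The fifth unit wraps around to address 0, and the two senders share two of their three units, so each shared unit is loaded from memory only once.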
-
Publication No.: US11783162B2
Publication date: 2023-10-10
Application No.: US16552619
Application date: 2019-08-27
Applicant: Samsung Electronics Co., Ltd.
Inventor: Ilia Ovsiannikov , Ali Shafiee Ardestani , Joseph H. Hassoun , Lei Wang , Sehwan Lee , JoonHo Song , Jun-Woo Jang , Yibing Michelle Wang , Yuecheng Li
CPC classification number: G06N3/04 , G06F17/153 , G06F17/16 , G06N3/08 , G06T9/002 , G06F9/3001
Abstract: A neural processor. In some embodiments, the processor includes a first tile, a second tile, a memory, and a bus. The bus may be connected to the memory, the first tile, and the second tile. The first tile may include: a first weight register, a second weight register, an activations buffer, a first multiplier, and a second multiplier. The activations buffer may be configured to include: a first queue connected to the first multiplier and a second queue connected to the second multiplier. The first queue may include a first register and a second register adjacent to the first register, the first register being an output register of the first queue. The first tile may be configured: in a first state: to multiply, in the first multiplier, a first weight by an activation from the output register of the first queue, and in a second state: to multiply, in the first multiplier, the first weight by an activation from the second register of the first queue.
-
Publication No.: US11663473B2
Publication date: 2023-05-30
Application No.: US17112041
Application date: 2020-12-04
Applicant: SAMSUNG ELECTRONICS CO., LTD.
Inventor: Joonho Song , Sehwan Lee , Junwoo Jang
CPC classification number: G06N3/08 , G06F17/153 , G06N3/04 , G06N3/045
Abstract: A neural network apparatus configured to perform a deconvolution operation includes a memory configured to store a first kernel; and a processor configured to: obtain, from the memory, the first kernel; calculate a second kernel by adjusting an arrangement of matrix elements comprised in the first kernel; generate sub-kernels by dividing the second kernel; perform a convolution operation between an input feature map and the sub-kernels using a convolution operator; and generate an output feature map, as a deconvolution of the input feature map, by merging results of the convolution operation.
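Illustrative sketch (not part of the patent record): the steps in the abstract (rearrange the kernel, divide it into sub-kernels, convolve, merge) can be tried out in one dimension. The Python below assumes that the rearrangement is a simple reversal, that the division takes every `s`-th element for a stride `s`, and that merging interleaves the partial results. These choices follow the usual sub-kernel view of transposed convolution and are not confirmed by the abstract. Function names are hypothetical.

```python
def transposed_conv1d(x, w, s):
    """Reference deconvolution: scatter each input sample through the kernel."""
    out = [0] * ((len(x) - 1) * s + len(w))
    for i, xv in enumerate(x):
        for k, wv in enumerate(w):
            out[i * s + k] += xv * wv
    return out


def correlate_valid(x, h):
    """Ordinary sliding-window multiply-accumulate (the convolution operator)."""
    return [sum(x[m + q] * h[q] for q in range(len(h)))
            for m in range(len(x) - len(h) + 1)]


def deconv_by_subkernels(x, w, s):
    k = len(w)
    flipped = w[::-1]                          # rearranged "second" kernel
    out = [0] * ((len(x) - 1) * s + k)
    for r in range(s):                         # r = output phase
        sub = flipped[(k - 1 - r) % s::s]      # sub-kernel feeding phase r
        if not sub:
            continue                           # phase receives no weights
        pad = [0] * (len(sub) - 1)
        partial = correlate_valid(pad + x + pad, sub)
        for m, v in enumerate(partial):
            out[r + s * m] = v                 # merge by interleaving
    return out


x, w = [1, 2, 3], [1, 2, 3, 4]
print(transposed_conv1d(x, w, 2))
print(deconv_by_subkernels(x, w, 2))
```

Both functions return the same list. The second one never multiplies by inserted zeros, which is the usual motivation for running a deconvolution on a convolution operator through sub-kernels.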
-
Publication No.: US20220036243A1
Publication date: 2022-02-03
Application No.: US17147858
Application date: 2021-01-13
Applicant: Samsung Electronics Co., Ltd.
Inventor: Saptarsi Das , Sabitha Kusuma , Arnab Roy , Ankur Deshwal , Kiran Kolar Chandrasekharan , Sehwan Lee
Abstract: An apparatus includes a global memory and a systolic array. The global memory is configured to store and provide an input feature map (IFM) vector stream from an IFM tensor and a kernel vector stream from a kernel tensor. The systolic array is configured to receive the IFM vector stream and the kernel vector stream from the global memory. The systolic array is on-chip together with the global memory. The systolic array includes a plurality of processing elements (PEs) each having a plurality of vector units, each of the plurality of vector units being configured to perform a dot-product operation on at least one IFM vector of the IFM vector stream and at least one kernel vector of the kernel vector stream per unit clock cycle to generate a plurality of output feature maps (OFMs).
-