Processor with outlier accommodation

    Publication Number: US12229659B2

    Publication Date: 2025-02-18

    Application Number: US17110266

    Application Date: 2020-12-02

    Abstract: A system and method for performing sets of multiplications in a manner that accommodates outlier values. In some embodiments, the method includes: forming a first set of products, each product of the first set of products being a product of a first activation value and a respective weight of a first plurality of weights. The forming of the first set of products may include multiplying, in a first multiplier, the first activation value and a least significant sub-word of a first weight to form a first partial product; multiplying, in a second multiplier, the first activation value and a least significant sub-word of a second weight; multiplying, in a third multiplier, the first activation value and a most significant sub-word of the first weight to form a second partial product; and adding the first partial product and the second partial product.
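
The sub-word scheme in the abstract can be illustrated in software: a weight is split into least and most significant sub-words, each is multiplied by the activation separately, and the high partial product is shifted back into place before the add. This is a minimal sketch; the 4-bit sub-word width, the function names, and the plain integer arithmetic are illustrative assumptions, not the claimed hardware.

```python
# Illustrative sketch only: sub-word width and names are assumptions.
SUB_WORD_BITS = 4

def split_weight(w: int) -> tuple[int, int]:
    """Split a weight into (least significant, most significant) sub-words."""
    lsw = w & ((1 << SUB_WORD_BITS) - 1)
    msw = w >> SUB_WORD_BITS
    return lsw, msw

def multiply_with_subwords(activation: int, weight: int) -> int:
    """Form a product from two partial products, as the abstract describes."""
    lsw, msw = split_weight(weight)
    first_partial = activation * lsw    # low sub-word ("first multiplier")
    second_partial = activation * msw   # high sub-word ("third multiplier")
    # Shift the high partial product into place before adding.
    return first_partial + (second_partial << SUB_WORD_BITS)

# Sanity check against a direct multiply:
assert multiply_with_subwords(7, 0xB3) == 7 * 0xB3
```

The second multiplier in the abstract handles the low sub-word of a *different* weight in parallel, which a sequential sketch like this cannot show; the point here is only how the partial products of one weight recombine.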

    Runtime reconfigurable compression format conversion

    Publication Number: US12224774B2

    Publication Date: 2025-02-11

    Application Number: US18096551

    Application Date: 2023-01-12

    Abstract: A runtime data-format optimizer for a processing element includes a sparsity-detector and a compression-converter. The sparsity-detector selects a first compression-conversion format during a runtime of the processing element based on a performance model that is based on a first sparsity pattern of first data stored in a first memory that is exterior to the processing element and a second sparsity pattern of second data that is to be stored in a second memory within the processing element. The second sparsity pattern is based on a runtime configuration of the processing element. The first data is stored in the first memory using a first compression format and the second data is to be stored in the second memory using a second compression format. The compression-converter converts the first compression format of the first data to the second compression format of the second data based on the first compression-conversion format.
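
As a rough software analogue of the runtime format selection described above, the sketch below picks a target compression format from a toy sparsity model and converts dense data accordingly. The format names (`bitmap`, `coo`), the 0.5 sparsity threshold, and the "performance model" are all illustrative assumptions, not the patented mechanism.

```python
# Illustrative sketch only: formats and threshold are assumptions.
def sparsity(values):
    """Fraction of zero elements."""
    return sum(1 for v in values if v == 0) / len(values)

def select_format(values):
    # Toy "performance model": index-value pairs (COO-like) when mostly
    # zero, a bitmap plus packed non-zeros otherwise.
    return "coo" if sparsity(values) > 0.5 else "bitmap"

def convert(values):
    """Convert dense data to the selected compression format."""
    fmt = select_format(values)
    if fmt == "coo":
        return fmt, [(i, v) for i, v in enumerate(values) if v != 0]
    mask = [int(v != 0) for v in values]
    return fmt, (mask, [v for v in values if v != 0])

convert([0, 0, 3, 0, 0, 0, 7, 0])  # → ("coo", [(2, 3), (6, 7)])
```

In the patented system the selection also depends on the destination memory's runtime configuration; this sketch folds everything into a single sparsity ratio for brevity.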

    PROCESSOR WITH OUTLIER ACCOMMODATION

    Publication Number: US20220114425A1

    Publication Date: 2022-04-14

    Application Number: US17110266

    Application Date: 2020-12-02

    Abstract: A system and method for performing sets of multiplications in a manner that accommodates outlier values. In some embodiments, the method includes: forming a first set of products, each product of the first set of products being a product of a first activation value and a respective weight of a first plurality of weights. The forming of the first set of products may include multiplying, in a first multiplier, the first activation value and a least significant sub-word of a first weight to form a first partial product; multiplying, in a second multiplier, the first activation value and a least significant sub-word of a second weight; multiplying, in a third multiplier, the first activation value and a most significant sub-word of the first weight to form a second partial product; and adding the first partial product and the second partial product.

    Mixed-precision neural processing unit (NPU) using spatial fusion with load balancing

    Publication Number: US12001929B2

    Publication Date: 2024-06-04

    Application Number: US16898433

    Application Date: 2020-06-10

    CPC classification number: G06N20/00 H04L67/1001

    Abstract: According to one general aspect, an apparatus may include a machine learning system. The machine learning system may include a precision determination circuit configured to: determine a precision level of data, and divide the data into a data subdivision. The machine learning system may exploit sparsity during the computation of each subdivision. The machine learning system may include a load balancing circuit configured to select a load balancing technique, wherein the load balancing technique includes alternately loading the computation circuit with at least a first data/weight subdivision combination and a second data/weight subdivision combination. The load balancing circuit may be configured to load a computation circuit with a selected data subdivision and a selected weight subdivision based, at least in part, upon the load balancing technique. The machine learning system may include a computation circuit configured to compute a partial computation result based, at least in part, upon the selected data subdivision and the weight subdivision.
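
The subdivision and alternating-load idea above can be sketched as follows. The 8-bit precision boundary, the two-way subdivision, the round-robin pairing, and the zero-skipping partial computation are all assumptions made for illustration, not the patented circuit.

```python
# Illustrative sketch only: precision boundary and policies are assumptions.
def subdivide_by_precision(values, bits=8):
    """Split values into a low-precision and a high-precision subdivision."""
    limit = 1 << (bits - 1)
    low = [v for v in values if -limit <= v < limit]
    high = [v for v in values if not (-limit <= v < limit)]
    return [low, high]

def alternating_schedule(data_subs, weight_subs):
    """Alternately pair data and weight subdivisions for the compute circuit."""
    return [(d, weight_subs[i % len(weight_subs)])
            for i, d in enumerate(data_subs)]

def partial_result(data_sub, weight_sub):
    """One partial computation; skipping zeros exploits sparsity."""
    return sum(d * w for d, w in zip(data_sub, weight_sub) if d and w)
```

The actual NPU fuses subdivisions spatially across multipliers; this sequential sketch only shows the scheduling shape, i.e. that each compute step receives one data-subdivision/weight-subdivision pair.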

    SYSTEM AND METHOD FOR PERFORMING COMPUTATIONS FOR DEEP NEURAL NETWORKS

    Publication Number: US20230047273A1

    Publication Date: 2023-02-16

    Application Number: US17966488

    Application Date: 2022-10-14

    Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel along an input dimension of the array during a first processing period, and second input values are provided in parallel along the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. An adder coupled to a first set of the PE units generates a first sum of the results of the computations by the first set of PE units during the first processing period, and generates a second sum of the results of the computations during the second processing period. A first accumulator coupled to the adder stores the first sum, and shifts the first sum to a second accumulator prior to storing the second sum.
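
A behavioral sketch of the adder/accumulator flow described above: each processing period, the PEs multiply their stored weights by that period's inputs, an adder sums the column, and the first accumulator shifts its previous sum to the second accumulator before storing the new one. The class and attribute names are hypothetical.

```python
# Illustrative sketch only: one PE column with a two-stage accumulator chain.
class PEColumn:
    def __init__(self, weights):
        self.weights = weights   # one stored weight per PE
        self.acc1 = 0            # first accumulator
        self.acc2 = 0            # second accumulator (shift target)

    def cycle(self, inputs):
        """One processing period: multiply, sum, shift, store."""
        s = sum(w * x for w, x in zip(self.weights, inputs))  # adder
        self.acc2 = self.acc1    # shift the prior sum downstream
        self.acc1 = s            # store the new sum

col = PEColumn([1, 2, 3])
col.cycle([1, 1, 1])   # acc1 holds 1+2+3 = 6
col.cycle([2, 0, 1])   # acc1 holds 2+0+3 = 5; acc2 now holds 6
```

The shift-before-store ordering is the key detail: it lets the second accumulator drain the previous period's sum while the column computes the next one.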

    Accelerating 2D convolutional layer mapping on a dot product architecture

    Publication Number: US12112141B2

    Publication Date: 2024-10-08

    Application Number: US16900819

    Application Date: 2020-06-12

    CPC classification number: G06F7/5443 G06F9/30105 G06N3/063

    Abstract: A method for performing a convolution operation includes storing a convolution kernel in a first storage device, the convolution kernel having dimensions x by y; storing, in a second storage device, a first subset of element values of an input feature map having dimensions n by m; performing a first simultaneous multiplication of each value of the first subset of element values of the input feature map with a first element value from among the x*y elements of the convolution kernel; for each remaining value of the x*y elements of the convolution kernel, performing a simultaneous multiplication of the remaining value with a corresponding subset of element values of the input feature map; for each simultaneous multiplication, storing a result of the simultaneous multiplication in an accumulator; and outputting the values of the accumulator as a first row of an output feature map.
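
The kernel-element-wise accumulation described above can be sketched in software: each of the x*y kernel elements is broadcast against the matching subset of the input feature map and accumulated into one output row. Valid-convolution boundaries and the nested-list representation are assumptions made for illustration.

```python
# Illustrative sketch only: one output row of a valid 2D convolution,
# computed one kernel element at a time as the abstract describes.
def conv_row(ifm, kernel, row):
    x, y = len(kernel), len(kernel[0])
    n_out = len(ifm[0]) - y + 1
    acc = [0] * n_out                # accumulator for one output row
    for i in range(x):
        for j in range(y):
            k = kernel[i][j]
            # "Simultaneous multiplication": one kernel element against a
            # contiguous subset of input-feature-map values.
            subset = ifm[row + i][j:j + n_out]
            acc = [a + k * v for a, v in zip(acc, subset)]
    return acc

ifm = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
conv_row(ifm, [[1, 0], [0, 1]], 0)   # → [6, 8]
```

On the dot-product hardware, the inner list comprehension corresponds to one broadcast multiply across all lanes per cycle, so an x-by-y kernel needs only x*y such steps per output row.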

    System and method for performing computations for deep neural networks

    Publication Number: US11507817B2

    Publication Date: 2022-11-22

    Application Number: US16900845

    Application Date: 2020-06-12

    Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel along an input dimension of the array during a first processing period, and second input values are provided in parallel along the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. An adder coupled to a first set of the PE units generates a first sum of the results of the computations by the first set of PE units during the first processing period, and generates a second sum of the results of the computations during the second processing period. A first accumulator coupled to the adder stores the first sum, and shifts the first sum to a second accumulator prior to storing the second sum.
