PROCESSOR FOR FINE-GRAIN SPARSE INTEGER AND FLOATING-POINT OPERATIONS

    Publication No.: US20220147312A1

    Publication Date: 2022-05-12

    Application No.: US17131357

    Filing Date: 2020-12-22

    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.
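    As a rough illustration of the sub-word scheme the abstract describes, the following Python sketch (hypothetical names, not the patented hardware) splits a weight into a least significant and a most significant sub-word, multiplies each by the activation in a narrow multiplier, and adds the shifted partial products:

```python
def subword_multiply(activation: int, weight: int, sub_bits: int = 4) -> int:
    """Multiply an activation by a weight via two sub-word partial products.

    Illustrative sketch: the weight is split into a least significant and a
    most significant sub-word, each is multiplied by the activation, and the
    shifted partial products are added, as the abstract describes.
    """
    mask = (1 << sub_bits) - 1
    lsw = weight & mask                # least significant sub-word
    msw = (weight >> sub_bits) & mask  # most significant sub-word
    first_partial = activation * lsw
    second_partial = activation * msw
    # the most significant partial product is shifted up before the add
    return first_partial + (second_partial << sub_bits)

# Agrees with a direct multiply:
assert subword_multiply(13, 0xB7) == 13 * 0xB7
```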

    Runtime reconfigurable compression format conversion with bit-plane granularity

    Publication No.: US12231152B2

    Publication Date: 2025-02-18

    Application No.: US18096557

    Filing Date: 2023-01-12

    Abstract: A runtime bit-plane data-format optimizer for a processing element includes a sparsity-detector and a compression-converter. The sparsity-detector selects a bit-plane compression-conversion format during a runtime of the processing element using a performance model that is based on a first sparsity pattern of first bit-plane data stored in a memory exterior to the processing element and a second sparsity pattern of second bit-plane data that is to be stored in a memory within the processing element. The second sparsity pattern is based on a runtime configuration of the processing element. The first bit-plane data is stored using a first bit-plane compression format and the second bit-plane data is to be stored using a second bit-plane compression format. The compression-converter converts the first bit-plane compression format of the first data into the second bit-plane compression format of the second data.
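    A minimal sketch of bit-plane decomposition and a bit-plane sparsity pattern, assuming unsigned 8-bit data; the function names are illustrative, and the actual compression formats the converter handles are not described here:

```python
import numpy as np

def to_bit_planes(values, bits=8):
    """Decompose a vector of unsigned integers into its bit-planes."""
    v = np.asarray(values, dtype=np.uint8)
    # plane b holds bit b of every value; plane 0 is the LSB plane
    return np.stack([(v >> b) & 1 for b in range(bits)])

def nonzero_plane_mask(planes):
    """Sparsity pattern at bit-plane granularity: which planes hold any 1s.

    A format converter could keep only these planes and skip the all-zero
    ones, which is the kind of pattern a performance model would consume.
    """
    return planes.any(axis=1)

planes = to_bit_planes([1, 2, 3])
mask = nonzero_plane_mask(planes)   # only the two lowest planes are nonzero
```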

    ACCELERATING 2D CONVOLUTIONAL LAYER MAPPING ON A DOT PRODUCT ARCHITECTURE

    Publication No.: US20250028505A1

    Publication Date: 2025-01-23

    Application No.: US18908555

    Filing Date: 2024-10-07

    Abstract: A method for performing a convolution operation includes storing a convolution kernel in a first storage device, the convolution kernel having dimensions x by y; storing, in a second storage device, a first subset of element values of an input feature map having dimensions n by m; performing a first simultaneous multiplication of each value of the first subset of element values of the input feature map with a first element value from among the x*y elements of the convolution kernel; for each remaining value of the x*y elements of the convolution kernel, performing a simultaneous multiplication of the remaining value with a corresponding subset of element values of the input feature map; for each simultaneous multiplication, storing a result of the simultaneous multiplication in an accumulator; and outputting the values of the accumulator as a first row of an output feature map.
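    The accumulation order described above, where one kernel element is multiplied simultaneously against a whole subset of the input feature map and this is repeated for all x*y kernel elements, can be sketched in NumPy as follows (an illustrative software model, not the claimed dot-product hardware):

```python
import numpy as np

def conv2d_by_kernel_element(ifm, kernel):
    """Valid 2-D convolution (CNN-style cross-correlation), accumulated one
    kernel element at a time rather than one output pixel at a time."""
    n, m = ifm.shape
    x, y = kernel.shape
    out_h, out_w = n - x + 1, m - y + 1
    acc = np.zeros((out_h, out_w), dtype=ifm.dtype)
    for i in range(x):
        for j in range(y):
            # one "simultaneous multiplication": a single kernel element
            # times the matching subset of input-feature-map values
            acc += kernel[i, j] * ifm[i:i + out_h, j:j + out_w]
    return acc
```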

    Low overhead implementation of Winograd for CNN with 3x3, 1x3 and 3x1 filters on weight station dot-product based CNN accelerators

    Publication No.: US12158923B2

    Publication Date: 2024-12-03

    Application No.: US16898422

    Filing Date: 2020-06-10

    Abstract: A system and a method are disclosed for forming an output feature map (OFM). Activation values in an input feature map (IFM) are selected and transformed on-the-fly into the Winograd domain. Elements in a Winograd filter are selected that respectively correspond to the transformed activation values. A transformed activation value is multiplied by a corresponding element of the Winograd filter to form a corresponding product value in the Winograd domain. Activation values are repeatedly selected, transformed and multiplied by a corresponding element in the Winograd filter to form corresponding product values in the Winograd domain until all activation values in the IFM have been transformed and multiplied by the corresponding element. The product values are summed in the Winograd domain to form elements of a feature map in the Winograd domain. The elements of the feature map in the Winograd domain are inverse-Winograd transformed on-the-fly to form the OFM.
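    For intuition, the standard Winograd F(2,3) construction (a textbook transform, not taken from the patent text) computes two outputs of a 3-tap filter using four element-wise products in the Winograd domain:

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of a 3-tap correlation via the Winograd domain."""
    u = G @ g        # filter transformed into the Winograd domain
    v = B_T @ d      # activations transformed on-the-fly
    m = u * v        # four element-wise products in the Winograd domain
    return A_T @ m   # inverse transform back to output-feature-map values
```

With d = [1, 2, 3, 4] and g = [1, 1, 1], this yields [6, 9], matching the direct 3-tap sums 1+2+3 and 2+3+4.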

    CIRCUIT FOR HANDLING PROCESSING WITH OUTLIERS

    Publication No.: US20220414421A1

    Publication Date: 2022-12-29

    Application No.: US17493492

    Filing Date: 2021-10-04

    Abstract: A system and method for handling processing with outliers. In some embodiments, the method includes: reading a first activation and a second activation, each including a least significant part and a most significant part, multiplying a first weight and a second weight by the respective activations, the multiplying of the first weight by the first activation including multiplying the first weight by the least significant part of the first activation in a first multiplier, the multiplying of the second weight by the second activation including: multiplying the second weight by the least significant part of the second activation in a second multiplier, and multiplying the second weight by the most significant part of the second activation in a shared multiplier, the shared multiplier being associated with a plurality of rows of an array of activations.
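    A simple software model of the outlier idea, assuming 4-bit least significant parts and illustrative function names: typical small activations occupy only a narrow per-row multiplier, while the rare nonzero most significant parts ("outliers") are routed to a shared multiplier:

```python
def outlier_dot(weights, activations, sub_bits=4):
    """Sketch of outlier-aware multiply-accumulate.

    Each activation's least significant part uses its row's own narrow
    multiplier; a nonzero most significant part is an outlier handled by
    a multiplier shared across rows, as the abstract describes.
    """
    mask = (1 << sub_bits) - 1
    total = 0
    for w, a in zip(weights, activations):
        lsp = a & mask         # least significant part: per-row multiplier
        msp = a >> sub_bits    # most significant part
        total += w * lsp
        if msp:                # only outliers occupy the shared multiplier
            total += (w * msp) << sub_bits
    return total

# Matches the plain dot product:
assert outlier_dot([3, 2], [5, 31]) == 3 * 5 + 2 * 31
```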

    ACCELERATING 2D CONVOLUTIONAL LAYER MAPPING ON A DOT PRODUCT ARCHITECTURE

    Publication No.: US20210182025A1

    Publication Date: 2021-06-17

    Application No.: US16900819

    Filing Date: 2020-06-12

    Abstract: A method for performing a convolution operation includes storing a convolution kernel in a first storage device, the convolution kernel having dimensions x by y; storing, in a second storage device, a first subset of element values of an input feature map having dimensions n by m; performing a first simultaneous multiplication of each value of the first subset of element values of the input feature map with a first element value from among the x*y elements of the convolution kernel; for each remaining value of the x*y elements of the convolution kernel, performing a simultaneous multiplication of the remaining value with a corresponding subset of element values of the input feature map; for each simultaneous multiplication, storing a result of the simultaneous multiplication in an accumulator; and outputting the values of the accumulator as a first row of an output feature map.

    System and method for increasing utilization of dot-product based neural network accelerator

    Publication No.: US12136031B2

    Publication Date: 2024-11-05

    Application No.: US18320133

    Filing Date: 2023-05-18

    Abstract: A method of flattening channel data of an input feature map in an inference system includes retrieving pixel values of a channel of a plurality of channels of the input feature map from a memory and storing the pixel values in a buffer, extracting first values of a first region having a first size from among the pixel values stored in the buffer, the first region corresponding to an overlap region of a kernel of the inference system with channel data of the input feature map, rearranging second values corresponding to the overlap region of the kernel from among the first values in the first region, and identifying a first group of consecutive values from among the rearranged second values for supplying to a first dot-product circuit of the inference system.
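    An im2col-style sketch of the flattening step, assuming a square k-by-k kernel and illustrative names: each overlap region of one channel is extracted and rearranged into consecutive values, ready to feed a dot-product circuit:

```python
import numpy as np

def flatten_overlap_regions(channel, k):
    """Rearrange each k-by-k overlap region of one channel into a row of
    consecutive values (im2col-style sketch of the flattening described)."""
    n, m = channel.shape
    rows = []
    for i in range(n - k + 1):
        for j in range(m - k + 1):
            # extract the kernel's overlap region, then flatten it so its
            # values are consecutive for a dot-product circuit
            rows.append(channel[i:i + k, j:j + k].reshape(-1))
    return np.stack(rows)

# For a 3x3 channel and a 2x2 kernel, the first overlap region
# [[0, 1], [3, 4]] flattens to the consecutive group [0, 1, 3, 4].
```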

    Processor for fine-grain sparse integer and floating-point operations

    Publication No.: US11861327B2

    Publication Date: 2024-01-02

    Application No.: US17131357

    Filing Date: 2020-12-22

    CPC classification number: G06F7/4876 G06F7/485 G06F7/4836 G06F7/5443 G06N3/063

    Abstract: A processor for fine-grain sparse integer and floating-point operations and method of operation thereof are provided. In some embodiments, the method includes forming a first set of products and forming a second set of products. The forming of the first set of products may include: multiplying, in a first multiplier, a first activation value by a least significant sub-word and a most significant sub-word of a first weight to form a first partial product and a second partial product; and adding the first partial product and the second partial product. The forming of the second set of products may include: multiplying, in the first multiplier, a second activation value by a first sub-word and a second sub-word of a mantissa to form a third partial product and a fourth partial product; and adding the third partial product and the fourth partial product.

    Signed multiplication using unsigned multiplier with dynamic fine-grained operand isolation

    Publication No.: US11579842B2

    Publication Date: 2023-02-14

    Application No.: US17151115

    Filing Date: 2021-01-15

    Abstract: An N×N multiplier may include an N/2×N first multiplier, an N/2×N/2 second multiplier, and an N/2×N/2 third multiplier. The N×N multiplier receives two operands to multiply. The first, second and/or third multipliers are selectively disabled if an operand equals zero or has a small value. If the operands are both less than 2^(N/2), the second or the third multiplier is used to multiply the operands. If one operand is less than 2^(N/2) and the other operand is equal to or greater than 2^(N/2), the first multiplier is used, or the second and third multipliers are used, to multiply the operands. If both operands are equal to or greater than 2^(N/2), the first, second and third multipliers are used to multiply the operands.
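    The decomposition can be modeled in software as follows, a sketch under the assumption that the first multiplier handles an N/2-wide operand against a full N-wide operand, with function and variable names invented for illustration:

```python
def composed_multiply(a: int, b: int, n: int = 8) -> int:
    """N x N multiply built from an N/2 x N first multiplier and two
    N/2 x N/2 multipliers, engaging only what the operand sizes need."""
    half = n // 2
    if a == 0 or b == 0:
        return 0                      # every sub-multiplier disabled
    a_hi, a_lo = a >> half, a & ((1 << half) - 1)
    if a_hi == 0:
        # a < 2**(N/2): the N/2 x N first multiplier handles it alone
        return a_lo * b
    b_hi, b_lo = b >> half, b & ((1 << half) - 1)
    # both halves of a in play: use all three multipliers
    p_first = a_hi * b                # N/2 x N first multiplier
    p_second = a_lo * b_hi            # N/2 x N/2 second multiplier
    p_third = a_lo * b_lo             # N/2 x N/2 third multiplier
    return (p_first << half) + (p_second << half) + p_third

# Agrees with direct multiplication for 8-bit operands:
assert composed_multiply(200, 173) == 200 * 173
```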

    SYSTEM AND METHOD FOR PERFORMING COMPUTATIONS FOR DEEP NEURAL NETWORKS

    Publication No.: US20210326686A1

    Publication Date: 2021-10-21

    Application No.: US16900845

    Filing Date: 2020-06-12

    Abstract: A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel in an input dimension of the array during a first processing period, and second input values are provided in parallel in the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. An adder coupled to a first set of the PE units generates a first sum of the results of the computations by the first set of PE units during the first processing period, and generates a second sum of the results of the computations during the second processing period. A first accumulator coupled to the adder stores the first sum, and shifts the first sum to a second accumulator prior to storing the second sum.
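    A behavioral sketch of one adder column, with the accumulator chain modeled as a list that shifts earlier sums down before each new sum is stored (illustrative only, not the claimed circuit):

```python
def pe_column_dot(stored_weights, inputs_per_period):
    """Model one set of PE units feeding an adder and an accumulator chain.

    Each processing period, every PE multiplies its stored weight by that
    period's input value, the adder sums the products, and prior sums shift
    to the next accumulator before the new sum is stored (index 0 = newest).
    """
    accumulators = []
    for period_inputs in inputs_per_period:
        s = sum(w * x for w, x in zip(stored_weights, period_inputs))
        accumulators.insert(0, s)   # earlier sums shift down the chain
    return accumulators
```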
