Patent search ap:("INTEL CORPORATION") AND inv:"Srinivasan Narayanamoorthy"

1.

Invention application
SYSTOLIC ARRAY ACCELERATOR SYSTEMS AND METHODS (status: under examination, published)

Publication (announcement) number: US20200272596A1

Publication (announcement) date: 2020-08-27

Application number: US16283795

Application date: 2019-02-24

Applicant: INTEL CORPORATION

Inventor: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More

IPC: G06F15/80

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.
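The apportioning described in this abstract can be illustrated in software as a blocked matrix product. The following is a minimal sketch only, not the patented circuitry: it assumes the "mathematical operation" is matrix multiplication, that the first array is P×Q and the second is Q×R with P and R divisible by N and Q divisible by M, and that each N×N sub-array is modelled by an ordinary small matrix product. The function names `matmul`, `tile`, and `tiled_matmul` are hypothetical and do not come from the patent.

```python
def matmul(a, b):
    """Reference product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def tile(mat, r0, c0, rows, cols):
    """Copy the rows x cols block of mat whose top-left corner is (r0, c0)."""
    return [row[c0:c0 + cols] for row in mat[r0:r0 + rows]]


def tiled_matmul(a, b, n, m):
    """Multiply a (P x Q) by b (Q x R) using N x M and M x N input tiles.

    Assumption: P and R are multiples of n, and Q is a multiple of m.
    Each (a_tile, b_tile) pair stands in for the work handed to one
    N x N sub-array; here the pairs are processed one after another.
    """
    p, q, r = len(a), len(b), len(b[0])
    if p % n or r % n or q % m:
        raise ValueError("matrix dimensions must be multiples of the tile sizes")
    out = [[0] * r for _ in range(p)]
    for i0 in range(0, p, n):            # tile row of the output
        for j0 in range(0, r, n):        # tile column of the output
            for k0 in range(0, q, m):    # shared (inner) dimension
                a_tile = tile(a, i0, k0, n, m)    # N x M first input array
                b_tile = tile(b, k0, j0, m, n)    # M x N second input array
                partial = matmul(a_tile, b_tile)  # N x N partial result
                for i in range(n):
                    for j in range(n):
                        out[i0 + i][j0 + j] += partial[i][j]
    return out
```

In hardware the tile pairs would presumably be dispatched to separate sub-arrays concurrently; the sequential loop here only shows that the decomposition reproduces the full product.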

2.

Granted invention patent
Accelerator for sparse-dense matrix multiplication (status: in force)

Publication (announcement) number: US11829440B2

Publication (announcement) date: 2023-11-28

Application number: US17229550

Application date: 2021-04-13

Applicant: Intel Corporation

Inventor: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik

IPC: G06F17/16, G06F7/544, G06F9/38, G06F9/30, G06N3/00

CPC classification number: G06F17/16, G06F7/5443, G06F9/3001, G06F9/3016, G06F9/30036, G06F9/30145, G06F9/383, G06F9/3887, G06N3/00

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
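The multiply-accumulate semantics stated in this abstract can be sketched in software as follows. This is an illustrative model only, not the instruction's actual encoding or microarchitecture: it assumes the sparse source matrix is supplied as a list of (row M, column K, value) triples, and the function name `sparse_dense_multiply_accumulate` is hypothetical.

```python
def sparse_dense_multiply_accumulate(sparse_entries, dense, out):
    """Accumulate sparse x dense into out, touching only non-zero elements.

    sparse_entries: iterable of (m, k, value) triples (assumed storage format).
    dense:          K x N list-of-lists dense source matrix.
    out:            M x N list-of-lists dense output matrix, updated in place.
    """
    for m, k, value in sparse_entries:
        if value == 0:
            continue  # skip any explicitly stored zero
        dense_row = dense[k]   # dense elements at row K, every column N
        out_row = out[m]       # output elements at row M, every column N
        for n, dense_value in enumerate(dense_row):
            # new output = previous output + (non-zero element * dense element)
            out_row[n] += value * dense_value
    return out
```

Because the loop visits only the stored non-zero elements, the amount of work scales with the number of non-zeros rather than with the full M×K size of the sparse matrix.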

3.

Granted invention patent
Systolic array accelerator systems and methods (status: in force)

Publication (announcement) number: US11003619B2

Publication (announcement) date: 2021-05-11

Application number: US16283795

Application date: 2019-02-24

Applicant: INTEL CORPORATION

Inventor: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More

IPC: G06F15/80, G06F17/16

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.

4.

Granted invention patent
Accelerator for sparse-dense matrix multiplication (status: in force)

Publication (announcement) number: US10984074B2

Publication (announcement) date: 2021-04-20

Application number: US16799586

Application date: 2020-02-24

Applicant: Intel Corporation

Inventor: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik

IPC: G06F17/16, G06F7/544, G06F9/38, G06F9/30, G06N3/00

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

5.

Granted invention patent
Accelerator for sparse-dense matrix multiplication (status: in force)

Publication (announcement) number: US10572568B2

Publication (announcement) date: 2020-02-25

Application number: US15938924

Application date: 2018-03-28

Applicant: Intel Corporation

Inventor: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik

IPC: G06F17/16, G06F7/544, G06F9/38, G06F9/30

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.
