Patent search ap:("Intel Corporation") AND inv:"Amir Khosrowshahi" Page 1

1.

Granted invention patent
Distributed matrix multiplication for neural networks [Active]

Publication (announcement) number: US10169296B2

Publication (announcement) date: 2019-01-01

Application number: US15395527

Application date: 2016-12-30

Applicant: Intel Corporation

Inventor: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi

IPC: G06F17/16, G06N3/08

Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
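
The partition, distribute, compute and exchange steps in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented method: the row-block partitioning, the ring-style exchange of B blocks, and the names `split_ranges` and `distributed_matmul` are all hypothetical choices made for illustration, and the "processing elements" are simulated sequentially in one process.

```python
def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) ranges of near-equal size."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def distributed_matmul(a, b, num_pe):
    """Compute a @ b by simulating `num_pe` processing elements (hypothetical scheme)."""
    n, k, m = len(a), len(b), len(b[0])
    row_ranges = split_ranges(n, num_pe)  # rows of A (and of the result) owned by each element
    k_ranges = split_ranges(k, num_pe)    # rows of B owned by each element

    # Partition the inputs based on the number of available processing elements.
    a_parts = [a[r0:r1] for r0, r1 in row_ranges]
    b_parts = [b[k0:k1] for k0, k1 in k_ranges]
    c_parts = [[[0] * m for _ in range(r1 - r0)] for r0, r1 in row_ranges]

    held = list(range(num_pe))  # held[pe] = index of the B block currently at element pe
    for _ in range(num_pe):
        # Partial matrix operation on every element (sequential here, parallel on hardware).
        for pe in range(num_pe):
            j = held[pe]
            k0 = k_ranges[j][0]
            for r, a_row in enumerate(a_parts[pe]):
                for kk, b_row in enumerate(b_parts[j]):
                    coeff = a_row[k0 + kk]
                    for c in range(m):
                        c_parts[pe][r][c] += coeff * b_row[c]
        # Transmit partial matrix data: pass each B block to the neighbouring element.
        held = held[1:] + held[:1]

    # The overall result is assembled from the per-element partial results.
    return [row for part in c_parts for row in part]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(distributed_matmul(A, B, num_pe=2))
```

After `num_pe` exchange steps every simulated element has seen every block of B exactly once, so each element holds its complete slice of the product.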

2.

Invention application
DYNAMIC MANAGEMENT OF NUMERICAL REPRESENTATION IN A DISTRIBUTED MATRIX PROCESSOR ARCHITECTURE [Pending, published]

Publication (announcement) number: US20170316307A1

Publication (announcement) date: 2017-11-02

Application number: US15143293

Application date: 2016-04-29

Applicant: Intel Corporation

Inventor: Urs Koster, William Howard Constable, Luke James Hornof, Carey Kevin Kloss, Amir Khosrowshahi, Scott Gray

IPC: G06N3/04, G06N3/08, G06F17/18

Abstract: A system receives and executes a sequence of tensor instructions, for example, instructions for performing a neural network computation. The system may be implemented as a multiprocessor architecture, for example, hardware for performing a neural network computation. A tensor instruction specifies a tensor computation receiving one or more input tensors for determining an output tensor. The system stores a decimal position associated with a plurality of values of a tensor. The system performs the tensor computation of a tensor instruction to determine a plurality of values of the output tensor. The system collects statistics describing the plurality of values of the output tensor and determines a decimal position for the plurality of values based on the collected statistics.
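
The idea of storing one decimal position per tensor and updating it from output statistics can be sketched as follows. This is a minimal illustration under assumptions that do not come from the abstract: values are held as 16-bit signed integers, the "decimal position" is modelled as a count of fractional bits, the only statistic collected is the maximum absolute value, and all function names are hypothetical.

```python
import math

WORD_BITS = 16  # assumed signed integer width; not specified in the abstract


def choose_decimal_position(values, word_bits=WORD_BITS):
    """Pick the number of fractional bits from a statistic (here: max magnitude)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return word_bits - 1
    int_bits = math.floor(math.log2(max_abs)) + 1  # bits needed left of the point
    return (word_bits - 1) - int_bits


def quantize(values, frac_bits, word_bits=WORD_BITS):
    """Convert floats to integers sharing one decimal position, clamping on overflow."""
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    scale = 2.0 ** frac_bits
    return [min(hi, max(lo, round(v * scale))) for v in values]


def dequantize(ints, frac_bits):
    """Recover float values from integers and their shared decimal position."""
    scale = 2.0 ** frac_bits
    return [i / scale for i in ints]


def multiply_tensors(x_ints, x_frac, y_ints, y_frac):
    """Elementwise product; the output decimal position is chosen from output statistics."""
    # Floats stand in for a wide accumulator here.
    out = [a * b for a, b in zip(dequantize(x_ints, x_frac), dequantize(y_ints, y_frac))]
    out_frac = choose_decimal_position(out)
    return quantize(out, out_frac), out_frac


if __name__ == "__main__":
    x, y = [1.5, -2.0], [2.0, 0.5]
    xf, yf = choose_decimal_position(x), choose_decimal_position(y)
    z_ints, zf = multiply_tensors(quantize(x, xf), xf, quantize(y, yf), yf)
    print(zf, dequantize(z_ints, zf))
```

A real system would likely gather statistics over many instructions and predict the position for later outputs; the sketch recomputes it from the current output only to keep the example short.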

3.

Invention application
DISTRIBUTED CONVOLUTION FOR NEURAL NETWORKS [Active]

Publication (announcement) number: US20220121954A1

Publication (announcement) date: 2022-04-21

Application number: US17564098

Application date: 2021-12-28

Applicant: Intel Corporation

Inventor: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi

IPC: G06N3/08, G06F17/16, G06F17/15, G06N3/063, G06N3/04

Abstract: In one embodiment, a matrix operation may be performed using a plurality of input matrices, wherein the matrix operation is associated with one or more convolution operations. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
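
One way to picture partitioned convolution with data passed between elements is a row-wise split with overlapping boundary ("halo") rows. The sketch below is an assumption made for illustration, not the scheme claimed in the patent: it uses a direct 2D "valid" cross-correlation, splits the output rows across simulated processing elements, and models the exchanged data as the `kh - 1` extra input rows each element needs from its neighbour. The function names are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Direct 2D 'valid' cross-correlation on list-of-lists inputs."""
    if not image:
        return []
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def distributed_conv2d(image, kernel, num_pe):
    """Split output rows across `num_pe` simulated elements and stitch the results."""
    kh = len(kernel)
    out_h = len(image) - kh + 1
    base, extra = divmod(max(out_h, 0), num_pe)
    result, start = [], 0
    for pe in range(num_pe):
        rows = base + (1 if pe < extra else 0)
        if rows == 0:
            continue  # more elements than output rows: this element stays idle
        # The slab holds the element's own rows plus kh - 1 halo rows that,
        # on real hardware, would be received from the neighbouring element.
        slab = image[start:start + rows + kh - 1]
        result.extend(conv2d_valid(slab, kernel))
        start += rows
    return result


if __name__ == "__main__":
    img = [[r * 4 + c for c in range(4)] for r in range(5)]
    ker = [[1, 0], [0, -1]]
    print(distributed_conv2d(img, ker, num_pe=3) == conv2d_valid(img, ker))
```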

4.

Granted invention patent
Optical analog matrix multiplier for optical neural networks [Active]

Publication (announcement) number: US11218223B2

Publication (announcement) date: 2022-01-04

Application number: US16950819

Application date: 2020-11-17

Applicant: Intel Corporation

Inventor: Wenhua Lin, Amir Khosrowshahi, Casimir Wierzynski

IPC: H04B10/40, H04B10/58, G06N3/04, G02F1/35, H04B10/516, H04B10/61

Abstract: Embodiments of the present disclosure are directed toward techniques and apparatus comprising at least one layer of an ONN that includes an optical matrix multiplier provided in a semiconductor substrate to receive a plurality of optical signal inputs and to linearly transform the plurality of optical signal inputs into a plurality of optical signal outputs. The optical matrix multiplier comprises one or more 2×2 unitary optical matrices optically interconnected to implement a singular value decomposition (SVD) of a matrix, and a nonlinear optical device coupled with the optical matrix multiplier in the semiconductor substrate, to receive the optical signal outputs and to provide an optical output that is generated in a nonlinear manner in response to the optical signal outputs of the optical matrix multiplier reaching saturation or attenuation. Additional embodiments may be described and claimed.
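
For readers unfamiliar with the decomposition named in this abstract, the standard linear-algebra statement is given below. The symbols and the particular 2×2 parameterization are assumptions chosen for illustration; the abstract does not specify them.

```latex
% Singular value decomposition of the weight matrix M applied to optical inputs x:
\[
  y = M x, \qquad M = U \, \Sigma \, V^{\dagger},
\]
\[
  U^{\dagger} U = I, \qquad V^{\dagger} V = I, \qquad
  \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_n), \quad \sigma_i \ge 0 .
\]
% One common (assumed) parameterization of a 2x2 unitary building block,
% with mixing angle \theta and phase \phi:
\[
  T(\theta, \phi) =
  \begin{pmatrix}
    e^{i\phi} \cos\theta & -\sin\theta \\
    e^{i\phi} \sin\theta & \cos\theta
  \end{pmatrix},
  \qquad T^{\dagger} T = I .
\]
```

It is a known result that any N×N unitary can be factored into at most N(N-1)/2 such two-mode blocks together with diagonal phase factors, which is why U and V can each be built from interconnected 2×2 units while Σ is realized as per-channel scaling.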

5.

Invention application
HIGH EFFICIENCY OPTICAL NEURAL NETWORK [Active]

Publication (announcement) number: US20210133547A1

Publication (announcement) date: 2021-05-06

Application number: US16950821

Application date: 2020-11-17

Applicant: Intel Corporation

Inventor: Wenhua Lin, Amir Khosrowshahi, Casimir Wierzynski

IPC: G06N3/067, G06F17/16

Abstract: Techniques and configurations for an optical neural network (ONN) with layers of optical matrix multipliers and an optical nonlinearity function are described herein. The techniques provide for programmable matrix multipliers, allowing for a partitioned use of a part of a matrix as needed, for computation efficiency. The techniques provide for multiple pass-through the same optical matrix die on the same photonic integrated circuit (PIC) chip and for connecting multiple layers of the ONN and running through them in sequence. The techniques further provide for scaling the ONN to different sizes. Additional embodiments may be described and claimed.

6.

Invention application
DISTRIBUTED CONVOLUTION FOR NEURAL NETWORKS [Pending, published]

Publication (announcement) number: US20180189652A1

Publication (announcement) date: 2018-07-05

Application number: US15395675

Application date: 2016-12-30

Applicant: Intel Corporation

Inventor: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Carey K. Kloss, Amir Khosrowshahi

IPC: G06N3/08, G06F17/16, G06N3/04

CPC classification number: G06N3/084, G06F17/153, G06F17/16, G06N3/0454, G06N3/063

Abstract: In one embodiment, a matrix operation may be performed using a plurality of input matrices, wherein the matrix operation is associated with one or more convolution operations. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.

7.

Invention application
PIPELINED CONVOLUTIONAL OPERATIONS FOR PROCESSING CLUSTERS [Active]

Publication (announcement) number: US20170097884A1

Publication (announcement) date: 2017-04-06

Application number: US14874784

Application date: 2015-10-05

Applicant: Intel Corporation

Inventor: Tony Werner, Aravind Kalaiah, Andrew Yang, Carey Kloss, Horace Lau, Naveen Gandham Rao, Amir Khosrowshahi

IPC: G06F12/02, G06F9/30

CPC classification number: G06F12/023, G06F15/76, G06F2212/251, G06T1/20

Abstract: Described herein are one or more integrated circuits (ICs) comprising controller circuitry to receive a command to execute an operation for data inputs stored in an external memory or a local memory, and convert the operation into a set of matrix operations to operate on sub-portions of the data inputs. The IC(s) further comprise at least one processing circuitry to execute the set of matrix operations, the processing circuitry to include ALUs, a local memory external to the ALUs and accessible by the ALUs, and processing control circuitry to create at least one matrix operand in the local memory (from the data inputs of the operation) comprising at least one of a scalar, a vector, or a 2D matrix, and provide memory handles corresponding to each of the matrix operands to one of the ALUs to access the respective matrix operands when executing a matrix operation.

8.

Invention application
OPTICAL ANALOG MATRIX MULTIPLIER FOR OPTICAL NEURAL NETWORKS [Active]

Publication (announcement) number: US20210135764A1

Publication (announcement) date: 2021-05-06

Application number: US16950819

Application date: 2020-11-17

Applicant: Intel Corporation

Inventor: Wenhua Lin, Amir Khosrowshahi, Casimir Wierzynski

IPC: H04B10/58, G06N3/04, H04B10/40, H04B10/516, H04B10/61, G02F1/35

Abstract: Embodiments of the present disclosure are directed toward techniques and apparatus comprising at least one layer of an ONN that includes an optical matrix multiplier provided in a semiconductor substrate to receive a plurality of optical signal inputs and to linearly transform the plurality of optical signal inputs into a plurality of optical signal outputs. The optical matrix multiplier comprises one or more 2×2 unitary optical matrices optically interconnected to implement a singular value decomposition (SVD) of a matrix, and a nonlinear optical device coupled with the optical matrix multiplier in the semiconductor substrate, to receive the optical signal outputs and to provide an optical output that is generated in a nonlinear manner in response to the optical signal outputs of the optical matrix multiplier reaching saturation or attenuation. Additional embodiments may be described and claimed.

9.

Granted invention patent
Dimension shuffling using matrix processors [Active]

Publication (announcement) number: US10949496B2

Publication (announcement) date: 2021-03-16

Application number: US15395906

Application date: 2016-12-30

Applicant: Intel Corporation

Inventor: Vijay Anand R. Korthikanti, Aravind Kalaiah, Tony L. Werner, Amir Khosrowshahi

IPC: G06F17/16, G06F9/30, G06F7/78

Abstract: In one embodiment, a matrix operation may be performed to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory. Data associated with the input matrix may be accessed using one or more strided memory operations, wherein the one or more strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval. The data accessed using the one or more strided memory operations may be stored in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.
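
Reordering dimensions with strided memory operations can be sketched in a few lines. The example below is an assumption-laden illustration rather than the patented mechanism: memory is modelled as a flat row-major list instead of two-dimensional hardware memory, and `strided_read`, `row_major_strides` and `shuffle_dims` are hypothetical helper names.

```python
from itertools import product


def strided_read(memory, start, stride, count):
    """Read `count` elements beginning at `start`, separated by a fixed `stride`."""
    return [memory[start + i * stride] for i in range(count)]


def row_major_strides(shape):
    """Element strides of each dimension for a row-major layout."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return strides[::-1]


def shuffle_dims(memory, shape, order):
    """Reorder the dimensions of a row-major tensor using only strided reads.

    `order[i]` names the source dimension that becomes output dimension i.
    Returns the reordered flat data and the new shape.
    """
    src_strides = row_major_strides(shape)
    out_shape = [shape[d] for d in order]
    out_strides = [src_strides[d] for d in order]
    result = []
    # One strided read per position of the outer output dimensions.
    for outer in product(*(range(n) for n in out_shape[:-1])):
        start = sum(i * s for i, s in zip(outer, out_strides))
        result.extend(strided_read(memory, start, out_strides[-1], out_shape[-1]))
    return result, out_shape


if __name__ == "__main__":
    data = list(range(6))                    # a 2 x 3 matrix stored row by row
    print(shuffle_dims(data, [2, 3], [1, 0]))  # transpose to 3 x 2
```

For a 2D input with `order=[1, 0]`, each strided read walks down one source column (stride equal to the row length) and writes it out as a row, which is the transpose case mentioned in the abstract.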

10.

Invention application
DISTRIBUTED MATRIX MULTIPLICATION FOR NEURAL NETWORKS [Pending, published]

Publication (announcement) number: US20190138569A1

Publication (announcement) date: 2019-05-09

Application number: US16236955

Application date: 2018-12-31

Applicant: Intel Corporation

Inventor: Vijay Anand R. Korthikanti, Carey K. Kloss, Aravind Kalaiah, Amir Khosrowshahi

IPC: G06F17/16, G06N3/08

Abstract: In one embodiment, a matrix operation associated with a plurality of input matrices may be performed. The plurality of input matrices may be partitioned into a plurality of input partitions, wherein the plurality of input matrices is partitioned based on a number of available processing elements. The plurality of input partitions may be distributed among a plurality of processing elements, wherein each input partition is distributed to a particular processing element of the plurality of processing elements. A plurality of partial matrix operations may be performed using the plurality of processing elements, and partial matrix data may be transmitted between the plurality of processing elements while performing the plurality of partial matrix operations. A result of the matrix operation may be determined based on the plurality of partial matrix operations.
