-
Publication No.: US20170097884A1
Publication Date: 2017-04-06
Application No.: US14874784
Filing Date: 2015-10-05
Applicant: Intel Corporation
Inventor: Tony Werner , Aravind Kalaiah , Andrew Yang , Carey Kloss , Horace Lau , Naveen Gandham Rao , Amir Khosrowshahi
CPC classification number: G06F12/023 , G06F15/76 , G06F2212/251 , G06T1/20
Abstract: Described herein are one or more integrated circuits (ICs) comprising controller circuitry to receive a command to execute an operation for data inputs stored in an external memory or a local memory, and convert the operation into a set of matrix operations to operate on sub-portions of the data inputs. The IC(s) further comprise at least one processing circuitry to execute the set of matrix operations, the processing circuitry to include ALUs, a local memory external to the ALUs and accessible by the ALUs, and processing control circuitry to create at least one matrix operand in the local memory (from the data inputs of the operation) comprising at least one of a scalar, a vector, or a 2D matrix, and provide memory handles corresponding to each of the matrix operands to one of the ALUs to access the respective matrix operands when executing a matrix operation.
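The abstract's conversion of one large operation into a set of smaller matrix operations over sub-portions of the inputs is, in essence, tiling. The patent does not publish an algorithm, so the following is only a minimal Python/NumPy sketch of the idea under assumed names (`tiled_matmul`, `tile` are illustrative, not from the source):

```python
import numpy as np

def tiled_matmul(a, b, tile=2):
    """Hypothetical sketch: split one large matrix multiply into a set of
    smaller matrix operations over sub-portions (tiles) of the data inputs,
    analogous to what the controller circuitry is described as doing."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    out = np.zeros((m, n), dtype=a.dtype)
    # Each (i, j, p) triple below is one small matrix operation whose
    # operands are tiles carved out of the original inputs.
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                out[i:i+tile, j:j+tile] += (
                    a[i:i+tile, p:p+tile] @ b[p:p+tile, j:j+tile]
                )
    return out
```

Each tile here plays the role of a scalar/vector/2-D-matrix operand that the processing control circuitry would materialize in local memory and expose to an ALU via a handle.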
-
Publication No.: US09886418B2
Publication Date: 2018-02-06
Application No.: US14697728
Filing Date: 2015-04-28
Applicant: Intel Corporation
Inventor: Andrew Yang , Carey Kloss , Prashant Arora , Tony Werner , Naveen Gandham Rao , Amir Khosrowshahi
CPC classification number: G06F17/16 , G06F12/023 , G06F2212/251 , G06N3/08
Abstract: Described herein are methods, systems, and apparatuses to utilize a matrix operation by accessing each of the operation's matrix operands via a respective single memory handle. This use of a single memory handle for each matrix operand eliminates significant overhead in memory allocation, data tracking, and subroutine complexity present in prior art solutions. The result of the matrix operation can also be accessible via a single memory handle identifying the matrix elements of the result.
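The single-memory-handle-per-operand idea can be illustrated with a small registry: callers hold one opaque handle per matrix, never element pointers, so allocation and tracking live in one place. This is only an illustrative Python sketch; the class and method names (`OperandPool`, `create`, `matmul`, `read`) are assumptions, not taken from the patent:

```python
import numpy as np

class OperandPool:
    """Hypothetical sketch of the single-memory-handle scheme: each matrix
    operand is registered once and thereafter referenced only by an
    opaque integer handle."""
    def __init__(self):
        self._store = {}
        self._next = 0

    def create(self, data):
        handle = self._next
        self._next += 1
        self._store[handle] = np.asarray(data)
        return handle  # the single handle for the whole operand

    def matmul(self, h_a, h_b):
        # Operation code tracks only handles, not per-element storage.
        result = self._store[h_a] @ self._store[h_b]
        return self.create(result)  # the result is also handle-addressed

    def read(self, handle):
        return self._store[handle]
```

Note how the result of `matmul` is itself returned as a single handle, mirroring the abstract's statement that the result is accessible the same way as the inputs.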
-
Publication No.: US10761757B2
Publication Date: 2020-09-01
Application No.: US16024812
Filing Date: 2018-06-30
Applicant: INTEL CORPORATION
Inventor: Krishnakumar Nair , Andrew Yang , Michael Rotzin , Nitin Garegrat , Tom Schebye , Tony Werner
Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
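The method described here walks a tensor block by block, converts each block's elements to a second numeric representation, and writes the destination block where it preserves the source's structural arrangement. A minimal Python/NumPy sketch, assuming float32 and float16 as the two representations and row-major block order (the patent does not fix these specifics):

```python
import numpy as np

def convert_blocks(src, block=2, dst_dtype=np.float16):
    """Hypothetical sketch: convert a tensor's blocks from a first numeric
    representation (here float32) to a second (here float16), visiting
    blocks in a fixed order and storing each destination block at the
    position matching the source's structural arrangement."""
    assert src.dtype == np.float32
    dst = np.empty_like(src, dtype=dst_dtype)
    rows, cols = src.shape
    for i in range(0, rows, block):          # assumed order: top-to-bottom,
        for j in range(0, cols, block):      # left-to-right over blocks
            dst[i:i+block, j:j+block] = (
                src[i:i+block, j:j+block].astype(dst_dtype)
            )
    return dst
```

Because each destination block lands at the same (i, j) offsets as its source block, the destination tensor stays coherent with the predefined arrangement of the source blocks, as the abstract requires.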
-
Publication No.: US09886377B2
Publication Date: 2018-02-06
Application No.: US14874784
Filing Date: 2015-10-05
Applicant: Intel Corporation
Inventor: Tony Werner , Aravind Kalaiah , Andrew Yang , Carey Kloss , Horace Lau , Naveen Gandham Rao , Amir Khosrowshahi
CPC classification number: G06F12/023 , G06F15/76 , G06F2212/251 , G06T1/20
Abstract: Described herein are one or more integrated circuits (ICs) comprising controller circuitry to receive a command to execute an operation for data inputs stored in an external memory or a local memory, and convert the operation into a set of matrix operations to operate on sub-portions of the data inputs. The IC(s) further comprise at least one processing circuitry to execute the set of matrix operations, the processing circuitry to include ALUs, a local memory external to the ALUs and accessible by the ALUs, and processing control circuitry to create at least one matrix operand in the local memory (from the data inputs of the operation) comprising at least one of a scalar, a vector, or a 2D matrix, and provide memory handles corresponding to each of the matrix operands to one of the ALUs to access the respective matrix operands when executing a matrix operation.
-
Publication No.: US20190392297A1
Publication Date: 2019-12-26
Application No.: US16474029
Filing Date: 2017-12-28
Applicant: Intel Corporation
Inventor: Horace H. Lau , Prashant Arora , Olivia K. Wu , Tony Werner , Carey K. Kloss , Amir Khosrowshahi , Andrew Yang , Aravind Kalaiah , Vijay Anand R. Korthikanti
Abstract: A network of matrix processing units (MPUs) is provided on a device, where each MPU is connected to at least one other MPU in the network, and each MPU is to perform matrix multiplication operations. Computer memory stores tensor data and a master control central processing unit (MCC) is provided on the device to receive an instruction from a host device, where the instruction includes one or more tensor operands based on the tensor data. The MCC invokes a set of operations on one or more of the MPUs based on the instruction, where the set of operations includes operations on the tensor operands. A result is generated from the set of operations, the result embodied as a tensor value.
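The control flow in this abstract — a host sends an instruction with tensor operands, the MCC invokes operations on one or more MPUs, and a tensor result comes back — can be sketched as a simple dispatcher. This is an illustrative Python model only; the class names and the dispatch policy are assumptions, not from the patent:

```python
import numpy as np

class MatrixProcessingUnit:
    """Hypothetical stand-in for one MPU, which performs matrix
    multiplication operations."""
    def matmul(self, a, b):
        return a @ b

class MasterControl:
    """Hypothetical sketch of the master control CPU (MCC): it receives
    an instruction naming tensor operands, invokes a set of operations on
    the MPUs, and returns the result as a tensor value."""
    def __init__(self, mpus):
        self.mpus = mpus  # the on-device network of connected MPUs

    def execute(self, op, *operands):
        if op == "matmul":
            # Trivial dispatch policy for the sketch: route to the first
            # MPU; real hardware would partition work across the network.
            return self.mpus[0].matmul(*operands)
        raise ValueError(f"unsupported op: {op}")
```

A real device would split large tensor operands across the MPU network rather than routing whole matrices to a single unit; the sketch keeps only the host-to-MCC-to-MPU shape of the flow.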
-
Publication No.: US20190042094A1
Publication Date: 2019-02-07
Application No.: US16024812
Filing Date: 2018-06-30
Applicant: INTEL CORPORATION
Inventor: Krishnakumar Nair , Andrew Yang , Michael Rotzin , Nitin Garegrat , Tom Schebye , Tony Werner
Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
-
Publication No.: US20170060811A1
Publication Date: 2017-03-02
Application No.: US14697728
Filing Date: 2015-04-28
Applicant: Intel Corporation
Inventor: Andrew Yang , Carey Kloss , Prashant Arora , Tony Werner , Naveen Gandham Rao , Amir Khosrowshahi
CPC classification number: G06F17/16 , G06F12/023 , G06F2212/251 , G06N3/08
Abstract: Described herein are methods, systems, and apparatuses to utilize a matrix operation by accessing each of the operation's matrix operands via a respective single memory handle. This use of a single memory handle for each matrix operand eliminates significant overhead in memory allocation, data tracking, and subroutine complexity present in prior art solutions. The result of the matrix operation can also be accessible via a single memory handle identifying the matrix elements of the result.
-