Patent search ap:("INTEL CORPORATION") AND inv:"Nitin Garegrat" Page 1

1.

Patent application
APPARATUS AND METHOD FOR COHERENT, ACCELERATED CONVERSION BETWEEN DATA REPRESENTATIONS (Status: pending, published)

Publication No.: US20190042094A1

Publication date: 2019-02-07

Application No.: US16024812

Filing date: 2018-06-30

Applicant: INTEL CORPORATION

Inventor: Krishnakumar Nair, Andrew Yang, Michael Rotzin, Nitin Garegrat, Tom Schebye, Tony Werner

IPC: G06F3/06, G06F9/30

Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
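
Illustrative note (not part of the patent record): the fetch, convert, and store steps in the abstract can be sketched in software roughly as follows. This is a minimal sketch under stated assumptions, not the claimed hardware method. The abstract does not name the numeric representations, so the sketch assumes 32-bit floats as the source and IEEE half precision as the destination. The names `convert_block`, `convert_tensor`, and `order` are hypothetical.

```python
import struct


def convert_block(block):
    # Assumption: the "second numeric representation" is IEEE half precision.
    # Packing with format "e" rounds each value to the nearest half-precision float.
    return [struct.unpack("<e", struct.pack("<e", x))[0] for x in block]


def convert_tensor(source_blocks, order=None):
    # `order` is a hypothetical stand-in for the "specified order" of conversion.
    if order is None:
        order = range(len(source_blocks))
    # Each destination block is stored at the index of its source block, so the
    # block arrangement of the source tensor is preserved whatever the order.
    dest_blocks = [None] * len(source_blocks)
    for idx in order:
        dest_blocks[idx] = convert_block(source_blocks[idx])
    return dest_blocks


source = [[1.0, 2.5], [0.1, -3.0]]
dest = convert_tensor(source, order=[1, 0])
```

Even though block 1 is converted before block 0 here, `dest[0]` still corresponds to `source[0]`, which is the coherency property the abstract describes.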

2.

Granted patent
Apparatus and method for a masked multiply instruction to support neural network pruning operations (Status: granted, in force)

Publication No.: US10929503B2

Publication date: 2021-02-23

Application No.: US16230814

Filing date: 2018-12-21

Applicant: Intel Corporation

Inventor: Omid Azizi, Chen Koren, Nitin Garegrat

IPC: G06F17/16, G06N3/02, G06F9/30

Abstract: An apparatus and method for a masked multiply instruction to support neural network pruning operations. For example, one embodiment of a processor comprises: a decoder to decode a matrix multiplication with masking (GEMM) instruction identifying a destination matrix register to store a result, and source registers storing an A-matrix, a B-matrix, and a matrix mask; execution circuitry to execute the GEMM instruction, the execution circuitry to multiply a plurality of B-matrix elements with a plurality of A-matrix elements, each of the B-matrix elements associated with a mask value in the matrix mask, wherein if the mask value is set to a first value, then the execution circuitry is to multiply the B-matrix element with one or more of the A-matrix elements to generate a first partial result, and if the mask value is set to a second value, then the execution circuitry is to multiply an alternate B-matrix element with one or more of the A-matrix elements to generate a second partial result.
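
Illustrative note (not part of the patent record): the mask-controlled element selection in the abstract can be sketched in software roughly as follows. This is a minimal sketch under stated assumptions, not the claimed instruction. The abstract does not say where the alternate B-matrix element comes from, so the sketch assumes a caller-supplied matrix `b_alt`, and it assumes that a mask value of 1 is the "first value". The name `masked_matmul` is hypothetical.

```python
def masked_matmul(a, b, b_alt, mask):
    # Computes C = A x B', where B'[k][j] is b[k][j] when mask[k][j] == 1
    # (assumed "first value") and b_alt[k][j] otherwise (assumed source of
    # the "alternate B-matrix element").
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for k in range(inner):
        for j in range(cols):
            b_elem = b[k][j] if mask[k][j] == 1 else b_alt[k][j]
            for i in range(rows):
                # Accumulate the partial result for output element (i, j).
                c[i][j] += a[i][k] * b_elem
    return c


a = [[1, 2], [3, 4]]
b = [[5, 0], [0, 6]]
b_alt = [[0, 7], [8, 0]]
mask = [[1, 0], [0, 1]]
result = masked_matmul(a, b, b_alt, mask)
```

With this mask the effective B-matrix is `[[5, 7], [8, 6]]`, so `result` equals the ordinary product of `a` with that matrix.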

3.

Granted patent
Apparatus and method for coherent, accelerated conversion between data representations (Status: granted, in force)

Publication No.: US10761757B2

Publication date: 2020-09-01

Application No.: US16024812

Filing date: 2018-06-30

Applicant: INTEL CORPORATION

Inventor: Krishnakumar Nair, Andrew Yang, Michael Rotzin, Nitin Garegrat, Tom Schebye, Tony Werner

IPC: G06F3/06, G06F9/30, G06N3/08

Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the sets of one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
