-
Publication Number: US20190042242A1
Publication Date: 2019-02-07
Application Number: US15940774
Filing Date: 2018-03-29
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
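To make the arithmetic concrete, here is a minimal Python/NumPy sketch of one SIMD lane of the asymmetric FMA the abstract describes: as many second-source elements as fit in the lane are multiplied by corresponding first-source elements, and the products are accumulated into the destination. The function name, the software loop, and the 8-bit/2-bit width choice are illustrative assumptions, not the instruction's actual encoding or microarchitecture.

```python
import numpy as np

def asymmetric_fma(dst, src1, src2, lane_width=32, w2=2):
    """Software model of one asymmetric FMA lane (illustrative only).

    Per SIMD lane, as many src2 elements as fit in the lane
    (lane_width // w2) are multiplied by the corresponding src1
    elements, and the products are accumulated with the previous
    contents of the destination.
    """
    elems = lane_width // w2          # e.g. 32-bit lane / 2-bit elements = 16
    acc = int(dst)
    for i in range(elems):
        acc += int(src1[i]) * int(src2[i])
    return acc

# Example: 8-bit first source, 2-bit (signed) second source, 32-bit lane
rng = np.random.default_rng(0)
src1 = rng.integers(-128, 128, size=16, dtype=np.int8)   # 8-bit elements
src2 = rng.integers(-2, 2, size=16, dtype=np.int8)       # 2-bit signed range
print(asymmetric_fma(0, src1, src2))
```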
-
Publication Number: US20240160931A1
Publication Date: 2024-05-16
Application Number: US18532795
Filing Date: 2023-12-07
Applicant: Intel Corporation
Inventor: Abhisek KUNDU , Naveen MELLEMPUDI , Dheevatsa MUDIGERE , Dipankar DAS
CPC classification number: G06N3/08 , G06F9/46 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06N5/04 , G06T15/005 , G06T17/20
Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
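As an illustration of the flow the abstract describes, the Python/NumPy sketch below determines a per-layer scale factor, converts an FP32 tensor to an 8-bit datatype, and generates an output tensor from the converted data and the scale factors. The symmetric max-abs scaling rule and the function names are assumptions chosen for clarity; the claims do not prescribe this particular formula.

```python
import numpy as np

def quantize_per_layer(tensor_fp32):
    """Determine a per-layer scale factor and convert FP32 data to int8.

    The scale maps the layer's maximum absolute value onto the int8
    range; this symmetric scheme is one common choice, used here only
    for illustration.
    """
    scale = np.max(np.abs(tensor_fp32)) / 127.0
    q = np.clip(np.round(tensor_fp32 / scale), -128, 127).astype(np.int8)
    return q, scale

def int8_matmul(x_q, w_q, x_scale, w_scale):
    """Integer matmul, then rescale with the per-layer scale factors."""
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)    # wide accumulation
    return acc.astype(np.float32) * (x_scale * w_scale)  # output tensor

# Example layer: quantize activations and weights, run the int8 matmul
x = np.random.randn(4, 8).astype(np.float32)
w = np.random.randn(8, 16).astype(np.float32)
x_q, sx = quantize_per_layer(x)
w_q, sw = quantize_per_layer(w)
y = int8_matmul(x_q, w_q, sx, sw)
```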
-
Publication Number: US20230087364A1
Publication Date: 2023-03-23
Application Number: US18060414
Filing Date: 2022-11-30
Applicant: Intel Corporation
Inventor: Abhisek KUNDU , Naveen MELLEMPUDI , Dheevatsa MUDIGERE , Dipankar DAS
Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
-
Publication Number: US20240126544A1
Publication Date: 2024-04-18
Application Number: US18399578
Filing Date: 2023-12-28
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
CPC classification number: G06F9/30014 , G06F7/483 , G06F7/5443 , G06F9/30036 , G06F9/30145 , G06F9/3802 , G06F9/382 , G06F9/384 , G06F9/3887 , G06N3/063 , G06F9/30065 , G06F2207/382
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
-
Publication Number: US20220214877A1
Publication Date: 2022-07-07
Application Number: US17704690
Filing Date: 2022-03-25
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
-
Publication Number: US20200257527A1
Publication Date: 2020-08-13
Application Number: US16735381
Filing Date: 2020-01-06
Applicant: Intel Corporation
Inventor: Dipankar DAS , Naveen K. MELLEMPUDI , Mrinmay DUTTA , Arun KUMAR , Dheevatsa MUDIGERE , Abhisek KUNDU
Abstract: Disclosed embodiments relate to instructions for fused multiply-add (FMA) operations with variable-precision inputs. In one example, a processor to execute an asymmetric FMA instruction includes fetch circuitry to fetch an FMA instruction having fields to specify an opcode, a destination, and first and second source vectors having first and second widths, respectively, decode circuitry to decode the fetched FMA instruction, and a single instruction multiple data (SIMD) execution circuit to process as many elements of the second source vector as fit into an SIMD lane width by multiplying each element by a corresponding element of the first source vector, and accumulating a resulting product with previous contents of the destination, wherein the SIMD lane width is one of 16 bits, 32 bits, and 64 bits, the first width is one of 4 bits and 8 bits, and the second width is one of 1 bit, 2 bits, and 4 bits.
-
Publication Number: US20180314940A1
Publication Date: 2018-11-01
Application Number: US15869515
Filing Date: 2018-01-12
Applicant: Intel Corporation
Inventor: Abhisek KUNDU , Naveen MELLEMPUDI , Dheevatsa MUDIGERE , Dipankar DAS
Abstract: One embodiment provides for a computing device comprising a parallel processor compute unit to perform a set of parallel integer compute operations; a ternarization unit including a weight ternarization circuit and an activation quantization circuit; wherein the weight ternarization circuit is to convert a weight tensor from a floating-point representation to a ternary representation including a ternary weight and a scale factor; wherein the activation quantization circuit is to convert an activation tensor from a floating-point representation to an integer representation; and wherein the parallel processor compute unit includes one or more circuits to perform the set of parallel integer compute operations on the ternary representation of the weight tensor and the integer representation of the activation tensor.
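For intuition, the following Python/NumPy sketch mirrors the two conversion steps in the abstract: ternarizing a weight tensor into a {-1, 0, +1} tensor plus a scale factor, quantizing an activation tensor to integers, and then performing the compute as integer operations. The threshold rule and scale formula are borrowed from common ternary-weight-network practice and are assumptions for illustration, not the circuit's specified behavior.

```python
import numpy as np

def ternarize_weights(w_fp32, threshold_ratio=0.7):
    """Convert FP32 weights to a ternary tensor {-1, 0, +1} plus a scale factor.

    The threshold and scale choices are illustrative assumptions;
    the abstract does not fix a specific formula.
    """
    delta = threshold_ratio * np.mean(np.abs(w_fp32))
    w_t = np.zeros_like(w_fp32, dtype=np.int8)
    w_t[w_fp32 > delta] = 1
    w_t[w_fp32 < -delta] = -1
    mask = w_t != 0
    scale = np.mean(np.abs(w_fp32[mask])) if mask.any() else 0.0
    return w_t, scale

def quantize_activations(a_fp32):
    """Convert FP32 activations to int8 with a per-tensor scale."""
    a_scale = np.max(np.abs(a_fp32)) / 127.0
    a_q = np.clip(np.round(a_fp32 / a_scale), -128, 127).astype(np.int8)
    return a_q, a_scale

# Integer compute on ternary weights and quantized activations
a = np.random.randn(2, 8).astype(np.float32)
w = np.random.randn(8, 4).astype(np.float32)
w_t, w_scale = ternarize_weights(w)
a_q, a_scale = quantize_activations(a)
y = (a_q.astype(np.int32) @ w_t.astype(np.int32)).astype(np.float32) * (w_scale * a_scale)
```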