-
Publication No.: US11640537B2
Publication Date: 2023-05-02
Application No.: US16378107
Filing Date: 2019-04-08
Applicant: Intel Corporation
Inventor: Bharat Daga , Krishnakumar Nair , Pradeep Janedula , Aravind Babu Srinivasan , Bijoy Pazhanimala , Ambili Vengallur
IPC: G06N3/10
Abstract: An apparatus to facilitate execution of non-linear function operations is disclosed. The apparatus comprises accelerator circuitry including a compute grid having a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations, and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data.
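The abstract describes processing elements that feed stored computation results into piecewise linear approximations of non-linear functions. Below is a minimal Python sketch of the PWL technique itself: precompute slope/intercept pairs per segment, then evaluate y = m*x + b on the segment containing each input. The segment count, input range, and the choice of sigmoid are illustrative assumptions, not details from the patent.

```python
# Minimal sketch of piecewise linear (PWL) approximation of a
# non-linear activation function. Segment count, range, and the
# sigmoid target are assumptions for illustration.
import numpy as np

def build_pwl_table(fn, lo, hi, segments):
    """Precompute slope/intercept pairs for each linear segment."""
    xs = np.linspace(lo, hi, segments + 1)
    ys = fn(xs)
    slopes = (ys[1:] - ys[:-1]) / (xs[1:] - xs[:-1])
    intercepts = ys[:-1] - slopes * xs[:-1]
    return xs, slopes, intercepts

def pwl_eval(x, xs, slopes, intercepts):
    """Find each input's segment, then apply y = m*x + b."""
    idx = np.clip(np.searchsorted(xs, x) - 1, 0, len(slopes) - 1)
    return slopes[idx] * x + intercepts[idx]

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
xs, m, b = build_pwl_table(sigmoid, -8.0, 8.0, 16)
x = np.array([-3.7, 0.0, 2.5])
print(pwl_eval(x, xs, m, b))  # close to sigmoid(x)
```

In hardware, the slope/intercept table would typically live in a small lookup memory indexed by the high-order bits of the input, which is what makes PWL attractive for accelerator compute grids.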
-
Publication No.: US12254061B2
Publication Date: 2025-03-18
Application No.: US17256195
Filing Date: 2018-09-27
Applicant: Intel Corporation
Inventor: Maciej Urbanski , Brian J. Hickmann , Michael Rotzin , Krishnakumar Nair , Andrew Yang , Brian S. Morris , Dennis Bradford
Abstract: Methods and apparatuses relating to performing vector multiplication are described, as are hardware accelerators to perform vector multiplication. In one embodiment, a combined fixed-point and floating-point vector multiplication circuit includes at least one switch to change the circuit between a first mode and a second mode. In the first mode, each multiplier of a set of multipliers is to multiply mantissas from a same element position of a first floating-point vector and a second floating-point vector to produce a corresponding product, shift the corresponding products with a set of shift registers based on a maximum exponent of exponents for the corresponding products determined by a maximum exponent determiner to produce shifted products, perform a numeric conversion operation on the shifted products with a set of numeric conversion circuits based on sign bits from the same element position of the first floating-point vector and the second floating-point vector to produce signed representations of the shifted products, add the signed representations of the shifted products with a set of adders to produce a single product, and normalize the single product with a normalization circuit based on the maximum exponent into a single floating-point resultant. In the second mode, each multiplier of the set of multipliers is to multiply values from a same element position of a first integer vector and a second integer vector to produce a corresponding product, and add each corresponding product with the set of adders to produce a single integer resultant.
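As a rough software analogue of the first (floating-point) mode, the sketch below multiplies integer mantissas element-wise, right-shifts each product to align it with the maximum product exponent, applies the sign, accumulates with plain addition, and rescales at the end. The 10-bit mantissa width and the use of Python's frexp decomposition are assumptions for illustration; the circuit's actual bit widths and rounding behavior are not specified here.

```python
# Software sketch of the floating-point mode: mantissa multiply,
# align to the maximum exponent, sign, add, normalize.
# MANT_BITS = 10 is an illustrative assumption.
import math

MANT_BITS = 10

def decompose(v):
    """Split a float into sign, integer mantissa, and exponent."""
    m, e = math.frexp(v)                     # v = m * 2**e, 0.5 <= |m| < 1
    mant = round(abs(m) * (1 << MANT_BITS))  # fixed-point mantissa
    return (1 if v < 0 else 0), mant, e

def fp_dot(a_vec, b_vec):
    signs, prods, exps = [], [], []
    for a, b in zip(a_vec, b_vec):
        sa, ma, ea = decompose(a)
        sb, mb, eb = decompose(b)
        signs.append(sa ^ sb)
        prods.append(ma * mb)                # mantissa multiply
        exps.append(ea + eb)                 # product exponent
    emax = max(exps)                         # "maximum exponent determiner"
    acc = 0
    for s, p, e in zip(signs, prods, exps):
        shifted = p >> (emax - e)            # align to shared max exponent
        acc += -shifted if s else shifted    # signed representation + add
    # normalize the accumulated product back to a float
    return acc * 2.0 ** (emax - 2 * MANT_BITS)

print(fp_dot([1.5, -2.0, 0.25], [2.0, 0.5, 4.0]))  # == 3.0
```

The appeal of this organization is that the same multipliers and adder tree serve both modes: in integer mode the shift, sign-conversion, and normalization stages are simply bypassed.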
-
Publication No.: US20200320403A1
Publication Date: 2020-10-08
Application No.: US16378107
Filing Date: 2019-04-08
Applicant: Intel Corporation
Inventor: Bharat Daga , Krishnakumar Nair , Pradeep Janedula , Aravind Babu Srinivasan , Bijoy Pazhanimala , Ambili Vengallur
IPC: G06N3/10
Abstract: An apparatus to facilitate execution of non-linear function operations is disclosed. The apparatus comprises accelerator circuitry including a compute grid having a plurality of processing elements to execute neural network computations, store values resulting from the neural network computations, and perform piecewise linear (PWL) approximations of one or more non-linear functions using the stored values as input data.
-
Publication No.: US20190042944A1
Publication Date: 2019-02-07
Application No.: US16004243
Filing Date: 2018-06-08
Applicant: Intel Corporation
Inventor: Krishnakumar Nair , Andrew Yang , Brian Morris
Abstract: The present disclosure is directed to systems and methods for training neural networks using a tensor that includes a plurality of FP16 values and a plurality of bits that define an exponent shared by some or all of the FP16 values included in the tensor. The FP16 values may include IEEE 754 format 16-bit floating-point values, and the tensor may include a plurality of bits defining the shared exponent. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa and a variable bit-length exponent that may be dynamically set by processor circuitry. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa; a variable bit-length exponent that may be dynamically set by processor circuitry; and a shared exponent switch set by the processor circuitry to selectively combine the FP16 value exponent with the shared exponent.
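A minimal sketch of the shared-exponent idea: the tensor stores FP16 values together with extra bits holding a common exponent, so each element effectively represents stored_value * 2**shared_exponent. The max-magnitude scaling rule below is an illustrative assumption; the patent's variable bit-length mantissa/exponent formats and exponent switch are not modeled.

```python
# Sketch of a tensor carrying a shared exponent alongside FP16 values.
# The scaling rule (fit the largest magnitude) is an assumption.
import numpy as np

class SharedExpTensor:
    def __init__(self, values):
        values = np.asarray(values, dtype=np.float32)
        # choose a shared exponent so the largest magnitude scales into
        # a comfortable FP16 range
        max_exp = int(np.ceil(np.log2(np.max(np.abs(values)) + 1e-30)))
        self.shared_exp = max_exp            # bits defining the shared exponent
        self.fp16 = (values * 2.0 ** -max_exp).astype(np.float16)

    def to_float32(self):
        return self.fp16.astype(np.float32) * 2.0 ** self.shared_exp

t = SharedExpTensor([1.0e5, -2.0e5, 3.5e4])  # 2e5 overflows plain FP16
print(t.shared_exp, t.fp16)
print(t.to_float32())                        # approximately the originals
```

The point of the shared exponent is dynamic range: values that individually overflow or underflow FP16 can still be represented, while per-element storage stays at 16 bits.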
-
Publication No.: US20190042094A1
Publication Date: 2019-02-07
Application No.: US16024812
Filing Date: 2018-06-30
Applicant: Intel Corporation
Inventor: Krishnakumar Nair , Andrew Yang , Michael Rotzin , Nitin Garegrat , Tom Schebye , Tony Werner
Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.
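A minimal sketch of the blockwise conversion described above, using float32 as the first numeric representation and float16 as the second. Each destination block is written at the offset that mirrors its source block, keeping the destination arrangement coherent with the source. The block size and the dtype pairing are assumptions for illustration.

```python
# Blockwise tensor conversion sketch: fetch fixed-size source blocks,
# convert each element float32 -> float16, and store every destination
# block at the offset mirroring its source block.
# BLOCK and the dtype pairing are illustrative assumptions.
import numpy as np

BLOCK = 4  # elements per tensor block (assumed)

def convert_tensor(src: np.ndarray) -> np.ndarray:
    flat = src.reshape(-1)
    dst = np.empty_like(flat, dtype=np.float16)
    # convert blocks in order so the destination layout stays coherent
    for off in range(0, flat.size, BLOCK):
        block = flat[off:off + BLOCK]                    # fetch source block
        dst[off:off + BLOCK] = block.astype(np.float16)  # convert + store
    return dst.reshape(src.shape)

src = np.arange(16, dtype=np.float32).reshape(4, 4) / 3.0
print(convert_tensor(src))
```

Preserving the block arrangement matters when downstream kernels address tensor blocks by position: conversion must not reorder or repack blocks, only change each element's numeric representation.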
-
Publication No.: US12205035B2
Publication Date: 2025-01-21
Application No.: US16004243
Filing Date: 2018-06-08
Applicant: Intel Corporation
Inventor: Krishnakumar Nair , Andrew Yang , Brian Morris
Abstract: The present disclosure is directed to systems and methods for training neural networks using a tensor that includes a plurality of FP16 values and a plurality of bits that define an exponent shared by some or all of the FP16 values included in the tensor. The FP16 values may include IEEE 754 format 16-bit floating-point values, and the tensor may include a plurality of bits defining the shared exponent. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa and a variable bit-length exponent that may be dynamically set by processor circuitry. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa; a variable bit-length exponent that may be dynamically set by processor circuitry; and a shared exponent switch set by the processor circuitry to selectively combine the FP16 value exponent with the shared exponent.
-
Publication No.: US20240028905A1
Publication Date: 2024-01-25
Application No.: US18478554
Filing Date: 2023-09-29
Applicant: Intel Corporation
Inventor: Krishnakumar Nair , Andrew Yang , Brian Morris
CPC classification number: G06N3/084 , G06N3/063 , G06N3/045 , G06F9/3013
Abstract: The present disclosure is directed to systems and methods for training neural networks using a tensor that includes a plurality of FP16 values and a plurality of bits that define an exponent shared by some or all of the FP16 values included in the tensor. The FP16 values may include IEEE 754 format 16-bit floating-point values, and the tensor may include a plurality of bits defining the shared exponent. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa and a variable bit-length exponent that may be dynamically set by processor circuitry. The tensor may include a shared exponent and FP16 values that include a variable bit-length mantissa; a variable bit-length exponent that may be dynamically set by processor circuitry; and a shared exponent switch set by the processor circuitry to selectively combine the FP16 value exponent with the shared exponent.
-
Publication No.: US10761757B2
Publication Date: 2020-09-01
Application No.: US16024812
Filing Date: 2018-06-30
Applicant: Intel Corporation
Inventor: Krishnakumar Nair , Andrew Yang , Michael Rotzin , Nitin Garegrat , Tom Schebye , Tony Werner
Abstract: An apparatus and method for converting tensor data. For example, one embodiment of a method comprises: fetching source tensor blocks of a source tensor data structure, each source tensor block comprising a plurality of source tensor data elements having a first numeric representation, wherein the source tensor data structure comprises a predefined structural arrangement of source tensor blocks; converting the one or more source tensor blocks into one or more destination tensor blocks comprising a plurality of destination tensor data elements having a second numeric representation different from the first numeric representation, wherein the one or more source tensor blocks are converted to one or more corresponding destination tensor blocks in a specified order based on the first and second numeric representations; and storing each individual destination tensor block in a designated memory region to maintain coherency with the predefined structural arrangement of the source tensor blocks.