Patent search: ap:("Intel Corporation") AND inv:"Naveen Mellempudi" (page 1)

1.

Published invention application
8-BIT FLOATING POINT SQUARE ROOT AND/OR RECIPROCAL SQUARE ROOT INSTRUCTIONS (under examination, published)

Publication number: US20240045683A1

Publication date: 2024-02-08

Application number: US17958371

Filing date: 2022-10-01

Applicant: Intel Corporation

Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber

IPC classification: G06F9/30

CPC classification: G06F9/30145, G06F9/30036, G06F9/3001

Abstract: Techniques for performing square root or reciprocal square root calculations on FP8 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of an FP8 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.

2.

Published invention application
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 (under examination, published)

Publication number: US20240045677A1

Publication date: 2024-02-08

Application number: US17958378

Filing date: 2022-10-01

Applicant: Intel Corporation

Inventors: Alexander Heinecke, Menachem Adelman, Mark Charney, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber, Robert Valentine

IPC classification: G06F9/30

CPC classification: G06F9/30025, G06F9/3016

Abstract: Techniques for converting FP16 or FP32 data elements to FP8 data elements using a single instruction are described. An exemplary apparatus includes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a source operand, one or more fields to identify a destination operand, and one or more fields for an opcode, the opcode to indicate that execution circuitry is to convert packed half-precision floating-point data or single-precision floating-point data from the identified source to packed FP8 data and store the packed bfloat8 data into corresponding data element positions of the identified destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision floating-point data or single-precision floating-point data from the identified source to packed bfloat8 data and store the packed bfloat8 data into corresponding data element positions.

3.

Published invention application
8-BIT FLOATING POINT SOURCE ARITHMETIC INSTRUCTIONS (under examination, published)

Publication number: US20240045654A1

Publication date: 2024-02-08

Application number: US17958373

Filing date: 2022-10-01

Applicant: Intel Corporation

Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber

IPC classification: G06F7/483

CPC classification: G06F7/483

Abstract: Techniques for performing arithmetic operations on FP8 values are described. An exemplary instruction includes fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a location of a packed data destination operand, wherein the opcode is to indicate an arithmetic operation that execution circuitry is to perform, for each data element position of the identified packed data source operands, on the FP8 data elements in that data element position in FP8 format, and to store a result of each arithmetic operation into a corresponding data element position of the identified packed data destination operand.

4.

Granted invention patent
Conversion hardware mechanism (in force)

Publication number: US11494163B2

Publication date: 2022-11-08

Application number: US16562979

Filing date: 2019-09-06

Applicant: Intel Corporation

Inventors: Naveen Mellempudi, Dipankar Das, Chunhui Mei, Kristopher Wong, Dhiraj D. Kalamkar, Hong H. Jiang, Subramaniam Maiyuran, Varghese George

IPC classification: G06F7/499, G06F17/16, G06T1/20, G06N3/04, G06N3/08

Abstract: An apparatus to facilitate a computer number format conversion is disclosed. The apparatus comprises a control unit to receive data format information indicating a first precision data format in which input data is to be received, and converter hardware to receive the input data and convert the first precision data format to a second precision data format based on the data format information.

5.

Granted invention patent
Hardware apparatuses and methods relating to elemental register accesses (in force)

Publication number: US09996347B2

Publication date: 2018-06-12

Application number: US14582784

Filing date: 2014-12-24

Applicant: Intel Corporation

Inventors: Victor Lee, Ugonna Echeruo, George Chrysos, Naveen Mellempudi

IPC classification: G06F9/30

CPC classification: G06F9/30036

Abstract: Methods and apparatuses relating to a vector instruction with a register operand with an elemental offset are described. In one embodiment, a hardware processor includes a decode unit to decode a vector instruction with a register operand with an elemental offset to access a first number of elements in a register specified by the register operand, wherein the first number is a total number of elements in the register minus the elemental offset, access a second number of elements in a next logical register, wherein the second number is the elemental offset, and combine the first number of elements and the second number of elements as a data vector, and an execution unit to execute the vector instruction on the data vector.
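
The register access described in this abstract can be sketched in a few lines. This is only an illustration under assumptions the abstract does not state: registers are modeled as Python lists, the elements taken from the named register are those from the offset to the end, the elements taken from the next logical register are its first `offset` elements, and the register file wraps around. The names `read_with_elemental_offset` and `register_file` are hypothetical.

```python
def read_with_elemental_offset(register_file, reg_index, offset):
    """Build a data vector from a register and the next logical register."""
    current = register_file[reg_index]
    if not 0 <= offset < len(current):
        raise ValueError("offset must be within the register")
    # Assumption: the "next logical register" wraps around the register file.
    following = register_file[(reg_index + 1) % len(register_file)]
    # First part: (total elements - offset) elements from the named register.
    # Second part: `offset` elements from the next logical register.
    return current[offset:] + following[:offset]


register_file = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(read_with_elemental_offset(register_file, 0, 1))
```

With an offset of 0 the sketch returns the named register unchanged, which matches the abstract's arithmetic (total minus zero elements from the first register, zero from the next).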

6.

Granted invention patent
Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions (in force)

Publication number: US12056489B2

Publication date: 2024-08-06

Application number: US18313026

Filing date: 2023-05-05

Applicant: Intel Corporation

Inventors: Naveen Mellempudi, Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich

IPC classification: G06F9/30, G06F7/499, G06F9/38

CPC classification: G06F9/30036, G06F7/49915, G06F9/30196, G06F9/3887

Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described. A processor embodiment includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a destination matrix having single-precision elements, a first source matrix, and a second source matrix, the source matrices having elements that each comprise a quadruple of 8-bit floating-point values, the opcode to indicate execution circuitry is to cause, for each element of the first source matrix and corresponding element of the second source matrix, a conversion of the 8-bit floating-point values to single-precision values, a multiplication of different pairs of converted single-precision values to generate a plurality of results, and an accumulation of the results with previous contents of a corresponding element of the destination matrix, decode circuitry to decode the fetched instruction, and the execution circuitry to respond to the decoded instruction as specified by the opcode.
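
The convert, multiply, and accumulate sequence in this abstract can be modeled roughly as follows. This is a sketch under several assumptions that are not in the abstract: the 8-bit format is taken to be E5M2 (1 sign, 5 exponent, 2 mantissa bits, so a value is the upper byte of an IEEE 754 half), "corresponding elements" are paired as in an ordinary matrix product, and Python floats (doubles) stand in for single precision, so rounding behavior is not modeled. The function names are hypothetical.

```python
import struct


def fp8_to_float(byte):
    """Decode one 8-bit value, assuming the E5M2 layout."""
    # Assumption: E5M2 is the upper byte of an IEEE 754 half-precision value.
    return struct.unpack("<e", struct.pack("<H", (byte & 0xFF) << 8))[0]


def fp8_matrix_dot_accumulate(dst, src1, src2):
    """dst[m][n] += sum over k and i of src1[m][k][i] * src2[k][n][i].

    Each source element is a quadruple of 8-bit floating-point bytes.
    """
    rows, inner, cols = len(src1), len(src2), len(src2[0])
    for m in range(rows):
        for n in range(cols):
            acc = dst[m][n]  # previous contents of the destination element
            for k in range(inner):
                for x, y in zip(src1[m][k], src2[k][n]):
                    acc += fp8_to_float(x) * fp8_to_float(y)
            dst[m][n] = acc
    return dst


dst = [[1.0]]
a = [[(0x3C, 0x3C, 0x40, 0x00)]]  # 1.0, 1.0, 2.0, 0.0 under E5M2
b = [[(0x40, 0x3C, 0x40, 0x3C)]]  # 2.0, 1.0, 2.0, 1.0 under E5M2
print(fp8_matrix_dot_accumulate(dst, a, b))  # [[8.0]]
```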

7.

Published invention application
SYSTEMS AND METHODS FOR PERFORMING 8-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS (under examination, published)

Publication number: US20240045689A1

Publication date: 2024-02-08

Application number: US17958377

Filing date: 2022-10-01

Applicant: Intel Corporation

Inventors: Alexander Heinecke, Menachem Adelman, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber

IPC classification: G06F9/30, G06F7/487, G06F17/16, G06F9/38

CPC classification: G06F9/3016, G06F7/4876, G06F17/16, G06F9/3802, G06F9/3013, G06F9/3001

Abstract: Disclosed embodiments relate to systems and methods for performing 8-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply pairs of 8-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

8.

Published invention application
INSTRUCTIONS TO CONVERT FROM FP16 TO FP8 (under examination, published)

Publication number: US20240045684A1

Publication date: 2024-02-08

Application number: US17958380

Filing date: 2022-10-01

Applicant: Intel Corporation

Inventors: Alexander Heinecke, Menachem Adelman, Mark Charney, Evangelos Georganas, Amit Gradstein, Christopher Hughes, Naveen Mellempudi, Simon Rubanovich, Uri Sherman, Zeev Sperber, Robert Valentine

IPC classification: G06F9/30

CPC classification: G06F9/30145, G06F9/30036, G06F9/30018

Abstract: Techniques for converting FP16 to BF8 using bias are described. An example embodiment utilizes decoder circuitry to decode a single instruction, the single instruction to include one or more fields to identify a first source operand, one or more fields to identify a second source operand, one or more fields to identify a source/destination operand, and one or more fields for an opcode, wherein the opcode is to indicate that execution circuitry is to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand; and execution circuitry to execute the decoded instruction according to the opcode to convert packed half-precision data from the identified first and second sources to packed FP8 data using bias terms from the identified source/destination operand and store the packed FP8 data into corresponding data element positions of the identified source/destination operand.

9.

Granted invention patent
Incremental precision networks using residual inference and fine-grain quantization (in force)

Publication number: US11893490B2

Publication date: 2024-02-06

Application number: US18060414

Filing date: 2022-11-30

Applicant: Intel Corporation

Inventors: Abhisek Kundu, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das

IPC classification: G06N3/08, G06N5/04, G06T15/00, G06F9/46, G06N3/063, G06N3/084, G06N3/044, G06N3/045, G06T17/20, G06T15/80, G06T17/10, G06T15/04, G06V10/94

CPC classification: G06N3/08, G06F9/46, G06N3/044, G06N3/045, G06N3/063, G06N3/084, G06N5/04, G06T15/005, G06T15/04, G06T15/80, G06T17/10, G06T17/20, G06V10/94

Abstract: One embodiment provides for a computer-readable medium storing instructions that cause one or more processors to perform operations comprising determining a per-layer scale factor to apply to tensor data associated with layers of a neural network model and converting the tensor data to converted tensor data. The tensor data may be converted from a floating point datatype to a second datatype that is an 8-bit datatype. The instructions further cause the one or more processors to generate an output tensor based on the converted tensor data and the per-layer scale factor.
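
One common way to realize a per-layer scale factor with an 8-bit datatype is symmetric max-abs quantization, sketched below. The abstract does not say how the scale factor is determined or which 8-bit datatype is used, so the max-abs rule, the signed integer range of -127 to 127, and the names `quantize_layer` and `dequantize_layer` are all assumptions made for illustration.

```python
def quantize_layer(values):
    """Convert one layer's floating-point values to 8-bit integers.

    Returns the quantized values and the per-layer scale factor.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Assumption: the scale maps the largest magnitude in the layer to 127.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_layer(quantized, scale):
    """Rebuild an output tensor from the 8-bit data and the scale factor."""
    return [q * scale for q in quantized]


q, scale = quantize_layer([-254.0, 0.0, 100.0, 254.0])
print(q, scale)                    # [-127, 0, 50, 127] 2.0
print(dequantize_layer(q, scale))  # [-254.0, 0.0, 100.0, 254.0]
```

In general the round trip is lossy; the sample values here were picked so that the scale factor is exactly 2.0 and no rounding error appears.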

10.

Invention application
SYSTOLIC ARRAY HAVING SUPPORT FOR OUTPUT SPARSITY (in force)

Publication number: US20220413803A1

Publication date: 2022-12-29

Application number: US17304803

Filing date: 2021-06-25

Applicant: Intel Corporation

Inventors: Jorge Parra, Fangwen Fu, Subramaniam Maiyuran, Varghese George, Mike Macpherson, Supratim Pal, Chandra Gurram, Sabareesh Ganapathy, Sasikanth Avancha, Dharma Teja Vooturi, Naveen Mellempudi, Dipankar Das

IPC classification: G06F7/544, G06F7/523, G06F15/80, G06F17/16

Abstract: A processing apparatus is described herein that includes a general-purpose parallel processing engine comprising a matrix accelerator including one or more systolic arrays, at least one of the one or more systolic arrays comprising multiple pipeline stages, each pipeline stage of the multiple pipeline stages including multiple processing elements, the multiple processing elements configured to perform processing operations on input matrix elements based on output sparsity metadata. The output sparsity metadata indicates to the multiple processing elements to bypass multiplication for a first row of elements of a second matrix and multiply a second row of elements of the second matrix with a column of matrix elements of a first matrix.
