专利检索 ap:("Intel Corporation") AND inv:"Mark J. Charney" 第 1 页

1.

发明授权
Systems, apparatuses, and methods for addition of partial products 有权

公开(公告)号：US12124846B2

公开(公告)日：2024-10-22

申请号：US18456699

申请日：2023-08-28

申请人： INTEL CORPORATION

发明人： Robert Valentine , Galina Ryvchin , Piotr Majcher , Mark J. Charney , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Milind B. Girkar , Zeev Sperber , Simon Rubanovich , Amit Gradstein

IPC分类号： G06F9/30 , G06F7/544 , G06F9/38

CPC分类号： G06F9/30014 , G06F7/5443 , G06F9/30018 , G06F9/30036 , G06F9/30105 , G06F9/3818

摘要： Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand. Execution circuitry then executes the decoded single instruction to perform, for each packed data element position of the destination operand, a multiplication of a M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, add of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.

2.

发明公开
MATRIX TRANSPOSE AND MULTIPLY 审中-公开

公开(公告)号：US20240329938A1

公开(公告)日：2024-10-03

申请号：US18607024

申请日：2024-03-15

申请人： Intel Corporation

发明人： Menachem Adelman , Robert Valentine , Barukh Ziv , Amit Gradstein , Simon Rubanovich , Zeev Sperber , Mark J. Charney , Christopher J. Hughes , Alexander F. Heinecke , Evangelos Georganas , Binh Pham

IPC分类号： G06F7/78 , G06F9/30 , G06F17/16

CPC分类号： G06F7/78 , G06F9/3001 , G06F9/3016 , G06F17/16

摘要： Embodiments for a matrix transpose and multiply operation are disclosed. In an embodiment, a processor includes a decoder and execution circuitry. The decoder is to decode an instruction having a format including an opcode field to specify an opcode, a first destination operand field to specify a destination matrix location, a first source operand field to specify a first source matrix location, and a second source operand field to specify a second source matrix location. The execution circuitry is to, in response to the decoded instruction, transpose the first source matrix to generate a transposed first source matrix, perform a matrix multiplication using the transposed first source matrix and the second source matrix to generate a result, and store the result in a destination matrix location.

3.

发明授权
Apparatuses, methods, and systems for 8-bit floating-point matrix dot product instructions 有权

公开(公告)号：US12056489B2

公开(公告)日：2024-08-06

申请号：US18313026

申请日：2023-05-05

申请人： Intel Corporation

发明人： Naveen Mellempudi , Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Christopher J. Hughes , Evangelos Georganas , Zeev Sperber , Amit Gradstein , Simon Rubanovich

IPC分类号： G06F9/30 , G06F7/499 , G06F9/38

CPC分类号： G06F9/30036 , G06F7/49915 , G06F9/30196 , G06F9/3887

摘要： Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described. A processor embodiment includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a destination matrix having single-precision elements, a first source matrix, and a second source matrix, the source matrices having elements that each comprise a quadruple of 8-bit floating-point values, the opcode to indicate execution circuitry is to cause, for each element of the first source matrix and corresponding element of the second source matrix, a conversion of the 8-bit floating-point values to single-precision values, a multiplication of different pairs of converted single-precision values to generate plurality of results, and an accumulation of the results with previous contents of a corresponding element of the destination matrix, decode circuitry to decode the fetched instruction, and the execution circuitry to respond to the decoded instruction as specified by the opcode.

4.

发明授权
Systems and methods for performing instructions to transform matrices into row-interleaved format 有权

公开(公告)号：US11954490B2

公开(公告)日：2024-04-09

申请号：US18309469

申请日：2023-04-28

申请人： Intel Corporation

发明人： Raanan Sade , Robert Valentine , Bret Toll , Christopher J. Hughes , Alexander F. Heinecke , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney

IPC分类号： G06F17/16 , G06F7/53 , G06F9/30

CPC分类号： G06F9/30167 , G06F9/30101 , G06F9/30149

摘要： Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

5.

发明授权
Systems, apparatuses, and methods for addition of partial products 有权

公开(公告)号：US11782709B2

公开(公告)日：2023-10-10

申请号：US17964964

申请日：2022-10-13

申请人： Intel Corporation

发明人： Robert Valentine , Galina Ryvchin , Piotr Majcher , Mark J. Charney , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Milind B. Girkar , Zeev Sperber , Simon Rubanovich , Amit Gradstein

IPC分类号： G06F9/30 , G06F7/544 , G06F9/38

CPC分类号： G06F9/30014 , G06F7/5443 , G06F9/30018 , G06F9/30036 , G06F9/30105 , G06F9/3818

摘要： Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand. Execution circuitry then executes the decoded single instruction to perform, for each packed data element position of the destination operand, a multiplication of a M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, add of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.

6.

发明授权
Systems, methods, and apparatuses for heterogeneous computing 有权

公开(公告)号：US11693691B2

公开(公告)日：2023-07-04

申请号：US17381521

申请日：2021-07-21

申请人： Intel Corporation

发明人： Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield

IPC分类号： G06F9/48 , G06F9/30 , G06F9/38

CPC分类号： G06F9/48 , G06F9/3001 , G06F9/3004 , G06F9/30036 , G06F9/383

摘要： Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

7.

发明授权
Systems and methods for performing 16-bit floating-point matrix dot product instructions 有权

公开(公告)号：US11614936B2

公开(公告)日：2023-03-28

申请号：US17216566

申请日：2021-03-29

申请人： Intel Corporation

发明人： Alexander F. Heinecke , Robert Valentine , Mark J. Charney , Raanan Sade , Menachem Adelman , Zeev Sperber , Amit Gradstein , Simon Rubanovich

IPC分类号： G06F9/30 , G06F9/38

摘要： Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

8.

发明授权
Apparatus and method of improved insert instructions 有权

公开(公告)号：US11354124B2

公开(公告)日：2022-06-07

申请号：US15668508

申请日：2017-08-03

申请人： Intel Corporation

发明人： Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Bret L. Toll , Mark J. Charney , Zeev Sperber , Amit Gradstein

IPC分类号： G06F9/30 , G06F12/06 , G06F9/38

摘要： An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

9.

发明授权
Context save with variable save state size 有权

公开(公告)号：US11275588B2

公开(公告)日：2022-03-15

申请号：US16624178

申请日：2017-07-01

申请人： Intel Corporation

发明人： Robert Valentine , Mark J. Charney , Rinat Rappoport , Vivekananthan Sanjeepan

IPC分类号： G06F9/30

摘要： Embodiments of an apparatus comprising a decoder to decode an instruction having fields for an opcode and a destination operand and execution circuitry to execute the decoded instruction to perform a save of processor state components to an area located at a destination memory address specified by the destination operand, wherein a size of the area is defined by at least one indication of an execution of an instruction operating on a specified group of processor states are described.

10.

发明授权
Systems and methods for performing matrix compress and decompress instructions 有权

公开(公告)号：US11249761B2

公开(公告)日：2022-02-15

申请号：US16934003

申请日：2020-07-20

申请人： Intel Corporation

发明人： Dan Baum , Michael Espig , James Guilford , Wajdi K. Feghali , Raanan Sade , Christopher J. Hughes , Robert Valentine , Bret Toll , Elmoustapha Ould-Ahmed-Vall , Mark J. Charney , Vinodh Gopal , Ronen Zohar , Alexander F. Heinecke

IPC分类号： G06F9/30 , G06F9/38

摘要： Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类