Patent search: ap:("Intel Corporation") AND inv:"Robert Valentine" (page 1)

1.

Granted invention patent
Systems and methods for performing 16-bit floating-point matrix dot product instructions (legal status: in force)

Publication No.: US11614936B2

Publication date: 2023-03-28

Application No.: US17216566

Filing date: 2021-03-29

Applicant: Intel Corporation

Inventors: Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Raanan Sade, Menachem Adelman, Zeev Sperber, Amit Gradstein, Simon Rubanovich

IPC classification: G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.
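
The flow in this abstract can be illustrated with a rough software model. The sketch below is not taken from the patent: the function names are hypothetical, and it assumes unsigned nibbles and saturation to the signed 32-bit range, neither of which the abstract specifies.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def nibbles(dword):
    """Split a 32-bit value into its eight 4-bit nibbles, lowest first."""
    return [(dword >> (4 * i)) & 0xF for i in range(8)]


def saturate_int32(value):
    """Clamp to the signed 32-bit range (assumed saturation rule)."""
    return max(INT32_MIN, min(INT32_MAX, value))


def tile_nibble_dot_product(dst, src1, src2):
    """Accumulate nibble products of src1 (M x K) and src2 (K x N) into dst (M x N)."""
    m_rows, k_dim, n_cols = len(src1), len(src2), len(src2[0])
    for m in range(m_rows):
        for n in range(n_cols):
            acc = dst[m][n]  # previous contents of the destination element
            for k in range(k_dim):  # the flow runs K times per element
                products = [
                    a * b
                    for a, b in zip(nibbles(src1[m][k]), nibbles(src2[k][n]))
                ]
                acc = saturate_int32(acc + sum(products))
            dst[m][n] = acc
    return dst
```

With a 1 by 1 tile, `tile_nibble_dot_product([[10]], [[0x11111111]], [[0x22222222]])` adds eight products of 1 times 2 to the previous value 10.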

2.

Invention application
Methods of Hardware and Software-Coordinated Opt-In to Advanced Features on Hetero ISA Platforms (legal status: in force)

Publication No.: US20220374278A1

Publication date: 2022-11-24

Application No.: US17882175

Filing date: 2022-08-05

Applicant: Intel Corporation

Inventors: Toby Opferman, Eliezer Weissmann, Robert Valentine, Russell Cameron Arnold

IPC classification: G06F9/50, G06F9/38, G06F9/48, G06F9/30, G06F9/448

Abstract: The present disclosure relates to a processor that includes one or more processing elements associated with one or more instruction set architectures. The processor is configured to receive a request from an application executed by a first processing element of the one or more processing elements to enable a feature associated with an instruction set architecture. Additionally, the processor is configured to enable the application to utilize the feature without a system call occurring when the feature is associated with an instruction set architecture associated with the first processing element.

3.

Granted invention patent
Systems and methods for implementing chained tile operations (legal status: in force)

Publication No.: US11416260B2

Publication date: 2022-08-16

Application No.: US16863951

Filing date: 2020-04-30

Applicant: Intel Corporation

Inventors: Christopher J. Hughes, Alexander F. Heinecke, Robert Valentine, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F9/38, G06F9/30, G06F17/16, G06F15/78, G06F15/80

Abstract: Disclosed embodiments relate to systems and methods for implementing chained tile operations. In one example, a processor includes fetch circuitry to fetch one or more instructions until a plurality of instructions has been fetched, each instruction to specify source and destination tile operands, decode circuitry to decode the fetched instructions, and execution circuitry, responsive to the decoded instructions, to: identify first and second decoded instructions belonging to a chain of instructions, dynamically select and configure a SIMD path comprising first and second processing engines (PE) to execute the first and second decoded instructions, and set aside the specified destination of the first decoded instruction, and instead route a result of the first decoded instruction from the first PE to be used by the second PE to perform the second decoded instruction.

4.

Granted invention patent
Apparatus and method of improved insert instructions (legal status: in force)

Publication No.: US11354124B2

Publication date: 2022-06-07

Application No.: US15668508

Filing date: 2017-08-03

Applicant: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

IPC classification: G06F9/30, G06F12/06, G06F9/38

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instruction. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non overlapping sections have a same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non overlapping sections have a same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity, and, mask the second and fourth instructions at a second resultant vector granularity.

5.

Invention application
APPARATUS AND METHOD FOR COMPLEX MULTIPLICATION (legal status: in force)

Publication No.: US20220129264A1

Publication date: 2022-04-28

Application No.: US17517351

Filing date: 2021-11-02

Applicant: Intel Corporation

Inventors: Robert Valentine, Mark Charney, Raanan Sade, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Roman S. Dubtsov

IPC classification: G06F9/30, G06F7/48

Abstract: An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
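
The two-operation split described in this abstract can be shown with a small sketch. It relies only on the identity (a + bi)(c + di) = (ac - bd) + (ad + bc)i; which term each operation produces is an assumption made here for illustration, and the function name is hypothetical.

```python
def complex_multiply_two_step(x, y):
    """Multiply complex numbers given as (real, imag) pairs in two operations."""
    a, b = x
    c, d = y

    # First operation (assumed split): first term of each component.
    real = a * c
    imag = a * d

    # Second operation: second term of each component, folded into the result.
    real -= b * d
    imag += b * c

    return real, imag
```

For example, `complex_multiply_two_step((1, 2), (3, 4))` agrees with Python's built-in `complex(1, 2) * complex(3, 4)`.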

6.

Granted invention patent
Context save with variable save state size (legal status: in force)

Publication No.: US11275588B2

Publication date: 2022-03-15

Application No.: US16624178

Filing date: 2017-07-01

Applicant: Intel Corporation

Inventors: Robert Valentine, Mark J. Charney, Rinat Rappoport, Vivekananthan Sanjeepan

IPC classification: G06F9/30

Abstract: Embodiments of an apparatus comprising a decoder to decode an instruction having fields for an opcode and a destination operand and execution circuitry to execute the decoded instruction to perform a save of processor state components to an area located at a destination memory address specified by the destination operand, wherein a size of the area is defined by at least one indication of an execution of an instruction operating on a specified group of processor states are described.

7.

Granted invention patent
Systems and methods for performing matrix compress and decompress instructions (legal status: in force)

Publication No.: US11249761B2

Publication date: 2022-02-15

Application No.: US16934003

Filing date: 2020-07-20

Applicant: Intel Corporation

Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke

IPC classification: G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
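
The first compression option in this abstract (packing non-zero elements and recording their positions in a header) can be sketched as below. The header layout, a list of (row, column) pairs, and the function names are assumptions for illustration; the abstract does not define an encoding.

```python
def compress_matrix(matrix):
    """Pack non-zero elements together and record each position in a header."""
    header, packed = [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                header.append((r, c))  # matrix position of the element
                packed.append(value)   # the element itself
    return header, packed


def decompress_matrix(header, packed, rows, cols):
    """Rebuild the full matrix from the header and the packed elements."""
    matrix = [[0] * cols for _ in range(rows)]
    for (r, c), value in zip(header, packed):
        matrix[r][c] = value
    return matrix
```

Decompressing the output of `compress_matrix` with the original dimensions reproduces the source matrix, so the scheme pays off only when most elements are zero.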

8.

Granted invention patent
Apparatus and method for vector horizontal add of signed/unsigned words and doublewords (legal status: in force)

Publication No.: US11249754B2

Publication date: 2022-02-15

Application No.: US15850131

Filing date: 2017-12-21

Applicant: Intel Corporation

Inventors: Venkateswara Madduri, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Mark Charney

IPC classification: G06F9/30, G06F7/505

Abstract: An apparatus and method for performing a packed horizontal addition of words and doublewords. One embodiment of a processor includes a decoder to decode a packed horizontal add instruction which includes an opcode and one or more operands used to identify a plurality of packed words; a source register to store a plurality of packed words; execution circuitry to execute the decoded instruction, and a destination register to store a final result as a packed result word in a designated data element position. The execution circuitry includes operand selection circuitry to identify first and second packed words from the source register in accordance with the operands and opcode; adder circuitry to add the two packed words to generate a temporary sum; a temporary storage of at least 17 bits to store the temporary sum; and saturation circuitry to saturate the temporary sum if necessary to generate the final result.
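
The need for a temporary of at least 17 bits follows from the operand range: the sum of two 16-bit words can exceed 16 bits before it is saturated. A minimal sketch, assuming the two words are selected by index and that saturation clamps to the 16-bit signed or unsigned range (the function name and parameters are hypothetical):

```python
def horizontal_add_words(packed, i, j, signed=True):
    """Add two 16-bit words from one source and saturate the sum to 16 bits."""
    if signed:
        lo, hi = -(2**15), 2**15 - 1
    else:
        lo, hi = 0, 2**16 - 1

    # The raw sum of two 16-bit words always fits in 17 bits.
    temp = packed[i] + packed[j]

    # Saturate only if the temporary sum is outside the 16-bit range.
    return max(lo, min(hi, temp))
```

In the signed case 30000 + 10000 saturates to 32767, while the same operands in the unsigned case give 40000 unchanged.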

9.

Granted invention patent
Instructions for vector multiplication of unsigned words with rounding (legal status: in force)

Publication No.: US11221849B2

Publication date: 2022-01-11

Application No.: US16642778

Filing date: 2017-09-27

Applicant: Intel Corporation

Inventors: Venkateswara R. Madduri, Carl Murray, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Robert Valentine, Jesus Corbal

IPC classification: G06F9/22, G06F9/30, G06F9/38

Abstract: Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, execution circuitry to, on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources, execute the decoded instruction to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice a number of bits of the fixed size, and generate an unsigned fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.
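
One way to picture the rounding step is shown below. The abstract does not state the rounding mode; this sketch assumes round-half-up, implemented by adding half of the discarded range before taking the upper half of the product. The function name and the `bits` parameter are hypothetical.

```python
def mul_unsigned_words_round(src1, src2, bits=16):
    """Multiply unsigned element pairs and keep the rounded upper half."""
    half = 1 << (bits - 1)  # half of the range discarded by the shift
    results = []
    for a, b in zip(src1, src2):
        product = a * b  # double-sized product, up to 2 * bits wide
        results.append((product + half) >> bits)
    return results
```

For 16-bit inputs the largest result is 0xFFFE, from 0xFFFF times 0xFFFF, so under this assumed rounding mode the rounded value always fits in the destination element.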

10.

Invention application
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR MOVING DATA BETWEEN TILES OF A MATRIX OPERATIONS ACCELERATOR AND VECTOR REGISTERS (legal status: in force)

Publication No.: US20210406018A1

Publication date: 2021-12-30

Application No.: US16914347

Filing date: 2020-06-27

Applicant: INTEL CORPORATION

Inventors: Menachem Adelman, Robert Valentine, Barukh Ziv, Yaroslav Pollak, Gideon Stupp, Amit Gradstein, Simon Rubanovich, Zeev Sperber, Mark Charney, Christopher Hughes, Alexander Heinecke

IPC classification: G06F9/30, G06F17/16

Abstract: Systems, methods, and apparatuses relating to one or more instructions that utilize direct paths for loading data into a tile from a vector register and/or storing data from a tile into a vector register are described. In one embodiment, a system includes a matrix operations accelerator circuit comprising a two-dimensional grid of processing elements, a plurality of registers that represents a two-dimensional matrix coupled to the two-dimensional grid of processing elements, and a coupling to a cache; and a hardware processor core comprising: a vector register, a decoder to decode a single instruction into a decoded single instruction, the single instruction including a first field that identifies the two-dimensional matrix, a second field that identifies a set of elements of the two-dimensional matrix, and a third field that identifies the vector register, and an execution circuit to execute the decoded single instruction to cause a store of the set of elements from the plurality of registers that represents the two-dimensional matrix into the vector register by a coupling of the hardware processor core to the matrix operations accelerator circuit that is separate from the coupling to the cache.
