Patent search ap:("Intel Corporation") AND inv:"Jesus Corbal" Page 1

1.

发明授权
Instruction execution that broadcasts and masks data values at different levels of granularity 有权

公开(公告)号：US12197617B2

公开(公告)日：2025-01-14

申请号：US18357066

申请日：2023-07-21

Applicant: Intel Corporation

Inventor： Elmoustapha Ould-Ahmed-Vall , Robert Valentine , Jesus Corbal , Bret L Toll , Mark J. Charney

IPC: G06F16/27 , G06F9/30 , G06F9/38 , G06F21/62 , G06F21/70

Abstract: An apparatus is described that includes an execution unit to execute a first instruction and a second instruction. The execution unit includes input register space to store a first data structure to be replicated when executing the first instruction and to store a second data structure to be replicated when executing the second instruction. The first and second data structures are both packed data structures. Data values of the first packed data structure are twice as large as data values of the second packed data structure. The execution unit also includes replication logic circuitry to replicate the first data structure when executing the first instruction to create a first replication data structure, and, to replicate the second data structure when executing the second data instruction to create a second replication data structure. The execution unit also includes masking logic circuitry to mask the first replication data structure at a first granularity and mask the second replication data structure at a second granularity. The second granularity is twice as fine as the first granularity.

2.

发明申请
SYSTEMS, APPARATUSES, AND METHODS FOR ADDITION OF PARTIAL PRODUCTS 有权

公开(公告)号：US20250004763A1

公开(公告)日：2025-01-02

申请号：US18886639

申请日：2024-09-16

Applicant: INTEL CORPORATION

Inventor： Robert Valentine , Galina Ryvchin , Piotr Majcher , Mark J. Charney , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Milind B. Girkar , Zeev Sperber , Simon Rubanovich , Amit Gradstein

IPC: G06F9/30 , G06F7/544 , G06F9/38

Abstract: Embodiments of systems, apparatuses, and methods for fused multiple add. In some embodiments, a decoder decodes a single instruction having an opcode, a destination field representing a destination operand, and fields for a first, second, and third packed data source operand, wherein packed data elements of the first and second packed data source operand are of a first, different size than a second size of packed data elements of the third packed data operand. Execution circuitry then executes the decoded single instruction to perform, for each packed data element position of the destination operand, a multiplication of a M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, add of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.

3.

发明授权
Systems, methods, and apparatuses for tile matrix multiplication and accumulation 有权

公开(公告)号：US12147804B2

公开(公告)日：2024-11-19

申请号：US17382917

申请日：2021-07-22

Applicant: Intel Corporation

Inventor： Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Rinat Rappoport , Stanislav Shwartsman , Dan Baum , Igor Yanover , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman , Jesus Corbal , Yuri Gebil , Simon Rubanovich

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F7/76 , G06F9/38 , G06F17/16

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, in some embodiments decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, and store a result of the addition in the identified source/destination matrix operand and zero unconfigured columns of identified source/destination matrix operand are detailed.

4.

发明授权
Systems, methods, and apparatuses for tile transpose 有权

公开(公告)号：US12124847B2

公开(公告)日：2024-10-22

申请号：US16474475

申请日：2017-07-01

Applicant: Intel Corporation

Inventor： Robert Valentine , Dan Baum , Zeev Sperber , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Bret L Toll , Mark J. Charney , Barukh Ziv , Alexander Heinecke , Milind Girkar , Menachem Adelman , Simon Rubanovich

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F7/76 , G06F9/38 , G06F17/16

CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand are detailed.

5.

发明授权
Systems, methods, and apparatus for matrix move 有权

公开(公告)号：US12039332B2

公开(公告)日：2024-07-16

申请号：US17587637

申请日：2022-01-28

Applicant: Intel Corporation

Inventor： Robert Valentine , Zeev Sperber , Mark J. Charney , Bret L. Toll , Jesus Corbal , Dan Baum , Alexander Heinecke , Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F7/76 , G06F9/38 , G06F17/16

CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454

Abstract: Detailed herein are embodiment systems, processors, and methods for matrix move. For example, a processor comprising decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to move each data element of the identified source matrix operand to corresponding data element position of the identified destination matrix operand is described.

6.

发明授权
Systems, methods, and apparatuses for tile store 有权

公开(公告)号：US11977886B2

公开(公告)日：2024-05-07

申请号：US17706413

申请日：2022-03-28

Applicant: Intel Corporation

Inventor： Robert Valentine , Menachem Adelman , Elmoustapha Ould-Ahmed-Vall , Bret L. Toll , Milind B. Girkar , Zeev Sperber , Mark J. Charney , Rinat Rappoport , Jesus Corbal , Stanislav Shwartsman , Igor Yanover , Alexander F. Heinecke , Barukh Ziv , Dan Baum , Yuri Gebil , Raanan Sade

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F7/76 , G06F9/38 , G06F17/16

CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.

7.

发明公开
SYSTEMS AND METHODS FOR EXECUTING A FUSED MULTIPLY-ADD INSTRUCTION FOR COMPLEX NUMBERS 审中-公开

公开(公告)号：US20240126546A1

公开(公告)日：2024-04-18

申请号：US18399473

申请日：2023-12-28

Applicant: Intel Corporation

Inventor： Roman S. Dubtsov , Robert Valentine , Jesus Corbal , Milind Girkar , Elmoustapha Ould-Ahmed-Vall

IPC: G06F9/30

CPC classification number: G06F9/30036 , G06F9/3001

Abstract: Disclosed embodiments relate to executing a vector-complex fused multiply-add instruction. In one example, a method includes fetching an instruction, a format of the instruction including an opcode, a first source operand identifier, a second source operand identifier, and a destination operand identifier, wherein each of the identifiers identifies a location storing a packed data comprising at least one complex number, decoding the instruction, retrieving data associated with the first and second source operand identifiers, and executing the decoded instruction to, for each packed data element position of the identified first and second source operands, cross-multiply the real and imaginary components to generate four products: a product of real components, a product of imaginary components, and two mixed products, generate a complex result by using the four products according to the instruction, and store a result to the corresponding position of the identified destination operand.

8.

发明授权
Apparatus and method for complex by complex conjugate multiplication 有权

公开(公告)号：US11755323B2

公开(公告)日：2023-09-12

申请号：US17672504

申请日：2022-02-15

Applicant: Intel Corporation

Inventor： Venkateswara Madduri , Elmoustapha Ould-Ahmed-Vall , Jesus Corbal , Mark Charney , Robert Valentine , Binwei Yang

IPC: G06F9/30

CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/30105

Abstract: An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.

9.

发明授权
Systems, methods, and apparatuses for tile store 有权

公开(公告)号：US11714642B2

公开(公告)日：2023-08-01

申请号：US17706428

申请日：2022-03-28

Applicant: Intel Corporation

Inventor： Robert Valentine , Menachem Adelman , Elmoustapha Ould-Ahmed-Vall , Bret L. Toll , Milind B. Girkar , Zeev Sperber , Mark J. Charney , Rinat Rappoport , Jesus Corbal , Stanislav Shwartsman , Igor Yanover , Alexander F. Heinecke , Barukh Ziv , Dan Baum , Yuri Gebil

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F17/16 , G06F7/76 , G06F9/38

CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/3016 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.

10.

发明公开
SYSTEMS AND METHODS TO LOAD A TILE REGISTER PAIR 审中-公开

公开(公告)号：US20230229446A1

公开(公告)日：2023-07-20

申请号：US18186710

申请日：2023-03-20

Applicant: Intel Corporation

Inventor： Raanan Sade , Simon Rubanovich , Amit Gradstein , Zeev Sperber , Alexander Heinecke , Robert Valentine , Mark J. Charney , Bret Toll , Jesus Corbal , Elmoustapha Ould-Ahmed-Vall , Menachem Adelman

IPC: G06F9/30

CPC classification number: G06F9/30145 , G06F9/30043

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification