Patent search ap:("INTEL CORPORATION") AND inv:"Elmoustapha OULD-AHMED-VALL"

1.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE MATRIX MULTIPLICATION AND ACCUMULATION (In force)

Publication (announcement) number: US20250117222A1

Publication (announcement) date: 2025-04-10

Application number: US18930671

Application date: 2024-10-29

Applicant: Intel Corporation

Inventor: Robert VALENTINE, Zeev SPERBER, Mark J. CHARNEY, Bret L. TOLL, Rinat RAPPOPORT, Stanislav SHWARTSMAN, Dan BAUM, Igor YANOVER, Elmoustapha OULD-AHMED-VALL, Menachem ADELMAN, Jesus CORBAL, Yuri GEBIL, Simon RUBANOVICH

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. In particular, matrix (tile) multiply accumulate and negated matrix (tile) multiply accumulate are discussed. For example, some embodiments detail decode circuitry to decode an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand; and execution circuitry to execute the decoded instruction to multiply the identified first source matrix operand by the identified second source matrix operand, add a result of the multiplication to the identified source/destination matrix operand, store a result of the addition in the identified source/destination matrix operand, and zero unconfigured columns of the identified source/destination matrix operand.
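
Illustrative sketch (not part of the patent record): the multiply-accumulate behaviour summarized in this abstract can be modelled in a few lines of Python. The function name and the `configured_cols` parameter are hypothetical; the abstract does not say how configured columns are tracked, so this sketch assumes they are the leading columns of the source/destination tile.

```python
def tile_multiply_accumulate(a, b, c, configured_cols):
    """Return c + a*b on the configured columns and 0 elsewhere.

    a: M x K tile, b: K x N tile, c: M x W source/destination tile
    (lists of lists). configured_cols is an assumed parameter: the
    number of leading columns of c that are in use (<= N and <= W).
    """
    inner = len(b)
    result = []
    for i, c_row in enumerate(c):
        out_row = []
        for j, c_val in enumerate(c_row):
            if j < configured_cols:
                acc = c_val
                for k in range(inner):
                    acc += a[i][k] * b[k][j]
                out_row.append(acc)
            else:
                # Columns outside the configured range are zeroed.
                out_row.append(0)
        result.append(out_row)
    return result


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1, 9], [1, 1, 9]]
print(tile_multiply_accumulate(a, b, c, configured_cols=2))
# prints [[20, 23, 0], [44, 51, 0]]
```

The sketch returns a new tile rather than updating `c` in place, which keeps the example easy to check; the abstract describes storing the result back into the source/destination operand.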

2.

Invention application
VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF (In force)

Publication (announcement) number: US20240427600A1

Publication (announcement) date: 2024-12-26

Application number: US18827453

Application date: 2024-09-06

Applicant: Intel Corporation

Inventor: Robert C. VALENTINE, Jesus Corbal SAN ADRIAN, Roger Espasa SANS, Robert D. CAVIN, Bret L. TOLL, Santiago Galan DURAN, Jeffrey G. WIEDEMEIER, Sridhar SAMUDRALA, Milind Baburao GIRKAR, Edward Thomas GROCHOWSKI, Jonathan Cannon HALL, Dennis R. BRADFORD, Elmoustapha OULD-AHMED-VALL, James C ABEL, Mark CHARNEY, Seth ABRAHAM, Suleyman SAIR, Andrew Thomas FORSYTH, Lisa WU, Charles YOUNT

IPC: G06F9/30, G06F9/34, H01L29/66, H01L29/775, H01L29/78, H01L29/786

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

3.

Invention publication
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE (Under examination, published)

Publication (announcement) number: US20240256276A1

Publication (announcement) date: 2024-08-01

Application number: US18432317

Application date: 2024-02-05

Applicant: Intel Corporation

Inventor: Robert VALENTINE, Menachem ADELMAN, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Milind B. GIRKAR, Zeev SPERBER, Mark J. CHARNEY, Rinat RAPPOPORT, Jesus CORBAL, Stanislav SHWARTSMAN, Igor YANOVER, Alexander F. HEINECKE, Barukh ZIV, Dan BAUM, Yuri GEBIL, Raanan SADE

IPC: G06F9/30, G06F7/485, G06F7/487, G06F7/76, G06F9/38, G06F17/16

CPC classification number: G06F9/30036, G06F7/485, G06F7/4876, G06F7/762, G06F9/3001, G06F9/30032, G06F9/30043, G06F9/30109, G06F9/30112, G06F9/30134, G06F9/30145, G06F9/30149, G06F9/3016, G06F9/30185, G06F9/30196, G06F9/3818, G06F9/3836, G06F17/16, G06F2212/454

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.

4.

Invention publication
SYSTEMS FOR PERFORMING INSTRUCTIONS TO QUICKLY CONVERT AND USE TILES AS 1D VECTORS (Under examination, published)

Publication (announcement) number: US20240126551A1

Publication (announcement) date: 2024-04-18

Application number: US18399014

Application date: 2023-12-28

Applicant: Intel Corporation

Inventor: Bret TOLL, Christopher J. HUGHES, Dan BAUM, Elmoustapha OULD-AHMED-VALL, Raanan SADE, Robert VALENTINE, Mark J. CHARNEY, Alexander F. HEINECKE

IPC: G06F9/30

CPC classification number: G06F9/30145, G06F9/30032, G06F9/30036, G06F9/30109

Abstract: Disclosed embodiments relate to systems for performing instructions to quickly convert and use matrices (tiles) as one-dimensional vectors. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode, locations of a two-dimensional (2D) matrix and a one-dimensional (1D) vector, and a group of elements comprising one of a row, part of a row, multiple rows, a column, part of a column, multiple columns, and a rectangular sub-tile of the specified 2D matrix, and wherein the opcode is to indicate a move of the specified group between the 2D matrix and the 1D vector, decode circuitry to decode the fetched instruction; and execution circuitry, responsive to the decoded instruction, when the opcode specifies a move from 1D, to move contents of the specified 1D vector to the specified group of elements.

5.

Invention publication
SCALAR CORE INTEGRATION (Under examination, published)

Publication (announcement) number: US20240045830A1

Publication (announcement) date: 2024-02-08

Application number: US18450685

Application date: 2023-08-16

Applicant: Intel Corporation

Inventor: Joydeep RAY, Aravindh ANANTARAMAN, Abhishek R. APPU, Altug KOKER, Elmoustapha OULD-AHMED-VALL, Valentin ANDREI, Subramaniam MAIYURAN, Nicolas GALOPPO VON BORRIES, Varghese GEORGE, Mike MACPHERSON, Ben ASHBAUGH, Murali RAMADOSS, Vikranth VEMULAPALLI, William SADLER, Jonathan PEARCE, Sungye KIM

IPC: G06F15/80, G06F9/30, G06F9/38, G06T15/00

CPC classification number: G06F15/8069, G06F9/30163, G06F9/3877, G06T15/005, G06F9/3836

Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.

6.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT (In force)

Publication (announcement) number: US20220357950A1

Publication (announcement) date: 2022-11-10

Application number: US17865849

Application date: 2022-07-15

Applicant: Intel Corporation

Inventor: Raanan SADE, Robert VALENTINE, Bret TOLL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

7.

Invention application
EFFICIENT MULTIPLY AND ACCUMULATE INSTRUCTION WHEN AN OPERAND IS EQUAL TO OR NEAR A POWER OF TWO (In force)

Publication (announcement) number: US20220197595A1

Publication (announcement) date: 2022-06-23

Application number: US17129636

Application date: 2020-12-21

Applicant: Intel Corporation

Inventor: Elmoustapha OULD-AHMED-VALL

IPC: G06F7/544, G06F7/523, G06F7/50, G06F5/01, G06F9/30

Abstract: Techniques and apparatuses for performing a near multiply and accumulate instruction are described. An apparatus includes decoder circuitry to decode an instruction, the instruction to include a field for an identifier of a first source operand, a field for an identifier of a second source operand, and a field for an identifier of a third source operand. The apparatus also includes execution circuitry to execute the decoded instruction to perform a multiplication of a pair of data elements from the first and second source operands to produce a product data element via a shift operation when at least one data element in the pair of data elements is equal to a power of two or near a power of two or via multiplication of the pair of data elements when the pair of data elements is neither equal to a power of two nor near a power of two.
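
Illustrative sketch (not part of the patent record): the shift-based path described in this abstract can be modelled on Python integers. The abstract does not define "near a power of two"; this sketch assumes it means exactly one above or below a power of two (2**k + 1 or 2**k - 1). The function names are hypothetical, and treating the third source operand as the accumulator is also an assumption.

```python
def is_power_of_two(x):
    """True for 1, 2, 4, 8, ... and False otherwise."""
    return x > 0 and (x & (x - 1)) == 0


def near_multiply(a, b):
    """Multiply two integers, using shifts when either operand is a
    power of two or (by assumption) within 1 of a power of two."""
    for x, y in ((a, b), (b, a)):
        if is_power_of_two(y):        # y == 2**k
            return x << (y.bit_length() - 1)
        if is_power_of_two(y + 1):    # y == 2**k - 1
            return (x << ((y + 1).bit_length() - 1)) - x
        if is_power_of_two(y - 1):    # y == 2**k + 1
            return (x << ((y - 1).bit_length() - 1)) + x
    # Neither operand qualifies: fall back to a full multiplication.
    return a * b


def near_multiply_accumulate(acc, a, b):
    """Add the (possibly shift-based) product to an accumulator."""
    return acc + near_multiply(a, b)


print(near_multiply_accumulate(10, 6, 8))   # 6 << 3 = 48, prints 58
print(near_multiply_accumulate(10, 6, 7))   # (6 << 3) - 6 = 42, prints 52
print(near_multiply_accumulate(10, 6, 22))  # full multiply, prints 142
```

With this definition of "near", the shift path always gives the exact product; whether the patented instruction is exact or approximate for near-power-of-two operands is not stated in the abstract.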

8.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR DUAL COMPLEX MULTIPLY ADD OF SIGNED WORDS (In force)

Publication (announcement) number: US20220107804A1

Publication (announcement) date: 2022-04-07

Application number: US17509917

Application date: 2021-10-25

Applicant: Intel Corporation

Inventor: Elmoustapha OULD-AHMED-VALL, Venkateswara R. MADDURI, Mark J. CHARNEY, Robert VALENTINE

IPC: G06F9/30, G06F7/544

Abstract: Embodiments of systems, apparatuses, and methods for dual complex number multiplication and addition in a processor are described. For example, execution circuitry executes a decoded instruction to multiplex data values from positions in source operands to a multiplier, the source operands including pairs of complex numbers, calculate a real part of a product of each pair of complex numbers, add the real part of the product of a first pair of complex numbers to the real part of the product of a second pair of complex numbers to calculate a first real result, and add the real part of the product of a third pair of complex numbers to the real part of the product of a fourth pair of complex numbers to calculate a second real result, and store the results to corresponding positions in the destination operand.
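
Illustrative sketch (not part of the patent record): the arithmetic in this abstract follows from the identity Re((a + bi)(c + di)) = ac - bd. The operand layout below, four (real, imaginary) integer pairs per source, and the function names are assumptions for illustration; word-size limits, rounding, and saturation are not modelled.

```python
def real_of_product(x, y):
    """Real part of (a + bi) * (c + di), with x = (a, b) and y = (c, d)."""
    (a, b), (c, d) = x, y
    return a * c - b * d


def dual_complex_multiply_add_real(src1, src2):
    """src1 and src2 each hold four (re, im) pairs (assumed layout).

    Returns two real results: the sum of the real parts of products
    1 and 2, and the sum of the real parts of products 3 and 4.
    """
    reals = [real_of_product(x, y) for x, y in zip(src1, src2)]
    return reals[0] + reals[1], reals[2] + reals[3]


src1 = [(1, 2), (3, 4), (5, 6), (7, 8)]
src2 = [(1, 1), (1, 1), (2, 0), (0, 2)]
print(dual_complex_multiply_add_real(src1, src2))  # prints (-2, -6)
```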

9.

Invention application
AUTONOMOUS VEHICLE ADVANCED SENSING AND RESPONSE (In force)

Publication (announcement) number: US20220084329A1

Publication (announcement) date: 2022-03-17

Application number: US17539083

Application date: 2021-11-30

Applicant: Intel Corporation

Inventor: Barath LAKSHAMANAN, Linda L. HURD, Ben J. ASHBAUGH, Elmoustapha OULD-AHMED-VALL, Liwei MA, Jingyi JIN, Justin E. GOTTSCHLICH, Chandrasekaran SAKTHIVEL, Michael S. STRICKLAND, Brian T. LEWIS, Lindsey KUPER, Altug KOKER, Abhishek R. APPU, Prasoonkumar SURTI, Joydeep RAY, Balaji VEMBU, Javier S. TUREK, Naila FAROOQUI

IPC: G07C5/00, G05D1/00, G08G1/01, H04W28/08, H04L29/08, G06N20/00, G06F9/50, G01C21/34, B60W30/00, G06N3/04, G06N3/063, G06N3/08, G06N20/10

Abstract: An autonomous vehicle is provided that includes one or more processors configured to provide a local compute manager to manage execution of compute workloads associated with the autonomous vehicle. The local compute manager can perform various compute operations, including receiving offload of compute operations from other compute nodes and offloading compute operations to other compute nodes, where the other compute nodes can be other autonomous vehicles. The local compute manager can also facilitate autonomous navigation functionality.

10.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR CONTROLLABLE SINE AND/OR COSINE OPERATIONS (In force)

Publication (announcement) number: US20220035630A1

Publication (announcement) date: 2022-02-03

Application number: US17346891

Application date: 2021-06-14

Applicant: Intel Corporation

Inventor: Venkateswara R. MADDURI, Elmoustapha OULD-AHMED-VALL, Robert VALENTINE, Jesus CORBAL, Mark J. CHARNEY, Carl MURRAY, Milind GIRKAR, Bret TOLL

IPC: G06F9/30, G06F7/548

Abstract: Embodiments of systems, apparatuses, and methods for performing vector-packed controllable sine and/or cosine operations in a processor are described. For example, execution circuitry executes a decoded instruction to compute at least a real output value and an imaginary output value based on at least a cosine calculation and a sine calculation, the cosine and sine calculations each based on an index value from a packed data source operand, add the index value with an index increment value from the packed data source operand to create an updated index value, and store the real output value, the imaginary output value, and the updated index value to a packed data destination operand.
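
Illustrative sketch (not part of the patent record): one step of the index-driven sine/cosine operation summarized in this abstract might look like the following. The abstract does not say how an index maps to an angle, so the mapping angle = 2*pi*index/table_size, the `table_size` parameter, and the function name are all assumptions.

```python
import math


def sincos_step(index, increment, table_size=1024):
    """Return (real, imaginary, updated_index) for one element.

    Assumes the index selects one of table_size equally spaced angles
    on the unit circle; the patent may define the mapping differently.
    """
    angle = 2.0 * math.pi * (index % table_size) / table_size
    real = math.cos(angle)       # real output from the cosine calculation
    imag = math.sin(angle)       # imaginary output from the sine calculation
    return real, imag, index + increment


real, imag, next_index = sincos_step(index=256, increment=4)
print(round(real, 6), round(imag, 6), next_index)
```

Feeding `next_index` back in as the next `index` walks around the unit circle in steps of `increment`, which is one plausible reading of why the updated index is stored alongside the outputs.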
