Patent search ap:("INTEL CORPORATION") AND inv:"Elmoustapha OULD-AHMED-VALL" Page 7

61.

Patent application

SYSTEMS, METHODS, AND APPARATUSES FOR DOT PRODUCTION OPERATIONS

Status: Granted

Publication (announcement) number: US20210132943A1

Publication (announcement) date: 2021-05-06

Application number: US16486960

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Menachem ADELMAN, Barukh ZIV, Alexander HEINECKE, Simon RUBANOVICH

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions including computing a dot product of signed words and accumulating in a double word with saturation; computing a dot product of bytes and accumulating in to a dword with saturation, where the input bytes can be signed or unsigned and the dword accumulation has output saturation; etc.
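
The first operation named in this abstract (a dot product of signed words accumulated into a doubleword with saturation) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the choice to saturate once after the full sum, rather than after each partial product, is an assumption that the abstract does not settle.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def saturate_i32(value):
    """Clamp an arbitrary Python int to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def dot_words_accumulate(acc, a_words, b_words):
    """Add the dot product of two signed 16-bit sequences to a 32-bit accumulator.

    Assumption: saturation is applied once, after the full sum is formed.
    """
    if len(a_words) != len(b_words):
        raise ValueError("sources must have the same number of elements")
    total = acc + sum(x * y for x, y in zip(a_words, b_words))
    return saturate_i32(total)
```

For example, `dot_words_accumulate(10, [1, 2], [3, 4])` adds 1*3 + 2*4 to the accumulator, and a result that would exceed the signed 32-bit range is clamped to `INT32_MAX` or `INT32_MIN` instead of wrapping.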

62.

Patent application

DEEP LEARNING IMPLEMENTATIONS USING SYSTOLIC ARRAYS AND FUSED OPERATIONS

Status: Granted

Publication (announcement) number: US20210089316A1

Publication (announcement) date: 2021-03-25

Application number: US16582433

Application date: 2019-09-25

Applicant: Intel Corporation

Inventor: William RASH, Subramaniam MAIYURAN, Varghese GEORGE, Bret L. TOLL, Rajesh SANKARAN, Robert S. CHAPPELL, Supratim PAL, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Gang CHEN

IPC: G06F9/38, G06N3/04, G06N3/08, G06F15/80

Abstract: Disclosed embodiments relate to deep learning implementations using systolic arrays and fused operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of a destination and N source matrices, the opcode indicating the processor is to load the N source matrices from memory, perform N convolutions on the N source matrices to generate N feature maps, and store results of the N convolutions in registers to be passed to an activation layer, wherein the processor is to perform the N convolutions and the activation layer with at most one memory load of each of the N source matrices. The processor further includes scheduling circuitry to schedule execution of the instruction and execution circuitry to execute the instruction as per the opcode.

63.

Patent application

APPARATUS AND METHOD FOR NON-SPATIAL STORE AND SCATTER INSTRUCTIONS

Status: Under examination (published)

Publication (announcement) number: US20200210186A1

Publication (announcement) date: 2020-07-02

Application number: US16233418

Application date: 2018-12-27

Applicant: Intel Corporation

Inventor: Elmoustapha OULD-AHMED-VALL

IPC: G06F9/345, G06F9/30, G06F9/38, G06F12/0811

Abstract: Embodiments of systems, apparatuses, and methods for storing data elements in a processor are described. For example, execution circuitry executes a decoded instruction, the instruction having a first field identifying a location in main memory, a second field identifying a register storing a data element to be stored at the location in main memory, and an opcode to indicate to execution circuitry to store the data element at the location in main memory without storing the data element in a data cache of the processor, by storing the data element at the location in main memory without storing the data element in the data cache of the processor.

64.

Patent application

SYSTEMS AND METHODS FOR PERFORMING DUPLICATE DETECTION INSTRUCTIONS ON 2D DATA

Status: Under examination (published)

Publication (announcement) number: US20200210182A1

Publication (announcement) date: 2020-07-02

Application number: US16232931

Application date: 2018-12-26

Applicant: Intel Corporation

Inventor: Christopher J. HUGHES, Michael ESPIG, Dan BAUM, Robert VALENTINE, Bret TOLL, Elmoustapha OULD-AHMED-VALL

IPC: G06F9/30, G06F9/50, G06F17/16

Abstract: Disclosed embodiments relate to systems and methods for performing duplicate detection instructions on two-dimensional (2D) data. In one example, a processor includes fetch circuitry to fetch an instruction, decode circuitry to decode the fetched instruction having fields to specify an opcode and locations of a source matrix comprising M×N elements and a destination, the opcode to indicate execution circuitry is to use a plurality of comparators to discover duplicates in the source matrix, and store indications of locations of discovered duplicates in the destination. The execution circuitry to execute the decoded instruction as per the opcode.
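
As a rough software analogue of the behaviour this abstract describes, the sketch below scans an M×N matrix and records where duplicates occur. It is only an illustration: the abstract speaks of hardware comparators, while this version uses a set lookup, and the output encoding (a same-shaped matrix of booleans marking every repeat of an earlier value in row-major order) is an assumption, as is the name `find_duplicates`.

```python
def find_duplicates(matrix):
    """Return a same-shaped matrix of booleans.

    An entry is True when the value at that position already appeared
    earlier in row-major order (assumed encoding, not taken from the patent).
    """
    seen = set()
    flags = [[False] * len(row) for row in matrix]
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value in seen:
                flags[r][c] = True
            else:
                seen.add(value)
    return flags
```

With the source `[[1, 2], [2, 1]]`, both elements of the second row are flagged because their values first appear in the first row.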

65.

Patent application

SYSTEMS, METHODS, AND APPARATUSES FOR TILE DIAGONAL

Status: Under examination (published)

Publication (announcement) number: US20190339972A1

Publication (announcement) date: 2019-11-07

Application number: US16474483

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Robert VALENTINE, Dan BAUM, Zeev SPERBER, Jesus CORBAL, Elmoustapha OULD-AHMED-VALL, Bret L. TOLL, Mark J. CHARNEY, Alexander HEINECKE

IPC: G06F9/30

Abstract: Embodiments detailed herein relate to matrix operations. In particular, tile diagonal support is described. For example, a processor is detailed having decode circuitry to decode an instruction having fields for an opcode, a source operand identifier, and a destination matrix operand identifier; and execution circuitry to execute the decoded instruction to write the identified source operand to each element along a main diagonal of the identified destination matrix operand.
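
The effect described here, writing one source value to every element on the main diagonal of a destination matrix, can be sketched in a few lines of Python. The function name is hypothetical, and leaving off-diagonal elements unchanged is an assumption; the abstract does not say what happens to them.

```python
def write_main_diagonal(dest, value):
    """Return a copy of `dest` with `value` written along the main diagonal.

    Assumption: off-diagonal elements keep their previous contents.
    """
    result = [list(row) for row in dest]
    rows = len(result)
    cols = len(result[0]) if rows else 0
    for i in range(min(rows, cols)):
        result[i][i] = value
    return result
```

For a non-square destination, the diagonal stops at the shorter dimension, so a 2×3 matrix receives two writes.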

66.

Patent application

VECTOR FRIENDLY INSTRUCTION FORMAT AND EXECUTION THEREOF

Status: Under examination (published)

Publication (announcement) number: US20190227800A1

Publication (announcement) date: 2019-07-25

Application number: US16289506

Application date: 2019-02-28

Applicant: Intel Corporation

Inventor: Robert C. VALENTINE, Jesus Corbal SAN ADRIAN, Roger Espasa SANS, Robert D. CAVIN, Bret L. TOLL, Santiago Galan DURAN, Jeffrey G. WIEDEMEIER, Sridhar SAMUDRALA, Milind Baburao GIRKAR, Edward Thomas GROCHOWSKI, Jonathan Cannon HALL, Dennis R. BRADFORD, Elmoustapha OULD-AHMED-VALL, James C. ABEL, Mark CHARNEY, Seth ABRAHAM, Suleyman SAIR, Andrew Thomas FORSYTH, Lisa WU, Charles YOUNT

IPC: G06F9/30

CPC classification number: G06F9/30145, G06F9/3001, G06F9/30014, G06F9/30018, G06F9/30025, G06F9/30032, G06F9/30036, G06F9/30047, G06F9/30149, G06F9/30181, G06F9/30185, G06F9/30192, G06F9/34

Abstract: A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

67.

Patent application

APPARATUS AND METHOD FOR VECTOR HORIZONTAL LOGICAL INSTRUCTION

Status: Under examination (published)

Publication (announcement) number: US20190138303A1

Publication (announcement) date: 2019-05-09

Application number: US16110298

Application date: 2018-08-23

Applicant: Intel Corporation

Inventor: Elmoustapha OULD-AHMED-VALL, David GUILLEN FANDOS, Jesus F. SANCHEZ, Guillem SOLE, Roger ESPASA

IPC: G06F9/30, G06F9/34

Abstract: An apparatus and method are described for performing vector horizontal logical instruction. For example, one embodiment of a processor comprises: fetch logic to fetch an instruction from memory, and execution logic to determine a value of a first set of one or more data elements from a first specified set of bits of an immediate operand, wherein positions of the first set of one or more data elements determined from the first specified set of bits of the immediate operand are based on a first set of one or more index values that have a most significant bit corresponding to a packed data element at a first set of one or more positions of a destination packed data operand and that have a least significant bit corresponding to a data element at a corresponding position of a first source packed data operand.

68.

Patent application

SYSTEMS, APPARATUSES, AND METHODS FOR BLENDING TWO SOURCE OPERANDS INTO A SINGLE DESTINATION USING A WRITEMASK

Status: Under examination (published)

Publication (announcement) number: US20190108029A1

Publication (announcement) date: 2019-04-11

Application number: US16145156

Application date: 2018-09-27

Applicant: Intel Corporation

Inventor: Jesus CORBAL SAN ADRIAN, Bret L. TOLL, Robert C. VALENTINE, Jeffrey G. WIEDEMEIER, Sridhar SAMUDRALA, Milind Baburao GIRKAR, Andrew Thomas FORSYTH, Elmoustapha OULD-AHMED-VALL, Dennis R. BRADFORD, Lisa K. WU

IPC: G06F9/30

Abstract: Embodiments of systems, apparatuses, and methods for performing a blend instruction in a computer processor are described. In some embodiments, the execution of a blend instruction causes a data element-by-element selection of data elements of first and second source operands using the corresponding bit positions of a writemask as a selector between the first and second operands and storage of the selected data elements into the destination at the corresponding position in the destination.
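
The element-by-element selection in this abstract can be modelled with a short Python sketch. The name `blend` and the polarity (a set mask bit selects the second source, a clear bit the first) are assumptions for illustration; the abstract only says that each writemask bit acts as the selector for its position.

```python
def blend(writemask, src1, src2):
    """Select per element between two equal-length sources.

    Bit i of `writemask` chooses the source for element i.
    Assumption: 1 selects src2 and 0 selects src1.
    """
    if len(src1) != len(src2):
        raise ValueError("sources must have the same number of elements")
    return [
        b if (writemask >> i) & 1 else a
        for i, (a, b) in enumerate(zip(src1, src2))
    ]
```

With `writemask = 0b0101`, elements 0 and 2 come from the second source and elements 1 and 3 from the first.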

69.

Patent application

SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT

Status: Under examination (published)

Publication (announcement) number: US20190102196A1

Publication (announcement) date: 2019-04-04

Application number: US16147254

Application date: 2018-09-28

Applicant: Intel Corporation

Inventor: Raanan SADE, Robert VALENTINE, Bret TOLL, Christopher J. HUGHES, Alexander F. HEINECKE, Elmoustapha OULD-AHMED-VALL, Mark J. CHARNEY

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.

70.

Patent application

APPARATUS AND METHOD FOR SHIFTING AND EXTRACTING PACKED DATA ELEMENTS

Status: Under examination (published)

Publication (announcement) number: US20190102192A1

Publication (announcement) date: 2019-04-04

Application number: US15721444

Application date: 2017-09-29

Applicant: Intel Corporation

Inventor: Venkateswara MADDURI, Elmoustapha OULD-AHMED-VALL, Robert VALENTINE, Mark CHARNEY

IPC: G06F9/30

Abstract: An apparatus and method for performing right-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a right-shift instruction to generate a decoded right-shift instruction; a first source register to store a plurality of packed quadwords data elements; execution circuitry to execute the decoded right-shift instruction, the execution circuitry comprising shift circuitry to right-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second right-shifted quadwords; the execution circuitry to cause selection of a specified set of most significant bits of the first and second right-shifted quadwords to be written to least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the specified set of the most significant bits of the first and second right-shifted quadwords.
