Abstract:
An apparatus and method for performing left-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a left-shift instruction to generate a decoded left-shift instruction; a first source register to store a plurality of packed quadword data elements; execution circuitry to execute the decoded left-shift instruction, the execution circuitry comprising shift circuitry to left-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second left-shifted quadwords; the execution circuitry to cause selection of the 16 most significant bits of the first and second left-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the selected 16 most significant bits of the first and second left-shifted quadwords.
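A minimal C sketch of the described behavior for two quadword elements, assuming the shift count comes from an immediate; the function name is illustrative, and the treatment of destination bits above the low 16 is an assumption rather than a claim detail.

#include <stdint.h>

/* Hypothetical model: left-shift each quadword, then place its 16 most
 * significant bits into the 16 least significant bits of the matching
 * destination element (upper destination bits assumed zeroed here). */
void shift_select_high16(const uint64_t src[2], uint64_t dst[2], unsigned imm)
{
    for (int i = 0; i < 2; i++) {
        uint64_t shifted = src[i] << (imm & 63);  /* left-shift each quadword */
        dst[i] = shifted >> 48;                   /* 16 MSBs -> 16 LSBs of dst */
    }
}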
Abstract:
An apparatus and method are described for performing a vector broadcast and XORAND logical instruction. For example, one embodiment of a processor comprises: fetch logic to fetch an instruction from memory indicating a destination packed data operand, a first source packed data operand, a second source packed data operand, and an immediate operand, and execution logic to determine a bit in the second source packed data operand based on a position corresponding to the immediate operand, perform a bitwise AND between the first source packed data operand and the determined bit to generate an intermediate result, perform a bitwise XOR between the destination packed data operand and the intermediate result to generate a final result, and store the final result in a storage location indicated by the destination packed data operand.
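A hedged C sketch of the XORAND semantics for a single 64-bit lane, assuming the immediate selects a bit position within the second source and that the selected bit is broadcast across the lane before the AND; the function name is illustrative.

#include <stdint.h>

uint64_t xorand_lane(uint64_t dst, uint64_t src1, uint64_t src2, uint8_t imm)
{
    uint64_t bit  = (src2 >> (imm & 63)) & 1;  /* bit selected by the immediate */
    uint64_t mask = 0ULL - bit;                /* broadcast bit: 0 or all ones  */
    return dst ^ (src1 & mask);                /* XOR dst with (src1 AND bit)   */
}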
Abstract:
In one embodiment, a processor includes machine level instructions to compute a next point in a Z-order curve of a specified dimension for a specified coordinate. A processor decode unit is configured to decode an instruction having source and immediate operands that include a first z-curve index, the specified dimension, and the specified coordinate. A processor execution unit is configured to execute the decoded instruction to compute the coordinate of the next point by incrementing the coordinate value associated with the specified coordinate, generating a second z-curve index that includes the incremented coordinate.
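As a software illustration of the semantics (not the patented hardware), the sketch below deinterleaves the bits of the specified coordinate from a z-curve index of dim dimensions, increments it, and re-interleaves it; dim >= 1 and a 64-bit index are assumed, and the names are illustrative.

#include <stdint.h>

uint64_t z_next(uint64_t index, unsigned dim, unsigned coord)
{
    uint64_t value = 0, mask = 0;
    for (unsigned bit = 0, pos = coord; pos < 64; bit++, pos += dim) {
        value |= ((index >> pos) & 1) << bit;   /* deinterleave the coordinate  */
        mask  |= 1ULL << pos;                   /* bit positions it occupies    */
    }
    value++;                                    /* step to the next point       */
    uint64_t out = index & ~mask;
    for (unsigned bit = 0, pos = coord; pos < 64; bit++, pos += dim)
        out |= ((value >> bit) & 1) << pos;     /* re-interleave into the index */
    return out;
}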
Abstract:
Embodiments detailed herein relate to matrix operations. In particular, support for a matrix transpose instruction is detailed. In some embodiments, decode circuitry decodes an instruction having fields for an opcode, a source matrix operand identifier, and a destination matrix operand identifier, and execution circuitry executes the decoded instruction to transpose each row of elements of the identified source matrix operand into a corresponding column of the identified destination matrix operand.
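A minimal scalar model of the transpose semantics, assuming row-major storage of 32-bit elements; tile registers and configuration are abstracted away, and the names are illustrative.

#include <stdint.h>

void tile_transpose(const int32_t *src, int32_t *dst, int rows, int cols)
{
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            dst[c * rows + r] = src[r * cols + c];  /* row r becomes column r */
}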
Abstract:
Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices and broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in a subsequent multiplication.
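A hedged scalar sketch of the matching-nonzero multiply-accumulate for one output element: it builds NZ bitmasks for a row of the first matrix and a column of the second, then accumulates only positions where both are non-zero. The PE grid and the two-at-a-time broadcast are not modeled, k <= 64 is assumed, and __builtin_ctzll is a GCC/Clang builtin.

#include <stdint.h>

float sparse_mac(const float *a_row, const float *b_col, float acc, int k)
{
    uint64_t ma = 0, mb = 0;
    for (int i = 0; i < k; i++) {            /* generate the NZ bitmasks  */
        if (a_row[i] != 0.0f) ma |= 1ULL << i;
        if (b_col[i] != 0.0f) mb |= 1ULL << i;
    }
    for (uint64_t m = ma & mb; m; m &= m - 1) {
        int i = __builtin_ctzll(m);          /* next matching NZ position */
        acc += a_row[i] * b_col[i];          /* multiply and accumulate   */
    }
    return acc;
}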
Abstract:
Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, and execution circuitry to execute the decoded instruction on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice the number of bits of the fixed size, and to generate an unsigned fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.
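For concreteness, a sketch of one lane assuming 16-bit unsigned elements and round-half-up rounding (the rounding mode is an assumption; the abstract only says the most significant portion is rounded), with an illustrative function name:

#include <stdint.h>

uint16_t mul_round_high(uint16_t a, uint16_t b)
{
    uint32_t product = (uint32_t)a * b;            /* double-sized product   */
    return (uint16_t)((product + 0x8000u) >> 16);  /* rounded high half only */
}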
Abstract:
Embodiments detailed herein relate to matrix operations. In particular, the storing of a matrix (tile) to memory. For example, support for a store instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.
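A minimal model of the store semantics, assuming a row-major tile of bytes and that the destination memory information reduces to a base pointer plus a byte stride; names are illustrative.

#include <stdint.h>
#include <string.h>

void tile_store(const int8_t *tile, int rows, int row_bytes,
                uint8_t *base, long stride)
{
    for (int r = 0; r < rows; r++)  /* store each configured row */
        memcpy(base + (long)r * stride, tile + r * row_bytes, (size_t)row_bytes);
}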
Abstract:
Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix, either by packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or by using fewer bits to represent one or more elements and using the header to identify the matrix elements represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
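A hedged sketch of the first compress variant named above: pack non-zero elements together and record each one's source position in a header bitmask. At most 64 source elements are assumed so the header fits one 64-bit word; names are illustrative.

#include <stdint.h>

int compress_nz(const float *src, int n, float *packed, uint64_t *header)
{
    int count = 0;
    *header = 0;
    for (int i = 0; i < n; i++) {
        if (src[i] != 0.0f) {
            packed[count++] = src[i];  /* pack non-zero values together */
            *header |= 1ULL << i;      /* record the element's position */
        }
    }
    return count;                      /* number of elements packed     */
}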
Abstract:
A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field (which includes an alpha field and a beta field), and a data element width field, wherein the vector friendly instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the vector friendly instruction format in instruction streams.
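The bitfield struct below only illustrates the named fields; the widths and ordering are readability assumptions, not the actual encoding.

struct vector_friendly_insn {
    unsigned base_op    : 8;  /* base operation field                           */
    unsigned modifier   : 1;  /* modifier field                                 */
    unsigned alpha      : 1;  /* alpha part of the augmentation operation field */
    unsigned beta       : 3;  /* beta part of the augmentation operation field  */
    unsigned elem_width : 1;  /* data element width field                       */
};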
Abstract:
Embodiments detailed herein relate to matrix operations. For example, embodiments of instruction support for matrix (tile) dot product operations are detailed. Exemplary instructions include computing a dot product of signed words and accumulating into a doubleword with saturation, and computing a dot product of bytes and accumulating into a doubleword (dword) with saturation, where the input bytes can be signed or unsigned and the dword accumulation has output saturation.
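A sketch of the signed-word variant for one accumulator lane: two 16-bit by 16-bit products are summed into a 32-bit accumulator with saturation. Standard int32 bounds are assumed for the saturation, and the function name is illustrative.

#include <stdint.h>

int32_t dpwssd_sat(int32_t acc, int16_t a0, int16_t b0, int16_t a1, int16_t b1)
{
    int64_t sum = (int64_t)acc + (int32_t)a0 * b0 + (int32_t)a1 * b1;
    if (sum > INT32_MAX) sum = INT32_MAX;  /* saturate on overflow  */
    if (sum < INT32_MIN) sum = INT32_MIN;  /* saturate on underflow */
    return (int32_t)sum;
}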