Patent search ap:("INTEL CORPORATION") AND inv:"Raanan SADE" Page 3

21.

Invention application
EFFICIENT IMPLEMENTATION OF COMPLEX VECTOR FUSED MULTIPLY ADD AND COMPLEX VECTOR MULTIPLY (Status: under examination, published)

Publication (announcement) number: US20190303142A1

Publication (announcement) date: 2019-10-03

Application number: US15941531

Application date: 2018-03-30

Applicant: Intel Corporation

Inventor: Raanan SADE , Thierry PONS , Amit GRADSTEIN , Zeev SPERBER , Mark J. CHARNEY , Robert VALENTINE , Eyal Oz-Sinay

IPC: G06F9/30 , G06F17/16 , G06F9/38

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
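The two-pass scheme in the abstract above can be sketched in Python. This is an illustrative model only, not the patented circuit: it assumes each complex vector is a flat list with real parts at even indices and imaginary parts at odd indices, and the function name complex_fma is hypothetical. Python arithmetic is not fused, so the single-rounding behaviour of a hardware FMA is not modelled.

```python
def complex_fma(a, b, c):
    """Compute a*b + c for interleaved (real, imag) vectors in two passes.

    a = multiplier, b = multiplicand, c = summand. Layout is an assumption:
    even index = real part, odd index = imaginary part.
    """
    n = len(a)

    # Pass 1: duplicate the even (real) elements of the multiplicand,
    # then temp = a * double_even + c.
    double_even = [b[i - (i % 2)] for i in range(n)]
    temp = [a[i] * double_even[i] + c[i] for i in range(n)]

    # Pass 2: duplicate the odd (imaginary) elements of the multiplicand,
    # swap even/odd elements of the multiplier, negate the even products,
    # and accumulate onto temp.
    double_odd = [b[i - (i % 2) + 1] for i in range(n)]
    swapped = [a[i ^ 1] for i in range(n)]
    result = []
    for i in range(n):
        product = swapped[i] * double_odd[i]
        if i % 2 == 0:
            product = -product
        result.append(product + temp[i])
    return result


# (1+2j)*(5+6j) + (9+10j) and (3+4j)*(7+8j) + (11+12j)
print(complex_fma([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]))
```

The even lane ends up holding a_re*b_re - a_im*b_im + c_re and the odd lane a_im*b_re + a_re*b_im + c_im, which is the usual complex multiply-add.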

22.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS SPECIFYING TERNARY TILE LOGIC OPERATIONS (Status: under examination, published)

Publication (announcement) number: US20190042260A1

Publication (announcement) date: 2019-02-07

Application number: US16131376

Application date: 2018-09-14

Applicant: Intel Corporation

Inventor: Elmoustapha OULD-AHMED-VALL , Christopher J. HUGHES , Bret TOLL , Dan BAUM , Raanan SADE , Robert VALENTINE , Mark J. CHARNEY , Alexander F. HEINECKE

IPC: G06F9/38 , G06F17/16 , G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions specifying ternary tile operations. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction specifying a ternary tile operation, and locations of destination and first, second, and third source matrices, each of the matrices having M rows by N columns; and execution circuitry to respond to the decoded instruction by, for each equal-sized group of K elements of the specified first, second, and third source matrices, generating K results by performing the ternary tile operation in parallel on K corresponding elements of the specified first, second, and third source matrices, and storing each of the K results to a corresponding element of the specified destination matrix, wherein corresponding elements of the specified source and destination matrices occupy a same relative position within their associated matrix.
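A rough Python model of the element-wise behaviour described in this abstract follows. The names ternary_tile_op and bitwise_select are hypothetical, the choice of a bitwise select as the three-input operation is only an example, and the grouping into K elements is simulated sequentially rather than in parallel. It assumes M*N is a multiple of K.

```python
def ternary_tile_op(op, src1, src2, src3, k=4):
    """Apply a three-input operation element-wise to three M x N matrices.

    Elements are processed in groups of k to mirror the abstract's
    "equal-sized group of K elements"; real hardware would handle a
    group in parallel, which this sketch does not attempt to model.
    """
    rows, cols = len(src1), len(src1[0])
    dst = [[0] * cols for _ in range(rows)]
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    for start in range(0, len(positions), k):
        for r, c in positions[start:start + k]:
            # Each result lands at the same relative position as its inputs.
            dst[r][c] = op(src1[r][c], src2[r][c], src3[r][c])
    return dst


def bitwise_select(a, b, c):
    """Example ternary logic function: take bits of b where a is 1, else c."""
    return (a & b) | (~a & c)


tile = ternary_tile_op(
    bitwise_select,
    [[12, 0], [15, 12]],
    [[10, 10], [10, 10]],
    [[6, 6], [6, 6]],
)
print(tile)
```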

23.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE LOAD (Status: in force)

Publication (announcement) number: US20250004716A1

Publication (announcement) date: 2025-01-02

Application number: US18654951

Application date: 2024-05-03

Applicant: Intel Corporation

Inventor: Robert VALENTINE , Menachem ADELMAN , Milind B. GIRKAR , Zeev SPERBER , Mark J. CHARNEY , Bret L. TOLL , Rinat RAPPOPORT , Jesus CORBAL , Stanislav SHWARTSMAN , Dan BAUM , Igor YANOVER , Alexander F. HEINECKE , Barukh ZIV , Elmoustapha OULD-AHMED-VALL , Yuri GEBIL , Raanan SADE

IPC: G06F7/485 , G06F7/487 , G06F7/76 , G06F9/30 , G06F9/38 , G06F17/16

Abstract: Embodiments detailed herein relate to matrix operations, in particular the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in the form of decode circuitry to decode an instruction having fields for an opcode, a destination matrix operand identifier, and source memory information, and execution circuitry to execute the decoded instruction to load groups of strided data elements from memory into configured rows of the identified destination matrix operand.
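The strided row load described in this abstract can be illustrated with a small Python sketch. The function name tile_load and its parameters are hypothetical, memory is modelled as a flat list, and the stride is counted in elements rather than bytes; none of these details come from the listing.

```python
def tile_load(memory, base, stride, rows, cols):
    """Load a rows x cols tile from flat memory.

    Row r is taken from memory[base + r * stride : base + r * stride + cols],
    so consecutive tile rows are `stride` elements apart in memory.
    """
    tile = []
    for r in range(rows):
        start = base + r * stride
        tile.append(memory[start:start + cols])
    return tile


memory = list(range(32))
print(tile_load(memory, base=2, stride=8, rows=3, cols=4))
```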

24.

Invention publication
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX OPERATIONS (Status: under examination, published)

Publication (announcement) number: US20240192954A1

Publication (announcement) date: 2024-06-13

Application number: US18444254

Application date: 2024-02-16

Applicant: Intel Corporation

Inventor: Robert VALENTINE , Mark J. CHARNEY , Elmoustapha OULD-AHMED-VALL , Dan BAUM , Zeev SPERBER , Jesus CORBAL , Bret L. TOLL , Raanan SADE , Igor YANOVER , Yuri GEBIL , Rinat RAPPOPORT , Stanislav SHWARTSMAN , Menachem ADELMAN , Simon RUBANOVICH

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F7/76 , G06F9/38 , G06F17/16

CPC classification number: G06F9/30036 , G06F7/485 , G06F7/4876 , G06F7/762 , G06F9/3001 , G06F9/30032 , G06F9/30043 , G06F9/30109 , G06F9/30112 , G06F9/30134 , G06F9/30145 , G06F9/30149 , G06F9/3016 , G06F9/30185 , G06F9/30196 , G06F9/3818 , G06F9/3836 , G06F17/16 , G06F2212/454

Abstract: Embodiments detailed herein relate to matrix (tile) operations. For example, decode circuitry to decode an instruction having fields for an opcode and a memory address; and execution circuitry to execute the decoded instruction to set a tile configuration for the processor to utilize tiles in matrix operations based on a description retrieved from the memory address, wherein a tile is a set of 2-dimensional registers, are discussed.

25.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS (Status: under examination, published)

Publication (announcement) number: US20240126545A1

Publication (announcement) date: 2024-04-18

Application number: US18397664

Application date: 2023-12-27

Applicant: Intel Corporation

Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/3016 , G06F9/3802

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.
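The flow in this abstract (as listed, it describes nibble dot products) can be sketched in Python. Several details are assumptions not stated in the listing: nibbles are treated as unsigned 4-bit values, the accumulator saturates to the signed 32-bit range, and saturation is applied after each of the K steps. The helper names are hypothetical.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def nibbles(dword):
    """Split a 32-bit doubleword into eight 4-bit values, lowest nibble first."""
    return [(dword >> (4 * i)) & 0xF for i in range(8)]


def saturate32(value):
    """Clamp to the signed 32-bit range (assumed saturation behaviour)."""
    return max(INT32_MIN, min(INT32_MAX, value))


def tile_nibble_dot(dst, a, b):
    """dst (M x N) += nibble-wise dot products of a (M x K) and b (K x N)."""
    m_rows, k_dim, n_cols = len(a), len(b), len(b[0])
    for m in range(m_rows):
        for n in range(n_cols):
            acc = dst[m][n]
            for k in range(k_dim):
                # Eight products per step, one for each pair of nibbles.
                products = [x * y for x, y in zip(nibbles(a[m][k]), nibbles(b[k][n]))]
                acc = saturate32(acc + sum(products))
            dst[m][n] = acc
    return dst


# Nibbles of 0x21 are (1, 2, 0, ...), of 0x43 are (3, 4, 0, ...): 1*3 + 2*4 = 11
print(tile_nibble_dot([[5]], [[0x00000021]], [[0x00000043]]))
```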

26.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING MATRIX COMPRESS AND DECOMPRESS INSTRUCTIONS (Status: under examination, published)

Publication (announcement) number: US20240045690A1

Publication (announcement) date: 2024-02-08

Application number: US18460497

Application date: 2023-09-01

Applicant: Intel Corporation

Inventor: Dan BAUM , Michael ESPIG , James GUILFORD , Wajdi K. FEGHALI , Raanan SADE , Christopher J. HUGHES , Robert VALENTINE , Bret TOLL , Elmoustapha OULD-AHMED-VALL , Mark J. CHARNEY , Vinodh GOPAL , Ronen ZOHAR , Alexander F. HEINECKE

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30178 , G06F9/30145 , G06F9/30036 , G06F9/3013 , G06F9/3802

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
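The first compression option in this abstract (packing non-zero elements and recording their positions in a header) can be sketched as follows. The dictionary layout, the (row, column) header entries, and the function names compress and decompress are all assumptions for illustration; the listing does not specify an encoding.

```python
def compress(matrix):
    """Pack non-zero elements together and record each one's position."""
    header, values = [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                header.append((r, c))
                values.append(value)
    return {"shape": (len(matrix), len(matrix[0])), "header": header, "values": values}


def decompress(packed):
    """Rebuild the dense matrix from the packed values and the header."""
    rows, cols = packed["shape"]
    matrix = [[0] * cols for _ in range(rows)]
    for (r, c), value in zip(packed["header"], packed["values"]):
        matrix[r][c] = value
    return matrix


dense = [[0, 5, 0], [7, 0, 0]]
packed = compress(dense)
print(packed["header"], packed["values"])
print(decompress(packed) == dense)
```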

27.

Invention publication
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS (Status: under examination, published)

Publication (announcement) number: US20230236834A1

Publication (announcement) date: 2023-07-27

Application number: US18190761

Application date: 2023-03-27

Applicant: Intel Corporation

Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/3016 , G06F9/3802

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

28.

Invention application
APPARATUSES, METHODS, AND SYSTEMS TO PRECISELY MONITOR MEMORY STORE ACCESSES (Status: in force)

Publication (announcement) number: US20230082290A1

Publication (announcement) date: 2023-03-16

Application number: US17862708

Application date: 2022-07-12

Applicant: Intel Corporation

Inventor: Ahmad YASIN , Raanan SADE , Liron ZUR , Igor YANOVER , Joseph NUZMAN

IPC: G06F9/30 , G06F11/30 , G06F9/54 , G06F11/34

Abstract: Systems, methods, and apparatuses relating to circuitry to precisely monitor memory store accesses are described. In one embodiment, a system includes a memory, a hardware processor core comprising a decoder to decode an instruction into a decoded instruction, an execution circuit to execute the decoded instruction to produce a resultant, a store buffer, and a retirement circuit to retire the instruction when a store request for the resultant from the execution circuit is queued into the store buffer for storage into the memory, and a performance monitoring circuit to mark the retired instruction for monitoring of post-retirement performance information between being queued in the store buffer and being stored in the memory, enable a store fence after the retired instruction to be inserted that causes previous store requests to complete within the memory, and on detection of completion of the store request for the instruction in the memory, store the post-retirement performance information in storage of the performance monitoring circuit.

29.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSFORM MATRICES INTO ROW-INTERLEAVED FORMAT (Status: in force)

Publication (announcement) number: US20210216323A1

Publication (announcement) date: 2021-07-15

Application number: US17216635

Application date: 2021-03-29

Applicant: Intel Corporation

Inventor: Raanan SADE , Robert VALENTINE , Bret TOLL , Christopher J. HUGHES , Alexander F. HEINECKE , Elmoustapha OULD-AHMED-VALL , Mark J. CHARNEY

IPC: G06F9/30

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transform matrices into a row-interleaved format. In one example, a processor includes fetch and decode circuitry to fetch and decode an instruction having fields to specify an opcode and locations of source and destination matrices, wherein the opcode indicates that the processor is to transform the specified source matrix into the specified destination matrix having the row-interleaved format; and execution circuitry to respond to the decoded instruction by transforming the specified source matrix into the specified RowInt-formatted destination matrix by interleaving J elements of each J-element sub-column of the specified source matrix in either row-major or column-major order into a K-wide submatrix of the specified destination matrix, the K-wide submatrix having K columns and enough rows to hold the J elements.
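One possible reading of the row-interleaved transform in this abstract is sketched below. It is an interpretation, not the patent's definition: each column of the source is split into sub-columns of J consecutive elements, and each sub-column is written in row-major order into a block that is K columns wide and J/K rows tall. The function name to_row_interleaved is hypothetical, and the sketch assumes J divides the row count and K divides J.

```python
def to_row_interleaved(src, j, k):
    """Rearrange an M x N matrix so each J-element sub-column fills a K-wide block."""
    m, n = len(src), len(src[0])
    assert m % j == 0 and j % k == 0, "sketch assumes J divides M and K divides J"
    block_rows = j // k  # rows needed to hold J elements in K columns
    dst = [[0] * (n * k) for _ in range((m // j) * block_rows)]
    for block in range(m // j):
        for col in range(n):
            for e in range(j):
                value = src[block * j + e][col]
                # Row-major fill inside the K-wide block for this sub-column.
                dst[block * block_rows + e // k][col * k + e % k] = value
    return dst


source = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(to_row_interleaved(source, j=2, k=2))
```

With J = K = 2, pairs of vertically adjacent elements end up side by side in one destination row.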

30.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT (Status: in force)

Publication (announcement) number: US20210124581A1

Publication (announcement) date: 2021-04-29

Application number: US17133255

Application date: 2020-12-23

Applicant: Intel Corporation

Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
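A single-precision to 16-bit conversion of the kind this abstract describes can be sketched in Python. The listing does not name the 16-bit format or the rounding mode, so this sketch assumes bfloat16 (the upper 16 bits of an IEEE 754 single) with round-to-nearest-even; the function names are hypothetical.

```python
import struct


def float32_to_bf16_bits(x):
    """Convert a float to bfloat16 bits (assumed format), rounding to nearest even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep NaN a NaN after truncation by forcing a mantissa bit on.
        return (bits >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(h):
    """Widen bfloat16 bits back to a Python float for inspection."""
    return struct.unpack("<f", struct.pack("<I", h << 16))[0]


def convert_vector(source):
    """Element-wise conversion of N single-precision values."""
    return [float32_to_bf16_bits(x) for x in source]


print([hex(h) for h in convert_vector([1.0, -2.0, 1.00390625])])
```

The third input, 1 + 2^-8, lies exactly halfway between two bfloat16 values and rounds down to the even neighbour.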
