Patent search ap:("INTEL CORPORATION") AND inv:"Zeev SPERBER" Page 6

51.

Invention application
BFLOAT16 FUSED MULTIPLY INSTRUCTIONS (Granted)

Publication No.: US20230067810A1

Publication date: 2023-03-02

Application No.: US17463405

Application date: 2021-08-31

Applicant: Intel Corporation

Inventor: Alexander HEINECKE , Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON

IPC: G06F9/30 , G06F7/544

Abstract: Techniques for performing BF16 FMA in response to an instruction are described. In some examples, an instruction has fields for an opcode, an identification of a location of a packed data source/destination operand (a first source), an identification of a location of a second packed data source operand, an identification of a location of a third packed data source operand, and an identification of a location of a packed data source/destination operand, wherein the opcode is to indicate operand ordering and that execution circuitry is to, per data element position, perform a BF16 value fused multiply-accumulate operation using the first, second, and third source operands and store a result in a corresponding data element position of the source/destination operand.
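The per-element behavior described in this abstract can be illustrated with a small software sketch. It is not the claimed circuitry: the function names `to_bf16` and `bf16_fma` are hypothetical, the operand ordering (src1 * src2 + src3) is an assumption, and the intermediate value is held in double precision and rounded once, which ignores rare double-rounding corner cases.

```python
import math
import struct

def to_bf16(x: float) -> float:
    """Round x to the nearest BF16 value, ties to even (hypothetical helper)."""
    if math.isnan(x):
        return x
    # struct.pack raises OverflowError for finite values beyond the FP32 range.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)   # round to nearest, ties to even
    bits &= 0xFFFF0000                    # keep sign, 8 exponent bits, 7 fraction bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def bf16_fma(src1, src2, src3):
    """Per element: src1 * src2 + src3, rounded to BF16 once at the end."""
    return [to_bf16(a * b + c) for a, b, c in zip(src1, src2, src3)]

print(bf16_fma([1.0, 2.0], [3.0, 0.5], [0.5, 1.0]))  # [3.5, 2.0]
```

The single rounding at the end is what distinguishes a fused operation from a separate multiply followed by an add, each with its own rounding.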

52.

Invention application
BFLOAT16 SQUARE ROOT AND/OR RECIPROCAL SQUARE ROOT INSTRUCTIONS (Granted)

Publication No.: US20230061618A1

Publication date: 2023-03-02

Application No.: US17463374

Application date: 2021-08-31

Applicant: Intel Corporation

Inventor: Menachem ADELMAN , Alexander HEINECKE , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON

IPC: G06F7/552 , G06F9/30

Abstract: Techniques for performing square root or reciprocal square root calculations on BF16 data elements in response to an instruction are described. An example of an instruction is one that includes fields for an opcode, an identification of a location of a packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a calculation of a square root value of a BF16 data element in that position and store a result of each square root into a corresponding data element position of the packed data destination operand.
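As a rough software analogue of the element-wise operation in this abstract, the sketch below applies a square root (and, per the title, a reciprocal square root) to each BF16 element of a source list. The names `to_bf16`, `bf16_sqrt`, and `bf16_rsqrt` are hypothetical, and returning NaN for negative inputs and infinity for a zero reciprocal input is an assumption modeled on common IEEE 754 behavior, not something the abstract states.

```python
import math
import struct

def to_bf16(x: float) -> float:
    """Round x to the nearest BF16 value, ties to even (hypothetical helper)."""
    if math.isnan(x):
        return x
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def bf16_sqrt(src):
    """Square root of each element, stored at the same position in the result."""
    return [to_bf16(math.sqrt(x)) if x >= 0 else math.nan for x in src]

def bf16_rsqrt(src):
    """Reciprocal square root of each element (assumed edge-case handling)."""
    out = []
    for x in src:
        if x > 0:
            out.append(to_bf16(1.0 / math.sqrt(x)))
        else:
            out.append(math.inf if x == 0 else math.nan)
    return out

print(bf16_sqrt([4.0, 9.0, 2.0]))   # [2.0, 3.0, 1.4140625]
print(bf16_rsqrt([4.0, 0.25]))      # [0.5, 2.0]
```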

53.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR MATRIX ADD, SUBTRACT, AND MULTIPLY (Granted)

Publication No.: US20220171623A1

Publication date: 2022-06-02

Application No.: US17548214

Application date: 2021-12-10

Applicant: Intel Corporation

Inventor: Robert VALENTINE , Dan BAUM , Zeev SPERBER , Jesus CORBAL , Elmoustapha OULD-AHMED-VALL , Bret L. TOLL , Mark J. CHARNEY , Barukh ZIV , Alexander HEINECKE , Milind GIRKAR , Simon RUBANOVICH

IPC: G06F9/30 , G06F7/485 , G06F7/487 , G06F17/16 , G06F7/76 , G06F9/38

Abstract: Embodiments detailed herein relate to matrix operations. In particular, support for matrix (tile) addition, subtraction, and multiplication is described. For example, circuitry to support instructions for element-by-element matrix (tile) addition, subtraction, and multiplication are detailed. In some embodiments, for matrix (tile) addition, decode circuitry is to decode an instruction having fields for an opcode, a first source matrix operand identifier, a second source matrix operand identifier, and a destination matrix operand identifier; and execution circuitry is to execute the decoded instruction to, for each data element position of the identified first source matrix operand: add a first data value at that data element position to a second data value at a corresponding data element position of the identified second source matrix operand, and store a result of the addition into a corresponding data element position of the identified destination matrix operand.
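The element-by-element tile addition in this abstract amounts to the following, shown here as a minimal sketch on nested Python lists. The function name `tile_add` and the shape check are assumptions for illustration; element types, saturation, and tile configuration are not modeled.

```python
def tile_add(src1, src2):
    """Add two tiles of identical shape, one data element position at a time."""
    if len(src1) != len(src2) or any(len(a) != len(b) for a, b in zip(src1, src2)):
        raise ValueError("source tiles must have the same dimensions")
    return [[x + y for x, y in zip(row1, row2)] for row1, row2 in zip(src1, src2)]

dst = tile_add([[1, 2], [3, 4]], [[10, 20], [30, 40]])
print(dst)  # [[11, 22], [33, 44]]
```

Subtraction and element-wise multiplication would follow the same pattern with a different operator.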

54.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT (Granted)

Publication No.: US20210124581A1

Publication date: 2021-04-29

Application No.: US17133255

Application date: 2020-12-23

Applicant: Intel Corporation

Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.
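To make "truncation and rounding" concrete, the sketch below converts single-precision values to 16-bit patterns. It assumes the 16-bit target is BF16 (the abstract only says "16-bit floating-point"), and the names `fp32_to_bf16_bits` and `convert_vector` are hypothetical. Rounding is to nearest with ties to even; the `truncate` flag simply drops the low 16 bits instead.

```python
import math
import struct

def fp32_to_bf16_bits(x: float, truncate: bool = False) -> int:
    """Return the 16-bit BF16 pattern for x (assumed target format)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]   # FP32 bit pattern
    if math.isnan(x):
        return (bits >> 16) | 0x0040       # stay NaN, force the quiet bit
    if truncate:
        return bits >> 16                   # drop the low 16 fraction bits
    return (bits + 0x7FFF + ((bits >> 16) & 1)) >> 16     # nearest, ties to even

def convert_vector(src, truncate: bool = False):
    """Convert each source element and keep it at the same position."""
    return [fp32_to_bf16_bits(x, truncate) for x in src]

print([hex(v) for v in convert_vector([1.0, -2.0, 1.00390625, 1.01171875])])
# ['0x3f80', '0xc000', '0x3f80', '0x3f82']
```

The last two inputs sit exactly halfway between two BF16 values, so they show the tie going to the even neighbor in each direction.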

55.

Invention application
SYSTEMS AND METHODS TO PERFORM FLOATING-POINT ADDITION WITH SELECTED ROUNDING (Under examination, published)

Publication No.: US20200310756A1

Publication date: 2020-10-01

Application No.: US16370966

Application date: 2019-03-30

Applicant: Intel Corporation

Inventor: Simon RUBANOVICH , Amit GRADSTEIN , Zeev SPERBER , Mrinmay DUTTA

IPC: G06F7/499 , G06F7/483 , G06F9/38 , G06F17/16

Abstract: Disclosed embodiments relate to performing floating-point addition with selected rounding. In one example, a processor includes circuitry to decode and execute an instruction specifying locations of first and second floating-point (FP) sources, and an opcode indicating the processor is to: bring the FP sources into alignment by shifting a mantissa of the smaller source FP operand to the right by a difference between their exponents, generating rounding controls based on any bits that escape; simultaneously generate a sum of the FP sources and of the FP sources plus one, the sums having a fuzzy-Jbit format having an additional Jbit into which a carry-out, if any, select one of the sums based on the rounding controls, and generate a result comprising a mantissa-wide number of most-significant bits of the selected sum, starting with the most significant non-zero Jbit.
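The align, add, and select steps can be sketched with integer mantissas. This is a simplified illustration rather than the datapath of the application: it handles positive operands only, supports round-to-nearest-even only, computes the two candidate results one after the other instead of simultaneously, and treats a carry-out with an explicit renormalization. The function name `fp_add_rne` and the 8-bit default mantissa width are assumptions.

```python
def fp_add_rne(ma, ea, mb, eb, p=8):
    """Add ma*2**ea and mb*2**eb (p-bit normalized mantissas); return (m, e)."""
    if ea < eb:                           # make (ma, ea) the larger-exponent operand
        ma, ea, mb, eb = mb, eb, ma, ea
    d = ea - eb
    aligned = mb >> d                     # shift the smaller operand right
    escaped = mb & ((1 << d) - 1)         # bits that fell off during alignment
    total = ma + aligned                  # leading 1 is at bit p-1 or bit p
    carry = total >> p                    # 1 if the sum spilled into the extra bit
    if carry:
        keep, guard, sticky = total >> 1, total & 1, escaped != 0
    else:
        keep = total
        guard = (escaped >> (d - 1)) & 1 if d > 0 else 0
        sticky = d > 1 and (escaped & ((1 << (d - 1)) - 1)) != 0
    # Rounding control: pick between the sum and the sum plus one.
    round_up = bool(guard and (sticky or (keep & 1)))
    m = (keep, keep + 1)[round_up]
    e = ea + carry
    if m >> p:                            # rounding itself overflowed
        m, e = m >> 1, e + 1
    return m, e

print(fp_add_rne(128, 4, 152, 0))  # (138, 4): 2048 + 152 = 2200 rounds to 2208
```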

56.

Invention application
SYSTEMS, METHODS, AND APPARATUSES FOR TILE STORE (Under examination, published)

Publication No.: US20200233666A1

Publication date: 2020-07-23

Application No.: US16487755

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Robert VALENTINE , Menachem ADELMAN , Elmoustapha OULD-AHMED-VALL , Bret L. TOLL , Milind B. GIRKAR , Zeev SPERBER , Mark J. CHARNEY , Rinat RAPPOPORT , Jesus CORBAL , Stanislav SHWARTSMAN , Igor YANOVER , Alexander F. HEINECKE , Barukh ZIV , Dan BAUM , Yuri GEBIL

IPC: G06F9/30 , G06F17/16

Abstract: Embodiments detailed herein relate to matrix operations. In particular, the loading of a matrix (tile) from memory. For example, support for a loading instruction is described in at least a form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information, and execution circuitry to execute the decoded instruction to store each data element of configured rows of the identified source matrix operand to memory based on the destination memory information.
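The store behavior at the end of this abstract (writing each element of the configured rows to memory) might look like the sketch below. Interpreting the "destination memory information" as a base address plus a row stride is an assumption, as are the function name `tile_store`, the 4-byte little-endian elements, and the use of a `bytearray` as memory.

```python
def tile_store(memory, base, stride, tile, rows, cols, elem_size=4):
    """Write the first `rows` configured rows of `tile` into `memory`."""
    for r in range(rows):
        row_addr = base + r * stride          # each row starts one stride apart
        for c in range(cols):
            start = row_addr + c * elem_size
            memory[start:start + elem_size] = tile[r][c].to_bytes(elem_size, "little")

memory = bytearray(32)
tile = [[1, 2], [3, 4], [9, 9]]               # third row is not configured
tile_store(memory, base=0, stride=16, tile=tile, rows=2, cols=2)
print(memory[0:8].hex(), memory[16:24].hex())
# 0100000002000000 0300000004000000
```

Only the configured rows are written, so the gap between rows and the unconfigured third row leave memory untouched.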

57.

Invention application
SYSTEMS, APPARATUSES, AND METHODS FOR FUSED MULTIPLY ADD (Under examination, published)

Publication No.: US20200026515A1

Publication date: 2020-01-23

Application No.: US16338324

Application date: 2016-10-20

Applicant: Intel Corporation

Inventor: Robert VALENTINE , Galina RYVCHIN , Piotr MAJCHER , Mark J. CHARNEY , Elmoustapha OULD-AHMED-VALL , Jesus CORBAL , Milind B. GIRKAR , Zeev SPERBER , Simon RUBANOVICH , Amit GRADSTEIN

IPC: G06F9/30 , G06F9/38 , G06F7/544

Abstract: In some embodiments, packed data elements of first and second packed data source operands are of a first, different size than a second size of packed data elements of a third packed data operand. Execution circuitry executes a decoded single instruction to perform, for each packed data element position of a destination operand, a multiplication of M N-sized packed data elements from the first and second packed data sources that correspond to a packed data element position of the third packed data source, an addition of results from these multiplications to a full-sized packed data element of a packed data element position of the third packed data source, and storage of the addition result in a packed data element position destination corresponding to the packed data element position of the third packed data source, wherein M is equal to the full-sized packed data element divided by N.
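The grouping in this abstract is easier to see with numbers. In the sketch below, each full-sized accumulator element covers M narrower source elements, for example M = 2 when 16-bit sources feed a 32-bit accumulator. The function name `fused_multiply_add` is hypothetical, and Python integers are used without modeling overflow, saturation, or signedness.

```python
def fused_multiply_add(src1, src2, acc, m=2):
    """For each accumulator position i, add the m products that map to it."""
    if not (len(src1) == len(src2) == m * len(acc)):
        raise ValueError("sources must hold m narrow elements per accumulator element")
    out = []
    for i, total in enumerate(acc):
        for k in range(m):                    # the m narrow pairs under position i
            total += src1[m * i + k] * src2[m * i + k]
        out.append(total)
    return out

print(fused_multiply_add([1, 2, 3, 4], [5, 6, 7, 8], [100, 200]))
# [117, 253]  (100 + 1*5 + 2*6, 200 + 3*7 + 4*8)
```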

58.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT (Under examination, published)

Publication No.: US20190079762A1

Publication date: 2019-03-14

Application No.: US16186384

Application date: 2018-11-09

Applicant: Intel Corporation

Inventor: Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

59.

Invention application
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO TRANSPOSE RECTANGULAR TILES (Under examination, published)

Publication No.: US20190042202A1

Publication date: 2019-02-07

Application No.: US16144889

Application date: 2018-09-27

Applicant: Intel Corporation

Inventor: Raanan SADE , Robert VALENTINE , Mark J. CHARNEY , Simon RUBANOVICH , Amit GRADSTEIN , Zeev SPERBER , Bret TOLL , Jesus CORBAL , Christopher J. HUGHES , Alexander F. HEINECKE , Elmoustapha OULD-AHMED-VALL

IPC: G06F7/78 , G06F9/30 , G06F9/38 , G06F15/173

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to transpose rectangular tiles. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first destination, second destination, first source, and second source matrices, the specified opcode to cause the processor to process each of the specified source and destination matrices as a rectangular matrix, decode circuitry to decode the fetched rectangular matrix transpose instruction, and execution circuitry to respond to the decoded rectangular matrix transpose instruction by transposing each row of elements of the specified first source matrix into a corresponding column of the specified first destination matrix and transposing each row of elements of the specified second source matrix into a corresponding column of the specified second destination matrix.
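A minimal sketch of the row-to-column mapping for two source tiles follows. The function names `transpose` and `transpose_pair` are hypothetical, and tiles are modeled as nested lists with no element type or tile configuration.

```python
def transpose(tile):
    """Turn each row of a rectangular tile into the corresponding column."""
    return [list(column) for column in zip(*tile)]

def transpose_pair(src1, src2):
    """Transpose two source tiles into two destination tiles."""
    return transpose(src1), transpose(src2)

dst1, dst2 = transpose_pair([[1, 2, 3], [4, 5, 6]], [[7, 8]])
print(dst1)  # [[1, 4], [2, 5], [3, 6]]
print(dst2)  # [[7], [8]]
```

A 2x3 source becomes a 3x2 destination, which is why the tiles are handled as rectangular rather than square.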
