Patent search ap:("INTEL CORPORATION") AND inv:"Amit GRADSTEIN" Page 3

21.

发明公开
INSTRUCTIONS AND SUPPORT FOR HORIZONTAL REDUCTIONS 审中-公开

公开(公告)号：US20240004662A1

公开(公告)日：2024-01-04

申请号：US17856978

申请日：2022-07-02

Applicant: Intel Corporation

Inventor： Menachem ADELMAN , Amit GRADSTEIN , Regev SHEMY , Chitra NATARAJAN , Leonardo BORGES , Chytra SHIVASWAMY , Igor ERMOLAEV , Michael ESPIG , Or BEIT AHARON , Jeff WIEDEMEIER

IPC: G06F9/30

CPC classification number: G06F9/30185 , G06F9/30025 , G06F9/30021

Abstract: Techniques for performing horizontal reductions are described. In some examples, an instance of a horizontal instruction is to include at least one field for an opcode, one or more fields to reference a first source operand, and one or more fields to reference a destination operand, wherein the opcode is to indicate that execution circuitry is, in response to a decoded instance of the single instruction, to at least perform a horizontal reduction using at least one data element of a non-masked data element position of at least the first source operand and store a result of the horizontal reduction in the destination operand.

22.

发明申请
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR MATRIX MULTIPLICATION INSTRUCTIONS 有权

公开(公告)号：US20220414182A1

公开(公告)日：2022-12-29

申请号：US17359519

申请日：2021-06-26

Applicant: Intel Corporation

Inventor： Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH , Sagi MELLER , Christopher HUGHES , Evangelos GEORGANAS , Alexander HEINECKE , Mark CHARNEY

IPC: G06F17/16 , G06F7/487

Abstract: Techniques for matrix multiplication are described. In some examples, decode circuitry is to decode a single instruction having fields for an opcode, an indication of a location of a first source operand, an indication of a location of a second source operand, and an indication of a location of a destination operand, wherein the opcode is to indicate that execution circuitry is to at least convert data elements of the first and second source operands from a first floating point representation to a second floating point representation, perform matrix multiplication with the converted data elements, and accumulate results of the matrix multiplication in the destination operand in the first floating point representation; and the execution circuitry is to execute to the decoded instruction as specified by the opcode.

23.

发明申请
SYSTEMS AND METHODS FOR PERFORMING INSTRUCTIONS TO CONVERT TO 16-BIT FLOATING-POINT FORMAT 有权

公开(公告)号：US20220326948A1

公开(公告)日：2022-10-13

申请号：US17851468

申请日：2022-06-28

Applicant: Intel Corporation

Inventor： Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing instructions to convert to 16-bit floating-point format. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a first source vector comprising N single-precision elements, and a destination vector comprising at least N 16-bit floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the specified source vector to 16-bit floating-point, the conversion to include truncation and rounding, as necessary, and to store each converted element into a corresponding location of the specified destination vector, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

24.

发明申请
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS 有权

公开(公告)号：US20210286620A1

公开(公告)日：2021-09-16

申请号：US17216566

申请日：2021-03-29

Applicant: Intel Corporation

Inventor： Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

25.

发明申请
INTERLEAVED PIPELINE OF FLOATING-POINT ADDERS 审中-公开

公开(公告)号：US20200310793A1

公开(公告)日：2020-10-01

申请号：US16369743

申请日：2019-03-29

Applicant: Intel Corporation

Inventor： Simon RUBANOVICH , Amit GRADSTEIN , Zeev SPERBER

IPC: G06F9/30

Abstract: Disclosed embodiments relate to an interleaved pipeline of floating-point (FP) adders. In one example, a processor is to execute an instruction specifying an opcode and locations of a M by K first source matrix, a K by N second source matrix, and a M by N destination matrix, the opcode indicating execution circuitry, for each FP element (M, N) of the destination matrix, is to: launch K instances of a pipeline having a first, MULTIPLY stage, during which a FP element (M, K) of the first source matrix and a corresponding FP element (K, N) of the second source matrix are multiplied; concurrently, in an EXPDIFF stage, determine an exponent difference between the product and a previous FP value of the element (M, N) of the destination matrix; and in a second, ADD-BYPASS stage, accumulate the product with the previous FP value and, concurrently, bypassing the accumulated sum to a subsequent pipeline instance.

26.

发明申请
USING FUZZY-JBIT LOCATION OF FLOATING-POINT MULTIPLY-ACCUMULATE RESULTS 审中-公开

公开(公告)号：US20200310757A1

公开(公告)日：2020-10-01

申请号：US16369629

申请日：2019-03-29

Applicant: Intel Corporation

Inventor： Amit GRADSTEIN , Simon RUBANOVICH , Zeev SPERBER

IPC: G06F7/544 , G06F7/483 , G06F9/30

Abstract: Disclosed embodiments relate to performing floating-point (FP) arithmetic. In one example, a processor is to decode an instruction specifying locations of first, second, and third floating-point (FP) operands and an opcode calling for accumulating a FP product of the first and second FP operands with the third FP operand, and execution circuitry to, in a first cycle, generate the FP product having a Fuzzy-Jbit format comprising a sign bit, a 9-bit exponent, and a 25-bit mantissa having two possible positions for a JBit and, in a second cycle, to accumulate the FP product with the third FP operand, while concurrently, based on Jbit positions of the FP product and the third FP operand, determining an exponent adjustment and a mantissa shift control of a result of the accumulation, wherein performing the exponent adjustment concurrently enhances an ability to perform the accumulation in one cycle.

27.

发明申请
EFFICIENT IMPLEMENTATION OF COMPLEX VECTOR FUSED MULTIPLY ADD AND COMPLEX VECTOR MULTIPLY 审中-公开

公开(公告)号：US20190303142A1

公开(公告)日：2019-10-03

申请号：US15941531

申请日：2018-03-30

Applicant: Intel Corporation

Inventor： Raanan SADE , Thierry PONS , Amit GRADSTEIN , Zeev SPERBER , Mark J. CHARNEY , Robert VALENTINE , Eyal Oz-Sinay

IPC: G06F9/30 , G06F17/16 , G06F9/38

Abstract: Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.

28.

发明申请
PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO GENERATE SEQUENCES OF INTEGERS IN WHICH INTEGERS IN CONSECUTIVE POSITIONS DIFFER BY A CONSTANT INTEGER STRIDE AND WHERE A SMALLEST INTEGER IS OFFSET FROM ZERO BY AN INTEGER OFFSET 审中-公开

公开(公告)号：US20190286441A1

公开(公告)日：2019-09-19

申请号：US16271675

申请日：2019-02-08

Applicant: INTEL CORPORATION

Inventor： Seth ABRAHAM , Elmoustapha OULD-AHMED-VALL , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN

IPC: G06F9/30 , G06F9/345

Abstract: A method of an aspect includes receiving an instruction. The instruction indicates an integer stride, indicates an integer offset, and indicates a destination storage location. A result is stored in the destination storage location in response to the instruction. The result includes a sequence of at least four integers in numerical order with a smallest one of the at least four integers differing from zero by the integer offset and with all integers of the sequence in consecutive positions differing by the integer stride. Other methods, apparatus, systems, and instructions are disclosed.

29.

发明公开
APPARATUSES, METHODS, AND SYSTEMS FOR 8-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20240241722A1

公开(公告)日：2024-07-18

申请号：US18619570

申请日：2024-03-28

Applicant: Intel Corporation

Inventor： Naveen MELLEMPUDI , Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Christopher J. HUGHES , Evangelos GEORGANAS , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F7/499 , G06F9/38

CPC classification number: G06F9/30036 , G06F7/49915 , G06F9/30196 , G06F9/3887

Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described. A processor embodiment includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of a destination matrix having single-precision elements, a first source matrix, and a second source matrix, the source matrices having elements that each comprise a quadruple of 8-bit floating-point values, the opcode to indicate execution circuitry is to cause, for each element of the first source matrix and corresponding element of the second source matrix, a conversion of the 8-bit floating-point values to single-precision values, a multiplication of different pairs of converted single-precision values to generate plurality of results, and an accumulation of the results with previous contents of a corresponding element of the destination matrix, decode circuitry to decode the fetched instruction, and the execution circuitry to respond to the decoded instruction as specified by the opcode.

30.

发明公开
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20240126545A1

公开(公告)日：2024-04-18

申请号：US18397664

申请日：2023-12-27

Applicant: Intel Corporation

Inventor： Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30036 , G06F9/3001 , G06F9/3016 , G06F9/3802

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification