Patent search ap:("Intel Corporation") AND inv:"AMIT GRADSTEIN" Page 1

1.

Invention patent application
SYSTEMS, APPARATUSES, AND METHODS FOR PERFORMING A DOUBLE BLOCKED SUM OF ABSOLUTE DIFFERENCES

Status: Pending (published)

Publication number: US20170242694A1

Publication date: 2017-08-24

Application number: US15445741

Application date: 2017-02-28

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , MOSTAFA HAGOG , ROBERT VALENTINE , AMIT GRADSTEIN , SIMON RUBANOVICH , ZEEV SPERBER

IPC: G06F9/30 , G06F7/544

CPC classification number: G06F9/3001 , G06F7/50 , G06F7/544 , G06F9/30036 , G06F9/3836 , G06F9/3877 , G06F15/78 , G06F2207/5442

Abstract: Embodiments of systems, apparatuses, and methods are described for performing, in a computer processor, a vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode.
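As a rough illustration of a blocked sum of absolute differences, the sketch below sums |a - b| over fixed-size blocks of two byte sequences. The function name `blocked_sad`, the default block size of 4, and the omission of the immediate operand are assumptions made for illustration; the abstract does not specify them, and this is not the claimed instruction's exact behavior.

```python
def blocked_sad(src1, src2, block=4):
    """Return one sum of absolute differences per block of `block` elements.

    Hypothetical helper: block size and operand layout are assumptions,
    not details taken from the patent abstract.
    """
    if len(src1) != len(src2):
        raise ValueError("sources must have the same length")
    if block <= 0 or len(src1) % block != 0:
        raise ValueError("length must be a positive multiple of the block size")
    sums = []
    for start in range(0, len(src1), block):
        a_block = src1[start:start + block]
        b_block = src2[start:start + block]
        # Accumulate |a - b| across the elements of this block.
        sums.append(sum(abs(a - b) for a, b in zip(a_block, b_block)))
    return sums
```

Each output element corresponds to one block of the inputs, so an 8-element input with 4-element blocks yields two sums.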

2.

Invention patent application
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Status: Pending (published)

Publication number: US20180074825A1

Publication date: 2018-03-15

Application number: US15809721

Application date: 2017-11-10

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30

CPC classification number: G06F9/30181 , G06F9/30018 , G06F9/30032 , G06F9/30036 , G06F9/3013 , G06F9/30167 , G06F9/3802 , G06F12/0615

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.
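The insert operation described in this abstract can be sketched as follows: a group of elements overwrites one of the equal-width, non-overlapping sections of a vector, and an optional per-element mask decides which result positions are updated. The name `insert_section` and the choice of merge-style masking (masked-off positions keep the original destination value) are assumptions; the abstract does not state the masking policy or element widths.

```python
def insert_section(dest, group, index, mask=None):
    """Insert `group` into section `index` of `dest` and return a new list.

    Hypothetical sketch: sections are len(group) elements wide, and
    masked-off positions keep the original `dest` value (an assumption).
    """
    width = len(group)
    if width == 0 or len(dest) % width != 0:
        raise ValueError("vector length must be a multiple of the group width")
    if not 0 <= index < len(dest) // width:
        raise ValueError("section index out of range")
    result = list(dest)
    # Overwrite exactly one non-overlapping section with the group.
    result[index * width:(index + 1) * width] = group
    if mask is not None:
        if len(mask) != len(dest):
            raise ValueError("mask needs one entry per result element")
        # Merge masking: keep the old destination element where the mask is 0.
        result = [new if bit else old for new, old, bit in zip(result, dest, mask)]
    return result
```

A wider group (the "second group" in the abstract) is handled by the same function, since the section width follows the group width.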

3.

Invention patent application
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Status: Pending (published)

Publication number: US20170329605A1

Publication date: 2017-11-16

Application number: US15668508

Application date: 2017-08-03

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL SAN ADRIAN , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30181 , G06F9/30018 , G06F9/30032 , G06F9/30036 , G06F9/3013 , G06F9/30167 , G06F9/3802 , G06F12/0615

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.

4.

Invention patent application
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Status: Pending (published)

Publication number: US20170357510A1

Publication date: 2017-12-14

Application number: US15668461

Application date: 2017-08-03

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL SAN ADRIAN , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30

CPC classification number: G06F9/30181 , G06F9/30018 , G06F9/30032 , G06F9/30036 , G06F9/3013 , G06F9/30167 , G06F9/3802 , G06F12/0615

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.

5.

Invention patent application
APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS

Status: Pending (published)

Publication number: US20170242704A1

Publication date: 2017-08-24

Application number: US15452631

Application date: 2017-03-07

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions. The first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and a second granularity.
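The extract operation in this abstract is the counterpart of an insert: one of the equal-width, non-overlapping sections of an input vector is selected as the result, optionally under a per-element mask. In the sketch below, the name `extract_section` and the zeroing of masked-off elements are assumptions for illustration; the abstract does not say how masked elements are treated.

```python
def extract_section(src, width, index, mask=None):
    """Return section `index` (of `width` elements) from `src`.

    Hypothetical sketch: masked-off result elements are set to 0,
    which is an assumption rather than a detail from the abstract.
    """
    if width <= 0 or len(src) % width != 0:
        raise ValueError("vector length must be a multiple of the section width")
    if not 0 <= index < len(src) // width:
        raise ValueError("section index out of range")
    # Select exactly one non-overlapping section of the input vector.
    group = list(src[index * width:(index + 1) * width])
    if mask is not None:
        if len(mask) != width:
            raise ValueError("mask needs one entry per extracted element")
        group = [value if bit else 0 for value, bit in zip(group, mask)]
    return group
```

Calling it with a larger `width` models the wider second group that the third and fourth instructions select.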

6.

Invention patent application
APPARATUS AND METHOD OF IMPROVED INSERT INSTRUCTIONS

Status: Pending (published)

Publication number: US20170300332A1

Publication date: 2017-10-19

Application number: US15476356

Application date: 2017-03-31

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL SAN ADRIAN , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30

CPC classification number: G06F9/30181 , G06F9/30018 , G06F9/30032 , G06F9/30036 , G06F9/3013 , G06F9/30167 , G06F9/3802 , G06F12/0615

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements into one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements into one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus also includes masking layer circuitry to mask the first and third instructions at a first resultant vector granularity and to mask the second and fourth instructions at a second resultant vector granularity.

7.

Invention patent application
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS TO CONVERT 16-BIT FLOATING-POINT FORMATS

Status: Granted

Publication number: US20220100507A1

Publication date: 2022-03-31

Application number: US17134046

Application date: 2020-12-24

Applicant: Intel Corporation

Inventor: ALEXANDER F. HEINECKE , ROBERT VALENTINE , MARK J. CHARNEY , MENACHEM ADELMAN , CHRISTOPHER J. HUGHES , EVANGELOS GEORGANAS , ZEEV SPERBER , AMIT GRADSTEIN , SIMON RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Systems, methods, and apparatuses relating to instructions to convert 16-bit floating-point formats are described. In one embodiment, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode and locations of a source vector comprising N plurality of 16-bit half-precision floating-point elements, and a destination vector to store N plurality of 16-bit bfloat floating-point elements, the opcode to indicate execution circuitry is to convert each of the elements of the source vector from 16-bit half-precision floating-point format to 16-bit bfloat floating-point format and store each converted element into a corresponding location of the destination vector, decode circuitry to decode the fetched single instruction into a decoded single instruction, and the execution circuitry to respond to the decoded single instruction as specified by the opcode.
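The element-wise conversion in this abstract can be illustrated in software on raw 16-bit patterns. The sketch below widens each half-precision value to single precision (which is exact) and then keeps the upper 16 bits as a bfloat16 value. Round-to-nearest-even and the canonical quiet-NaN output are assumptions, as are the function names; the abstract does not state the rounding mode or NaN handling.

```python
import math
import struct


def half_bits_to_bfloat16_bits(half_bits):
    """Convert one IEEE half-precision bit pattern to a bfloat16 bit pattern.

    Assumptions (not from the abstract): round-to-nearest-even, and any
    NaN input becomes a quiet NaN that keeps the input sign.
    """
    value = struct.unpack("<e", struct.pack("<H", half_bits))[0]
    if math.isnan(value):
        return (half_bits & 0x8000) | 0x7FC0
    # Half to single precision is exact, so only the final step rounds.
    f32_bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Round to nearest, ties to even, on the 16 bits that are dropped.
    rounding_bias = 0x7FFF + ((f32_bits >> 16) & 1)
    return ((f32_bits + rounding_bias) >> 16) & 0xFFFF


def convert_vector(src):
    """Convert each source element and store it at the matching index."""
    return [half_bits_to_bfloat16_bits(h) for h in src]
```

Because bfloat16 keeps only 7 explicit mantissa bits against 10 for half precision, some inputs lose precision; for example, the largest finite half value (65504) rounds up to 65536 under the rounding assumed here.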

8.

Invention patent application
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS

Status: Granted

Publication number: US20220100502A1

Publication date: 2022-03-31

Application number: US17134008

Application date: 2020-12-24

Applicant: Intel Corporation

Inventor: ALEXANDER F. HEINECKE , ROBERT VALENTINE , MARK J. CHARNEY , MENACHEM ADELMAN , CHRISTOPHER J. HUGHES , EVANGELOS GEORGANAS , ZEEV SPERBER , AMIT GRADSTEIN , SIMON RUBANOVICH

IPC: G06F9/30 , G06F9/38 , G06F17/16 , G06F7/544

Abstract: Systems, methods, and apparatuses relating to 16-bit floating-point matrix dot product instructions are described. In one embodiment, a processor includes fetch circuitry to fetch a single instruction having fields to specify an opcode and locations of an M by N destination matrix having single-precision elements, an M by K first source matrix, and a K by N second source matrix, the source matrices having elements that each comprise a pair of half-precision floating-point values, the opcode to indicate execution circuitry is to cause, for each element of the first source matrix and corresponding element of the second source matrix, a conversion of the half-precision floating-point values to single-precision values, a multiplication of converted single-precision values from first values of the pairs together to generate a first result, a multiplication of converted single-precision values from second values of the pairs together to generate a second result, and an accumulation of the first result and the second result with previous contents of a corresponding element of the destination matrix, decode circuitry to decode the fetched instruction, and the execution circuitry to respond to the decoded instruction as specified by the opcode.
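The accumulation described in this abstract can be sketched with nested loops. Each source element is modeled as a pair of values; the two products of matching pair members are added to the previous contents of the destination element. The name `fp16_pair_dot_product` is hypothetical, "corresponding element" is assumed to mean the usual matrix-multiply pairing of A[m][k] with B[k][n], and Python floats stand in for the half-to-single conversion, so hardware rounding is not modeled.

```python
def fp16_pair_dot_product(dest, a, b):
    """Return dest + A.B, where every A and B element is a pair of values.

    dest: M x N list of floats (previous destination contents).
    a:    M x K list of (first, second) pairs.
    b:    K x N list of (first, second) pairs.
    Sketch only: Python floats replace the half and single formats.
    """
    m, k = len(a), len(b)
    n = len(b[0]) if b else 0
    if any(len(row) != k for row in a) or any(len(row) != n for row in b):
        raise ValueError("source matrices must be M x K and K x N")
    if len(dest) != m or any(len(row) != n for row in dest):
        raise ValueError("destination matrix must be M x N")
    result = [list(row) for row in dest]
    for i in range(m):
        for j in range(n):
            acc = result[i][j]  # start from the previous destination value
            for p in range(k):
                a_first, a_second = a[i][p]
                b_first, b_second = b[p][j]
                # First values multiply together, second values together.
                acc += a_first * b_first + a_second * b_second
            result[i][j] = acc
    return result
```

With a 1 by 2 first source, a 2 by 1 second source, and a 1 by 1 destination, the result is the old destination value plus four products, two per pair.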

9.

Invention patent application
APPARATUS AND METHOD OF IMPROVED EXTRACT INSTRUCTIONS

Status: Pending (published)

Publication number: US20180081689A1

Publication date: 2018-03-22

Application number: US15809818

Application date: 2017-11-10

Applicant: Intel Corporation

Inventor: ELMOUSTAPHA OULD-AHMED-VALL , ROBERT VALENTINE , JESUS CORBAL , BRET L. TOLL , MARK J. CHARNEY , ZEEV SPERBER , AMIT GRADSTEIN

IPC: G06F9/30

Abstract: An apparatus is described that includes instruction execution circuitry to execute first, second, third, and fourth instructions. The first and second instructions select a first group of input vector elements from one of multiple first non-overlapping sections of respective first and second input vectors. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third and fourth instructions select a second group of input vector elements from one of multiple second non-overlapping sections of respective third and fourth input vectors. The second group has a second bit width that is larger than the first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group. The apparatus includes masking layer circuitry to mask the first and second groups at a first granularity and a second granularity.
