Patent search ap:("INTEL CORPORATION") AND inv:"Robert VALENTINE" Page 7

61.

发明申请
APPARATUS AND METHOD FOR SHIFTING QUADWORDS AND EXTRACTING PACKED WORDS 审中-公开

公开(公告)号：US20190102177A1

公开(公告)日：2019-04-04

申请号：US15721382

申请日：2017-09-29

Applicant: Intel Corporation

Inventor： Venkateswara MADDURI , Elmoustapha OULD-AHMED-VALL , Robert VALENTINE , Mark CHARNEY

IPC: G06F9/30

Abstract: An apparatus and method for performing left-shifting operations on packed quadword data. For example, one embodiment of a processor comprises: a decoder to decode a left-shift instruction to generate a decoded left-shift instruction; a first source register to store a plurality of packed quadwords data elements; execution circuitry to execute the decoded left-shift instruction, the execution circuitry comprising shift circuitry to left-shift at least first and second packed quadword data elements from first and second packed quadword data element locations, respectively, in the first source register by an amount specified in an immediate value or in a control value in a second source register, to generate first and second left-shifted quadwords; the execution circuitry to cause selection of 16 most significant bits of the first and second left-shifted quadwords to be written to 16 least significant bit regions of first and second quadword data element locations, respectively, of a destination register; and the destination register to store the specified set of the 16 most significant bits of the first and second left-shifted quadwords.

62.

发明申请
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT MATRIX DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20190079768A1

公开(公告)日：2019-03-14

申请号：US16186387

申请日：2018-11-09

Applicant: Intel Corporation

Inventor： Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to computing dot products of nibbles in tile operands. In one example, a processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify a M by N destination matrix, a first source identifier to identify a M by K first source matrix, and a second source identifier to identify a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element (m, n) of the specified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the specified first source matrix by a corresponding nibble of a doubleword element (K,N) of the specified second source matrix, and to accumulate and saturate the eight products with previous contents of the doubleword element.

63.

发明申请
SYSTEMS AND METHODS FOR PERFORMING 16-BIT FLOATING-POINT VECTOR DOT PRODUCT INSTRUCTIONS 审中-公开

公开(公告)号：US20190079767A1

公开(公告)日：2019-03-14

申请号：US16186378

申请日：2018-11-09

Applicant: Intel Corporation

Inventor： Alexander F. HEINECKE , Robert VALENTINE , Mark J. CHARNEY , Raanan SADE , Menachem ADELMAN , Zeev SPERBER , Amit GRADSTEIN , Simon RUBANOVICH

IPC: G06F9/30 , G06F9/38

Abstract: Disclosed embodiments relate to systems and methods for performing 16-bit floating-point vector dot product instructions. In one example, a processor includes fetch circuitry to fetch an instruction having fields to specify an opcode and locations of first source, second source, and destination vectors, the opcode to indicate execution circuitry is to multiply N pairs of 16-bit floating-point formatted elements of the specified first and second sources, and accumulate the resulting products with previous contents of a corresponding single-precision element of the specified destination, decode circuitry to decode the fetched instruction, and execution circuitry to respond to the decoded instruction as specified by the opcode.

64.

发明公开
SYSTEMS AND METHODS OF INSTRUCTIONS TO ACCELERATE MULTIPLICATION OF SPARSE MATRICES USING BITMASKS THAT IDENTIFY NON-ZERO ELEMENTS 审中-公开

公开(公告)号：US20240078285A1

公开(公告)日：2024-03-07

申请号：US18502291

申请日：2023-11-06

Applicant: Intel Corporation

Inventor： Dan BAUM , Chen KOREN , Elmoustapha OULD-AHMED-VALL , Michael ESPIG , Christopher J. HUGHES , Raanan SADE , Robert VALENTINE , Mark J. CHARNEY , Alexander F. HEINECKE

IPC: G06F17/16 , G06F9/30 , G06F9/38

CPC classification number: G06F17/16 , G06F9/3001 , G06F9/30101 , G06F9/3016 , G06F9/3802

Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and executing the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE further to store an NZ element for use in a subsequent multiplications.

65.

发明公开
SYSTEMS, METHODS, AND APPARATUSES FOR HETEROGENEOUS COMPUTING 审中-公开

公开(公告)号：US20230418655A1

公开(公告)日：2023-12-28

申请号：US18207870

申请日：2023-06-09

Applicant: Intel Corporation

Inventor： Rajesh M. SANKARAN , Gilbert NEIGER , Narayan RANGANATHAN , Stephen R. VAN DOREN , Joseph NUZMAN , Niall D. MCDONNELL , Michael A. O'HANLON , Lokpraveen B. MOSUR , Tracy Garrett DRYSDALE , Eriko NURVITADHI , Asit K. MISHRA , Ganesh VENKATESH , Deborah T. MARR , Nicholas P. CARTER , Jonathan D. PEARCE , Edward T. GROCHOWSKI , Richard J. GRECO , Robert VALENTINE , Jesus CORBAL , Thomas D. FLETCHER , Dennis R. BRADFORD , Dwight P. MANLEY , Mark J. CHARNEY , Jeffrey J. COOK , Paul CAPRIOLI , Koichi YAMADA , Kent D. GLOSSOP , David B. SHEFFIELD

IPC: G06F9/48 , G06F9/30 , G06F9/38

CPC classification number: G06F9/48 , G06F9/3001 , G06F9/383 , G06F9/3004 , G06F9/30036

Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.

66.

发明公开
INTERRUPTIBLE AND RESTARTABLE MATRIX MULTIPLICATION INSTRUCTIONS, PROCESSORS, METHODS, AND SYSTEMS 审中-公开

公开(公告)号：US20230350674A1

公开(公告)日：2023-11-02

申请号：US18220225

申请日：2023-07-10

Applicant: Intel Corporation

Inventor： Edward T. GROCHOWSKI , Asit K. MISHRA , Robert VALENTINE , Mark J. CHARNEY , Simon C. STEELY

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3001 , G06F9/3865 , G06F9/30036 , G06F9/3861 , G06F9/30145

Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.

67.

发明申请
COPY A SUBSET OF STATUS FLAGS FROM A CONTROL AND STATUS REGISTER TO A FLAGS REGISTER 有权

公开(公告)号：US20230098724A1

公开(公告)日：2023-03-30

申请号：US17485374

申请日：2021-09-25

Applicant: Intel Corporation

Inventor： Vedvyas SHANBHOGUE , Robert VALENTINE , Mark CHARNEY , Venkateswara MADDURI

IPC: G06F9/30

Abstract: Techniques for copying a subset of status flags from a control and status register to a flags register in response to an instruction are described. An exemplary instruction includes a field for an opcode, the opcode to indicate execution circuitry is to copy from a first register a saturation flag value, an overflow value, and a carry value to a second register into one or more instructions of a different instruction set.

68.

发明申请
BFLOAT16 COMPARISON INSTRUCTIONS 有权

公开(公告)号：US20230072105A1

公开(公告)日：2023-03-09

申请号：US17463410

申请日：2021-08-31

Applicant: Intel Corporation

Inventor： Alexander HEINECKE , Menachem ADELMAN , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON

IPC: G06F9/30

Abstract: Techniques for comparing BF16 data elements are described. An exemplary BF16 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.

69.

发明申请
BFLOAT16 SCALE AND/OR REDUCE INSTRUCTIONS 有权

公开(公告)号：US20230068781A1

公开(公告)日：2023-03-02

申请号：US17463382

申请日：2021-08-31

Applicant: Intel Corporation

Inventor： Menachem ADELMAN , Alexander HEINECKE , Robert VALENTINE , Zeev SPERBER , Amit GRADSTEIN , Mark CHARNEY , Evangelos GEORGANAS , Dhiraj KALAMKAR , Christopher HUGHES , Cristina ANDERSON

IPC: G06F9/30

Abstract: Techniques for scale and reduction of BF16 data elements are described. An exemplary instruction includes fields for an having fields for an opcode, an identification of a location of a first packed data source operand, an identification of a location of a second packed data source operand, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operands, a floating point scale operation of a BF16 data element of the first packed data source by multiplying the data element by a power of 2 value, wherein a value of the exponent of the power of 2 value is a floor value of a BF16 data element of the second packed data source, and store a result of the floating point scale operation into a corresponding data element position of the packed data destination operand.

70.

发明申请
DUAL SUM OF QUADWORD 16X16 MULTIPLY AND ACCUMULATE 有权

公开(公告)号：US20220413861A1

公开(公告)日：2022-12-29

申请号：US17359522

申请日：2021-06-26

Applicant: Intel Corporation

Inventor： Venkateswara MADDURI , Cristina ANDERSON , Robert VALENTINE , Mark CHARNEY , Vedvyas SHANBHOGUE

IPC: G06F9/30

Abstract: Techniques for matrix multiplication are described. In some examples, a single instruction having a format of fields for an opcode, one or more fields to indicate a location of a source/destination operand, one or more fields to indicate a location of a first source operand, and one or more fields to indicate a location of a second source operand is used. Wherein the opcode is to indicate that execution circuitry is to: multiply values from corresponding data elements of the first and second sources, add a first subset of the multiplied values to a first value from the source/destination operand and store in a first data element position of the source/destination operand, and add a second subset of the multiplied values to a second value from the source/destination operand and store in a second data element position of the source/destination operand.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification