Patent search: ap:("Intel Corporation") AND inv:"Jorge Parra" (page 3)

21.

Granted patent
Method and apparatus for approximation using polynomials (in force)

Publication No.: US11327754B2

Publication date: 2022-05-10

Application No.: US16366941

Filing date: 2019-03-27

Applicant: Intel Corporation

Inventors: Jorge Parra, Dan Baum, Robert S. Chappell, Michael Espig, Varghese George, Alexander Heinecke, Christopher Hughes, Subramaniam Maiyuran, Prasoonkumar Surti, Ronen Zohar, Elmoustapha Ould-Ahmed-Vall

IPC classification: G06F9/30, G06F17/11, G06F7/544, G06F9/38, G06F7/552

Abstract: Methods and apparatus for approximation using polynomial functions are disclosed. In one embodiment, a processor comprises decoding and execution circuitry. The decoding circuitry is to decode an instruction, where the instruction comprises a first operand specifying an output location and a second operand specifying a plurality of data element values to be computed. The execution circuitry is to execute the decoded instruction. The execution includes to compute a result for each of the plurality of data element values using a polynomial function to approximate a complex function, where the computation uses coefficients stored in a lookup location for the complex function, and where data element values within different data element value ranges use different sets of coefficients. The execution further includes to store results of the computation in the output location.
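The idea in this abstract (a per-range coefficient lookup feeding a polynomial evaluation) can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the patented method: the target function (exp), the domain [0, 1), the eight equal-width ranges, the degree-2 Taylor coefficients, and all names are hypothetical choices made for illustration.

```python
import math

# Illustrative assumptions (not from the patent): approximate exp(x) on
# [0, 1) with 8 equal-width ranges and a degree-2 polynomial per range.
SEGMENTS = 8
LO, HI = 0.0, 1.0
WIDTH = (HI - LO) / SEGMENTS

# Hypothetical "lookup location": one coefficient set per input range.
# Each entry holds the range midpoint and the Taylor coefficients of exp
# about that midpoint: exp(m), exp(m), exp(m) / 2.
TABLE = []
for i in range(SEGMENTS):
    mid = LO + (i + 0.5) * WIDTH
    e = math.exp(mid)
    TABLE.append((mid, e, e, e / 2.0))


def approx_exp(x):
    """Approximate exp(x) using the coefficient set for x's range."""
    if not (LO <= x < HI):
        raise ValueError("x is outside the approximation domain")
    # Select the coefficient set from the range that contains x.
    idx = min(int((x - LO) / WIDTH), SEGMENTS - 1)
    mid, c0, c1, c2 = TABLE[idx]
    t = x - mid
    # Horner evaluation of c0 + c1*t + c2*t^2.
    return c0 + t * (c1 + t * c2)


def approx_exp_vector(values):
    """Compute one result per data element, as a vector instruction would."""
    return [approx_exp(v) for v in values]
```

With these choices the truncation error per range is bounded by the cubic Taylor remainder, so narrower ranges or higher-degree polynomials trade table size against accuracy.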

22.

Patent application
GRAPHICS PROCESSORS AND GRAPHICS PROCESSING UNITS HAVING DOT PRODUCT ACCUMULATE INSTRUCTION FOR HYBRID FLOATING POINT FORMAT (in force)

Publication No.: US20220129266A1

Publication date: 2022-04-28

Application No.: US17428523

Filing date: 2020-03-14

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh

IPC classification: G06F9/30, G06F7/544, G06F12/02, G06F12/0811, G06F12/0875

Abstract: Graphics processors and graphics processing units having dot product accumulate instructions for a hybrid floating point format are disclosed. In one embodiment, a graphics multiprocessor comprises an instruction unit to dispatch instructions and a processing resource coupled to the instruction unit. The processing resource is configured to receive a dot product accumulate instruction from the instruction unit and to process the dot product accumulate instruction using a bfloat16 number (BF16) format.

23.

Patent application
COMPUTING EFFICIENT CROSS CHANNEL OPERATIONS IN PARALLEL COMPUTING MACHINES USING SYSTOLIC ARRAYS (in force)

Publication No.: US20220058158A1

Publication date: 2022-02-24

Application No.: US17518202

Filing date: 2021-11-03

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Chandra Gurram

IPC classification: G06F15/80

Abstract: An apparatus to facilitate computing efficient cross channel operations in parallel computing machines using systolic arrays is disclosed. The apparatus includes a plurality of registers and one or more processing elements communicably coupled to the plurality of registers. The one or more processing elements include a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register at different stages of the systolic array circuit; perform cross-channel operations at channels of the systolic array circuit; bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations; and broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register.
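The data flow described in this abstract (per-stage inputs from one source register, bypass of disabled channels, broadcast of the final result) can be modeled in a few lines. This is a rough software sketch only: the choice of a sum as the cross-channel operation, the Boolean mask representation, and the function name are assumptions, and the hardware pipelining is not modeled.

```python
def cross_channel_sum(source_register, enabled_mask):
    """Sum the enabled channels of one register and broadcast the result.

    Hypothetical model: each loop iteration stands in for one stage of the
    systolic array, which takes one channel of the single source register.
    """
    if len(source_register) != len(enabled_mask):
        raise ValueError("register and mask must have the same channel count")
    partial = 0
    for value, enabled in zip(source_register, enabled_mask):
        if enabled:
            partial += value  # this stage contributes its channel
        # A disabled channel is bypassed: the partial result passes through.
    # Broadcast the final-stage result to every destination channel.
    return [partial] * len(source_register)
```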

24.

Patent application
GRAPHICS PROCESSORS AND GRAPHICS PROCESSING UNITS HAVING DOT PRODUCT ACCUMULATE INSTRUCTION FOR HYBRID FLOATING POINT FORMAT (in force)

Publication No.: US20210312697A1

Publication date: 2021-10-07

Application No.: US17304092

Filing date: 2021-06-14

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Shubra Marwaha, Ashutosh Garg, Supratim Pal, Jorge Parra, Chandra Gurram, Varghese George, Darin Starkey, Guei-Yuan Lueh

IPC classification: G06T15/06, G06F9/38, G06F17/18, G06F9/30

Abstract: Described herein is a graphics processing unit (GPU) comprising a single instruction, multiple thread (SIMT) multiprocessor comprising an instruction cache, a shared memory coupled with the instruction cache, and circuitry coupled with the shared memory and the instruction cache, the circuitry including multiple texture units, a first core including hardware to accelerate matrix operations, and a second core configured to receive an instruction having multiple operands in a bfloat16 (BF16) number format, wherein the multiple operands include a first source operand, a second source operand, and a third source operand, and the BF16 number format is a sixteen-bit floating point format having an eight-bit exponent, and to process the instruction, wherein to process the instruction includes to multiply the second source operand by the third source operand and add the first source operand to a result of the multiply.
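To make the BF16 format concrete: a bfloat16 value is the upper 16 bits of an IEEE 754 single-precision value (1 sign bit, 8 exponent bits, 7 fraction bits). The sketch below emulates the conversion and the multiply-add described in the abstract. It is an illustration under assumptions, not the patented hardware: the round-to-nearest-even conversion, the wider accumulation in a Python float, and all function names are hypothetical choices, and NaN inputs are not handled specially.

```python
import struct


def float_to_bf16_bits(x):
    """Convert a Python float to a 16-bit BF16 pattern (round to nearest even).

    Assumption: finite inputs that fit in float32; NaN is not special-cased.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF


def bf16_bits_to_float(b):
    """Expand a BF16 pattern to a float by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def to_bf16(x):
    """Round a value to the nearest representable BF16 value."""
    return bf16_bits_to_float(float_to_bf16_bits(x))


def bf16_mad(src0, src1, src2):
    """Return src0 + src1 * src2 with all three operands rounded to BF16.

    Assumption: the product and sum are kept in a wider format (a Python
    float here); the abstract does not specify the accumulator precision.
    """
    return to_bf16(src0) + to_bf16(src1) * to_bf16(src2)
```

Because BF16 keeps the full eight-bit exponent of single precision, conversion only discards fraction bits, which is why a simple shift with rounding is sufficient in this sketch.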

25.

Granted patent
Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs (in force)

Publication No.: US12039001B2

Publication date: 2024-07-16

Application No.: US18301386

Filing date: 2023-04-17

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Ashutosh Garg, Shubra Marwaha, Chandra Gurram, Darin Starkey, Durgesh Borkar, Varghese George

IPC classification: G06F17/16, G06F9/30, G06F15/80

CPC classification: G06F17/16, G06F9/3001, G06F9/30145, G06F15/8046

Abstract: Described herein is a graphics processor including a plurality of processing clusters coupled with a host interface, each processing cluster comprising a plurality of multiprocessors, the plurality of multiprocessors interconnected via a data interconnect, and each multiprocessor comprising sparse matrix multiply acceleration hardware including a systolic processing array with feedback inputs.

26.

Granted patent
Sparse matrix multiplication acceleration mechanism (in force)

Publication No.: US12008067B2

Publication date: 2024-06-11

Application No.: US17527324

Filing date: 2021-11-16

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Mathew Nevin, Jorge Parra, Ashutosh Garg, Shubra Marwaha, Shubh Shah

IPC classification: G06F17/16, G06F7/487, G06F9/30, G06F13/16

CPC classification: G06F17/16, G06F7/4876, G06F9/3001, G06F9/30036, G06F13/1673, G06F2207/3892

Abstract: An apparatus to facilitate acceleration of matrix multiplication operations is disclosed. The apparatus comprises a systolic array including matrix multiplication hardware to perform multiply-add operations on received matrix data comprising data from a plurality of input matrices, and sparse matrix acceleration hardware to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce the multiply-add operations to be performed by the matrix multiplication hardware.

27.

Granted patent
Utilizing structured sparsity in systolic arrays (in force)

Publication No.: US11977885B2

Publication date: 2024-05-07

Application No.: US17107823

Filing date: 2020-11-30

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Jorge Parra, Ashutosh Garg, Chandra Gurram, Chunhui Mei, Durgesh Borkar, Shubra Marwaha, Supratim Pal, Varghese George, Wei Xiong, Yan Li, Yongsheng Liu, Dipankar Das, Sasikanth Avancha, Dharma Teja Vooturi, Naveen K. Mellempudi

IPC classification: G06F9/30, G06F9/38, G06F15/80

CPC classification: G06F9/30036, G06F9/3001, G06F9/30101, G06F9/3893, G06F15/8046

Abstract: An apparatus to facilitate utilizing structured sparsity in systolic arrays is disclosed. The apparatus includes a processor comprising a systolic array to receive data from a plurality of source registers, the data comprising unpacked source data, structured source data that is packed based on sparsity, and metadata corresponding to the structured source data; identify portions of the unpacked source data to multiply with the structured source data, the portions of the unpacked source data identified based on the metadata; and output, to a destination register, a result of multiplication of the portions of the unpacked source data and the structured source data.
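The roles of the packed data, the metadata, and the unpacked operand can be shown with a small software model. This is a sketch under assumptions rather than the patented design: it assumes a 2-out-of-4 sparsity pattern (at most two nonzeros in every group of four elements), stores one in-group index per kept value as the metadata, and uses hypothetical function names throughout.

```python
GROUP = 4  # elements per group (assumed pattern: 2 kept out of 4)
KEEP = 2   # values stored per group


def pack_2_of_4(row):
    """Pack a row with at most 2 nonzeros per group of 4.

    Returns (values, metadata), where metadata[k] is the position of
    values[k] inside its group of 4.
    """
    if len(row) % GROUP != 0:
        raise ValueError("row length must be a multiple of 4")
    values, metadata = [], []
    for base in range(0, len(row), GROUP):
        group = row[base:base + GROUP]
        kept = [i for i, v in enumerate(group) if v != 0]
        if len(kept) > KEEP:
            raise ValueError("group violates the 2-of-4 sparsity pattern")
        # Pad with zero-valued positions so every group stores exactly 2 entries.
        for i in range(GROUP):
            if len(kept) == KEEP:
                break
            if i not in kept:
                kept.append(i)
        for i in sorted(kept):
            values.append(group[i])
            metadata.append(i)
    return values, metadata


def sparse_dot(values, metadata, dense):
    """Dot product of a packed sparse row with an unpacked (dense) vector.

    The metadata selects which dense elements are multiplied, so only
    half as many multiplies are issued as in the dense case.
    """
    total = 0
    for k, (value, index) in enumerate(zip(values, metadata)):
        group_base = (k // KEEP) * GROUP
        total += value * dense[group_base + index]
    return total
```

For example, packing the row [0, 3, 0, 5, 7, 0, 0, 0] stores four values instead of eight, and sparse_dot then reads only the four dense elements that the metadata points to.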

28.

Published application
HARDWARE ENHANCEMENTS FOR DOUBLE PRECISION SYSTOLIC SUPPORT (under examination, published)

Publication No.: US20240111826A1

Publication date: 2024-04-04

Application No.: US17937252

Filing date: 2022-09-30

Applicant: Intel Corporation

Inventors: Jiasheng Chen, Kevin Hurd, Changwon Rhee, Jorge Parra, Fangwen Fu, Theo Drane, William Zorn, Peter Caday, Gregory Henry, Guei-Yuan Lueh, Farzad Chehrazi, Amit Karande, Turbo Majumder, Xinmin Tian, Milind Girkar, Hong Jiang

IPC classification: G06F17/16, G06F7/544, G06T1/20

CPC classification: G06F17/16, G06F7/5443, G06T1/20

Abstract: An apparatus to facilitate hardware enhancements for double precision systolic support is disclosed. The apparatus includes matrix acceleration hardware having double-precision (DP) matrix multiplication circuitry including multiplier circuits to multiply pairs of input source operands in a DP floating-point format; adders to receive multiplier outputs from the multiplier circuits and accumulate the multiplier outputs in a high precision intermediate format; an accumulator circuit to accumulate adder outputs from the adders with at least one of a third global source operand on a first pass of the DP matrix multiplication circuitry or an intermediate result from the first pass on a second pass of the DP matrix multiplication circuitry, wherein the accumulator circuit is to generate an accumulator output in the high precision intermediate format; and a down conversion and rounding circuit to down convert and round an output of the second pass as a final result in the DP floating-point format.

29.

Published application
COMPUTING EFFICIENT CROSS CHANNEL OPERATIONS IN PARALLEL COMPUTING MACHINES USING SYSTOLIC ARRAYS (under examination, published)

Publication No.: US20230367740A1

Publication date: 2023-11-16

Application No.: US18310129

Filing date: 2023-05-01

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Chandra Gurram

IPC classification: G06F15/80

CPC classification: G06F15/8046, G06F15/8007, G06N20/00

Abstract: An apparatus to facilitate computing efficient cross channel operations in parallel computing machines using systolic arrays is disclosed. The apparatus includes a plurality of registers and one or more processing elements communicably coupled to the plurality of registers. The one or more processing elements include a systolic array circuit to perform cross-channel operations on source data received from a single source register of the plurality of registers, wherein the systolic array circuit is modified to: receive inputs from the single source register at different stages of the systolic array circuit; perform cross-channel operations at channels of the systolic array circuit; bypass disabled channels of the systolic array circuit, the disabled channels not used to compute the cross-channel operations; and broadcast a final result of a final stage of the systolic array circuit to all channels of a destination register.

30.

Granted patent
Scalable sparse matrix multiply acceleration using systolic arrays with feedback inputs (in force)

Publication No.: US11636174B2

Publication date: 2023-04-25

Application No.: US17527882

Filing date: 2021-11-16

Applicant: Intel Corporation

Inventors: Subramaniam Maiyuran, Jorge Parra, Supratim Pal, Ashutosh Garg, Shubra Marwaha, Chandra Gurram, Darin Starkey, Durgesh Borkar, Varghese George

IPC classification: G06F17/16, G06F9/30, G06F15/80

Abstract: Described herein is an accelerator device including a host interface, a fabric interconnect coupled with the host interface, and one or more hardware tiles coupled with the fabric interconnect, the one or more hardware tiles including sparse matrix multiply acceleration hardware including a systolic array with feedback inputs.
