Patent search: ap:("INTEL Corporation") AND inv:"GRAMUNT, Roger" (page 1)

1.

Published invention application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS (Under examination, published)

Publication number: EP3783479A1

Publication date: 2021-02-24

Application number: EP20200955.1

Filing date: 2018-04-30

Applicant: INTEL Corporation

Inventors: DAS, Dipankar; GRAMUNT, Roger; SMELYANSKIY, Mikhail; CORBAL, Jesus; MUDIGERE, Dheevatsa; MELLEMPUDI, Naveen K.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit-length of the smaller bit-length.
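
The operand-length step described in this abstract can be illustrated in software. The following is a minimal sketch, not the claimed hardware: the function names are hypothetical, and saturating the result to the narrower signed range is an assumption made here for illustration, since the abstract does not say how the output is narrowed.

```python
def smaller_bit_length(first_bits, second_bits):
    """Role of the 'operand length unit': pick the smaller operand width."""
    return min(first_bits, second_bits)


def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits`-wide two's-complement
    value. Saturation is an assumption of this sketch, not from the abstract."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return max(low, min(high, value))


def mixed_matmul(a, a_bits, b, b_bits):
    """Multiply matrix `a` (elements of width a_bits) by matrix `b`
    (elements of width b_bits) and narrow each output element to the
    smaller of the two widths."""
    out_bits = smaller_bit_length(a_bits, b_bits)
    inner = len(b)
    cols = len(b[0])
    result = []
    for row_a in a:
        out_row = []
        for j in range(cols):
            # Accumulate at full Python integer precision, then narrow once.
            acc = sum(row_a[k] * b[k][j] for k in range(inner))
            out_row.append(saturate(acc, out_bits))
        result.append(out_row)
    return result, out_bits


# Example: 16-bit left operand, 8-bit right operand, so an 8-bit output.
product, width = mixed_matmul([[100, 2], [3, 4]], 16, [[2, 0], [1, 1]], 8)
```

In this example the first output element, 100*2 + 2*1 = 202, exceeds the signed 8-bit maximum and is clamped to 127.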

2.

Published invention application
SYSTEM, APPARATUS AND METHOD FOR THROTTLING FUSION OF MICRO-OPERATIONS IN A PROCESSOR (Under examination, published)

Publication number: EP4202664A1

Publication date: 2023-06-28

Application number: EP22208772.8

Filing date: 2022-11-22

Applicant: INTEL Corporation

Inventors: SYED, Sufiyan; GRAMUNT, Roger; GAUR, Jayesh; DESHPANDE, Priyank

IPC classification: G06F9/38

Abstract: In one embodiment, an apparatus includes: a plurality of execution circuits to execute micro-operations (µops), where a subset of the plurality of execution circuits are capable of execution of a fused µop; a fusion circuit coupled to at least the subset of the plurality of execution circuits, wherein the fusion circuit is to fuse at least some pairs of producer-consumer µops into fused µops; and a fusion throttle circuit coupled to the fusion circuit, wherein the fusion throttle circuit is to prevent a first µop from being fused with another µop based at least in part on historical information associated with the first µop. Other embodiments are described and claimed.
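
The history-based throttling idea in this abstract can be sketched in software as a small table of per-µop counters. This is an illustrative model only: the class name, the use of a saturating counter, the threshold values, and the notion of a fusion being "harmful" are all assumptions introduced here, since the abstract states only that the decision depends on historical information associated with the µop.

```python
class FusionThrottle:
    """Toy model of a fusion throttle keyed by a hypothetical µop identifier."""

    def __init__(self, threshold=2, max_count=3):
        self.threshold = threshold   # block fusion at or above this count
        self.max_count = max_count   # counter saturates here
        self.history = {}            # µop id -> saturating counter

    def allow_fusion(self, uop_id):
        """Return True if this µop may be fused, based on its history."""
        return self.history.get(uop_id, 0) < self.threshold

    def record_outcome(self, uop_id, fusion_was_harmful):
        """Update the history after a fused µop executes.

        `fusion_was_harmful` stands in for whatever signal the hardware
        would use; it is an assumption of this sketch."""
        count = self.history.get(uop_id, 0)
        if fusion_was_harmful:
            count = min(self.max_count, count + 1)
        else:
            count = max(0, count - 1)
        self.history[uop_id] = count


# Example: two harmful outcomes in a row cause fusion to be throttled.
throttle = FusionThrottle()
throttle.record_outcome(0x40, True)
throttle.record_outcome(0x40, True)
blocked = not throttle.allow_fusion(0x40)
```

A saturating counter is used here so that one later good outcome lowers the count again and lets the µop become eligible for fusion once more, rather than blocking it permanently.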

3.

Published invention application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS (Under examination, published)

Publication number: EP3407183A3

Publication date: 2019-02-13

Application number: EP18170154.1

Filing date: 2018-04-30

Applicant: INTEL Corporation

Inventors: DAS, Dipankar; GRAMUNT, Roger; SMELYANSKIY, Mikhail; CORBAL, Jesus; MUDIGERE, Dheevatsa; MELLEMPUDI, Naveen K.; HEINECKE, Alexander F.

IPC classification: G06F9/30

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit-length of the smaller bit-length.

4.

Published invention application
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS (Under examination, published)

Publication number: EP3407183A2

Publication date: 2018-11-28

Application number: EP18170154.1

Filing date: 2018-04-30

Applicant: INTEL Corporation

Inventors: DAS, Dipankar; GRAMUNT, Roger; SMELYANSKIY, Mikhail; CORBAL, Jesus; MUDIGERE, Dheevatsa; MELLEMPUDI, Naveen K.; HEINECKE, Alexander F.

IPC classification: G06F9/30

CPC classification: G06F9/3887, G06F9/30014, G06F9/30036, G06F9/3016, G06F9/30181, G06F9/30192, G06F9/3851, G06N3/00, G06T1/20

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit-length of the smaller bit-length.