OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

    Publication No.: EP3783479A1

    Publication date: 2021-02-24

    Application No.: EP20200955.1

    Filing date: 2018-04-30

    Applicant: INTEL Corporation

    IPC classification: G06F9/30

    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.
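
    The data flow in this abstract (pick the smaller of two operand bit-lengths, then produce a matrix result at that width) can be illustrated in software. The following is only a minimal sketch, not the claimed hardware: the function names are hypothetical, operands are assumed to be signed integers, and narrowing by saturation is an assumption, since the abstract does not say how the output is reduced to the smaller bit-length.

```python
def smaller_bit_length(first_bits, second_bits):
    """Stand-in for the 'operand length unit': return the smaller bit-length."""
    return min(first_bits, second_bits)


def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits`-wide two's-complement value.

    Saturation is an assumption made for this sketch; the abstract does not
    specify the narrowing behaviour.
    """
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


def mixed_width_matmul(a, a_bits, b, b_bits):
    """Multiply matrices a (a_bits wide) and b (b_bits wide).

    Every output element is narrowed to the smaller of the two bit-lengths.
    Returns the result matrix together with the output bit-length.
    """
    out_bits = smaller_bit_length(a_bits, b_bits)
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [
        [
            saturate(sum(a[i][k] * b[k][j] for k in range(inner)), out_bits)
            for j in range(cols)
        ]
        for i in range(rows)
    ]
    return result, out_bits
```

    For example, multiplying a 16-bit matrix by an 8-bit matrix yields 8-bit outputs, and any accumulated value that does not fit in 8 bits is clamped under the saturation assumption above.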

    SYSTEM, APPARATUS AND METHOD FOR THROTTLING FUSION OF MICRO-OPERATIONS IN A PROCESSOR

    Publication No.: EP4202664A1

    Publication date: 2023-06-28

    Application No.: EP22208772.8

    Filing date: 2022-11-22

    Applicant: INTEL Corporation

    IPC classification: G06F9/38

    Abstract: In one embodiment, an apparatus includes: a plurality of execution circuits to execute micro-operations (µops), where a subset of the plurality of execution circuits are capable of execution of a fused µop; a fusion circuit coupled to at least the subset of the plurality of execution circuits, wherein the fusion circuit is to fuse at least some pairs of producer-consumer µops into fused µops; and a fusion throttle circuit coupled to the fusion circuit, wherein the fusion throttle circuit is to prevent a first µop from being fused with another µop based at least in part on historical information associated with the first µop. Other embodiments are described and claimed.
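
    The throttling idea in this abstract (deny fusion for a µop when its history argues against it) can be modelled in a few lines. The sketch below is hypothetical throughout: the abstract does not say what the historical information is or how it is stored, so a per-µop saturating counter keyed by an assumed µop identifier, the threshold values, and the notion of a "harmful" fusion outcome are all assumptions chosen for illustration.

```python
class FusionThrottle:
    """Toy model of a fusion throttle driven by per-µop history.

    The history is a saturating counter per µop identifier (for example an
    instruction address). This encoding is an assumption, not taken from
    the abstract.
    """

    def __init__(self, threshold=2, max_count=3):
        self.threshold = threshold  # counter value at which fusion is denied
        self.max_count = max_count  # saturation limit of the counter
        self.history = {}           # hypothetical µop id -> counter value

    def record_outcome(self, uop_id, fusion_was_harmful):
        """Update the history after a fused µop retires (hypothetical feedback)."""
        count = self.history.get(uop_id, 0)
        if fusion_was_harmful:
            count = min(self.max_count, count + 1)
        else:
            count = max(0, count - 1)
        self.history[uop_id] = count

    def allow_fusion(self, uop_id):
        """Return True if the history of this µop permits fusion."""
        return self.history.get(uop_id, 0) < self.threshold


def try_fuse(throttle, first_uop_id, other_uop_id):
    """Fuse a producer-consumer pair unless the throttle blocks the first µop."""
    if throttle.allow_fusion(first_uop_id):
        return ("fused", first_uop_id, other_uop_id)
    return None  # the two µops are issued separately
```

    With the default values assumed here, two harmful outcomes in a row block further fusion of that µop, and a later beneficial outcome lowers the counter enough to re-enable it.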

    OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS

    Publication No.: EP3407183A3

    Publication date: 2019-02-13

    Application No.: EP18170154.1

    Filing date: 2018-04-30

    Applicant: INTEL Corporation

    IPC classification: G06F9/30

    Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a fetch unit to fetch a single instruction having multiple input operands, wherein the multiple input operands have an unequal bit-length, a first input operand having a first bit-length and a second input operand having a second bit-length; a decode unit to decode the single instruction into a decoded instruction; an operand length unit to determine a smaller bit-length of the first bit-length and the second bit-length; and a compute unit to perform a matrix operation on the multiple input operands to generate an output value having a bit length of the smaller bit length.