Apparatus, methods, and systems for memory consistency in a configurable spatial accelerator

    Publication number: US10417175B2

    Publication date: 2019-09-17

    Application number: US15859466

    Filing date: 2017-12-30

    Abstract: Methods and apparatuses relating to consistency in an accelerator are described. In one embodiment, request address file (RAF) circuits are coupled to a spatial array by a first network, a memory is coupled to the RAF circuits by a second network, a RAF circuit is to not issue, into the second network, a request to the memory marked with a program order dependency on a previous request until receiving a first token generated by completion of the previous request to the memory by another RAF circuit, and a second RAF circuit is to not issue, into the second network, a second request to the memory marked with a program order dependency on a first request until receiving a second token sent by a first RAF circuit when a predetermined time period has lapsed since the first request was issued by the first RAF circuit into the second network.
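
The two ordering rules in this abstract can be pictured with a toy software model. The sketch below is only an illustration under stated assumptions, not the patented circuit: the `Raf` class, its method names, and the cycle-counting helper are all hypothetical. A request marked as order-dependent is held back until a token arrives, and the helper models the second rule, where the token is sent once a fixed delay has elapsed since the first request was issued.

```python
class Raf:
    """Toy model of a request address file (RAF) circuit (hypothetical)."""

    def __init__(self):
        self.tokens = 0   # dependency tokens received so far
        self.issued = []  # requests issued into the memory network

    def receive_token(self):
        self.tokens += 1

    def try_issue(self, request, ordered):
        """Issue `request`; an ordered request consumes one token first."""
        if ordered:
            if self.tokens == 0:
                return False  # dependency not yet satisfied, hold the request
            self.tokens -= 1
        self.issued.append(request)
        return True


def issue_cycle_with_timed_token(delay):
    """Return the cycle at which the dependent request is finally issued.

    Assumes the first RAF issues at cycle 0 and sends its token to the
    second RAF once `delay` cycles have elapsed.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    first, second = Raf(), Raf()
    first.try_issue("store A", ordered=False)  # issued at cycle 0
    cycle = 0
    while True:
        if cycle == delay:
            second.receive_token()  # timed token from the first RAF
        if second.try_issue("load A", ordered=True):
            return cycle
        cycle += 1
```

In this model the completion-based rule is the same `receive_token` call, triggered by a completion event instead of a timer.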

    Interruptible and restartable matrix multiplication instructions, processors, methods, and systems

    Publication number: US10275243B2

    Publication date: 2019-04-30

    Application number: US15201442

    Filing date: 2016-07-02

    Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
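
As a rough software analogy for the completion progress indicator, the sketch below multiplies two matrices one result row at a time and returns how many rows were finished when a simulated interruption occurs. The function name, the row-level granularity, and the `interrupt_after` parameter are assumptions made for illustration; the abstract does not say how the processor tracks or encodes progress.

```python
def matmul_resumable(a, b, c, progress=0, interrupt_after=None):
    """Multiply a (n x k) by b (k x m) into c, one result row at a time.

    progress:        number of result rows already computed and stored in c.
    interrupt_after: hypothetical knob that simulates an interruption after
                     this many rows have been computed in the current call.
    Returns the updated progress indicator (rows completed so far).
    """
    n, k, m = len(a), len(b), len(b[0])
    done_this_call = 0
    row = progress
    while row < n:
        if interrupt_after is not None and done_this_call == interrupt_after:
            return row  # interrupted: report progress so the work can resume
        c[row] = [sum(a[row][t] * b[t][j] for t in range(k)) for j in range(m)]
        row += 1
        done_this_call += 1
    return row


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
c = [[0, 0] for _ in range(3)]

saved = matmul_resumable(a, b, c, interrupt_after=1)  # interrupted after one row
final = matmul_resumable(a, b, c, progress=saved)     # restart from saved progress
```

Restarting with the saved indicator skips rows whose results are already stored, so no completed work is repeated.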

    Interruptible and restartable matrix multiplication instructions, processors, methods, and systems

    Publication number: US12204898B2

    Publication date: 2025-01-21

    Application number: US18240287

    Filing date: 2023-08-30

    Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.

    Processors, methods, and systems for debugging a configurable spatial accelerator

    Publication number: US11086816B2

    Publication date: 2021-08-10

    Application number: US15719281

    Filing date: 2017-09-28

    Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.
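
The firing rule described here, where an operator performs its operation once a full incoming operand set has arrived, can be sketched in software together with a halt flag of the kind a debugger might set. The class below is a hypothetical illustration only: its names and the per-port queues are assumptions, and it does not model the interconnect network or the actual halt mechanism.

```python
from collections import deque


class DataflowOperator:
    """Toy dataflow operator: fires when every input port holds an operand."""

    def __init__(self, fn, arity):
        self.fn = fn
        self.inputs = [deque() for _ in range(arity)]  # one queue per port
        self.halted = False  # hypothetical debug flag

    def push(self, port, value):
        self.inputs[port].append(value)

    def step(self):
        """Fire once if not halted and a full operand set is available."""
        if self.halted or not all(self.inputs):
            return None  # nothing fired; queued operands are kept
        return self.fn(*(q.popleft() for q in self.inputs))
```

While `halted` is set, operands stay queued, so clearing the flag lets execution continue without losing data.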

    Processors, methods, and systems with a configurable spatial accelerator

    Publication number: US10515046B2

    Publication date: 2019-12-24

    Application number: US15640543

    Filing date: 2017-07-01

    Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a synchronizer circuit coupled between an interconnect network of a first tile and an interconnect network of a second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile.
