Flexible-access instructions for efficient access of ML data

    Publication number: US11971949B2

    Publication date: 2024-04-30

    Application number: US17173203

    Filing date: 2021-02-10

    CPC classification number: G06F17/16 G06F9/30101 G06F17/15 G06N3/08

    Abstract: A graphics processing unit (GPU) and a method are disclosed that perform a convolution operation recast as a matrix multiplication operation. The GPU includes a register file, a processor and a state machine. The register file stores data of an input feature map and data of a filter weight kernel. The processor performs the convolution operation on the data of the input feature map and the data of the filter weight kernel as a matrix multiplication operation. The state machine facilitates performance of the convolution operation by unrolling the data of the input feature map and the data of the filter weight kernel in the register file. The state machine includes control registers that determine movement of data through the register file so that the matrix multiplication operation is performed on the data in the register file in an unrolled manner.
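The recasting of convolution as matrix multiplication that the abstract describes can be illustrated in software. The sketch below is only an illustration under stated assumptions and is not the patented hardware mechanism: it uses the common "im2col" unrolling on a single-channel feature map with stride 1 and no padding, and it computes the cross-correlation form of convolution (no kernel flip) that is customary in ML. The function names `im2col`, `conv2d_as_matmul` and `conv2d_direct` are hypothetical and do not come from the patent.

```python
def im2col(fmap, kh, kw):
    """Unroll every kh x kw window of a 2-D feature map into one row.

    Assumes stride 1 and no padding (illustrative choices, not from the patent).
    """
    h, w = len(fmap), len(fmap[0])
    rows = []
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            rows.append([fmap[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows


def conv2d_as_matmul(fmap, kernel):
    """Convolution expressed as (unrolled patches) x (flattened kernel)."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(fmap, kh, kw)
    out_w = len(fmap[0]) - kw + 1
    # Matrix-vector product: one dot product per unrolled patch.
    flat_out = [sum(p * k for p, k in zip(patch, flat_kernel))
                for patch in patches]
    # Fold the flat result back into a 2-D output feature map.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


def conv2d_direct(fmap, kernel):
    """Reference sliding-window computation, used to check the unrolled form."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(fmap) - kh + 1
    out_w = len(fmap[0]) - kw + 1
    return [[sum(fmap[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


fmap = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_as_matmul(fmap, kernel)
```

Both routines give the same output feature map for this input. In the software version the unrolled patch matrix is materialized in memory, duplicating overlapping input elements; the abstract instead describes a state machine whose control registers steer data through the register file so that the multiplication proceeds in an unrolled manner.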
