Saving intermediate outputs of a neural network

    公开(公告)号:US11748622B1

    公开(公告)日:2023-09-05

    申请号:US16292236

    申请日:2019-03-04

    CPC classification number: G06N3/082 G06F8/433 G06F16/9024 G06N3/063

    Abstract: A computing system is configured to access intermediate outputs of a neural network by augmenting a data flow graph generated for the neural network. The data flow graph includes a plurality of nodes interconnected by connections, each node representing an operation to be executed by the neural network. To access the intermediate output, the data flow graph is augmented by inserting a node representing an operation that saves the output of a node which produces the intermediate output. The node representing the save operation is inserted while maintaining all existing nodes and connections in the data flow graph, thereby preserving the behavior of the data flow graph. The augmenting can be performed using a compiler that generates the data flow graph from program code.

    Neural network layer-by-layer debugging

    公开(公告)号:US11308396B2

    公开(公告)日:2022-04-19

    申请号:US16455329

    申请日:2019-06-27

    Abstract: Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the first machine instructions to generate a first device tensor, and the debugger program determines whether the first device tensor matches a first reference tensor. A shortest length is identified for which the first device tensor does not match the first reference tensor. Tensor output is enabled for a lower-level intermediate representation of the shortest neural network, and the neural network is converted into second machine instructions, which are executed by the target processor to generate a second device tensor.

    Transpose operations using processing element array

    公开(公告)号:US11347480B2

    公开(公告)日:2022-05-31

    申请号:US17122136

    申请日:2020-12-15

    Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, it may be necessary to transpose elements of a tensor to perform a matrix operation. The tensor may be decomposed into blocks of data elements having dimensions consistent with the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into a systolic array and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of results buffer can then be mapped to row partitions of a buffer memory for further processing.

    TRANSPOSE OPERATIONS USING PROCESSING ELEMENT ARRAY

    公开(公告)号:US20210096823A1

    公开(公告)日:2021-04-01

    申请号:US17122136

    申请日:2020-12-15

    Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, it may be necessary to transpose elements of a tensor to perform a matrix operation. The tensor may be decomposed into blocks of data elements having dimensions consistent with the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into a systolic array and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of results buffer can then be mapped to row partitions of a buffer memory for further processing.

Patent Agency Ranking