Breakpoints in neural network accelerator

    Publication No.: US12210438B1

    Publication Date: 2025-01-28

    Application No.: US17947949

    Application Date: 2022-09-19

    Abstract: Techniques are disclosed for setting a breakpoint for debugging a neural network. A debugger program executable by a host processor receives user input indicating a target layer of the neural network at which to halt execution. The neural network includes a first set of instructions to be executed by a first execution engine and a second set of instructions to be executed by a second execution engine. A first halt point is set within the first set of instructions and a second halt point is set within the second set of instructions. It is then determined that operation of the first execution engine and the second execution engine has halted and that the first execution engine has reached the first halt point. The second execution engine is then caused to step through instructions until it reaches the second halt point.
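
    As a concrete illustration of the flow in this abstract, the Python sketch below models the two engines with a hypothetical ExecutionEngine class; the run()/step() interface, the "wait" marker, and break_at_layer are illustrative stand-ins for the accelerator's real control and status registers, not an actual debugger API.

    # Minimal host-side sketch of the two-engine breakpoint flow (assumptions above).
    class ExecutionEngine:
        def __init__(self, instructions):
            self.instructions = instructions  # this engine's instruction stream
            self.pc = 0                       # index of the next instruction
            self.halted = False

        def run(self, halt_point):
            # Execute until the halt point, or stall earlier at a "wait"
            # instruction (e.g. a semaphore that waits on the other engine).
            while (self.pc < halt_point and self.pc < len(self.instructions)
                   and self.instructions[self.pc] != "wait"):
                self.pc += 1
            self.halted = True

        def step(self):
            # Debugger-driven single step past waits and ordinary instructions.
            self.pc += 1

    def break_at_layer(engine_a, engine_b, halt_a, halt_b):
        """Align both engines at the instructions that begin the target layer."""
        # Halt points derived from the target layer are set in each stream.
        engine_a.run(halt_a)
        engine_b.run(halt_b)

        # Determine that both engines have stopped issuing instructions.
        assert engine_a.halted and engine_b.halted

        # Engine A has reached its halt point; engine B may have stalled earlier,
        # so it is moved through instructions until it reaches its halt point.
        assert engine_a.pc == halt_a
        while engine_b.pc < halt_b:
            engine_b.step()

    For example, with engine_a holding eight matrix instructions and engine_b holding ["act", "wait", "act", "act"], break_at_layer(engine_a, engine_b, halt_a=4, halt_b=3) leaves both program counters at the start of the instructions for the target layer.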

    Multinomial distribution on an integrated circuit

    Publication No.: US10997277B1

    Publication Date: 2021-05-04

    Application No.: US16364837

    Application Date: 2019-03-26

    Abstract: An integrated circuit device such as a neural network accelerator can be programmed to select a numerical value based on a multinomial distribution. In various examples, the integrated circuit device can include an execution engine with multiple separate execution units that operate in parallel on different streams of data. For example, to make a selection based on a multinomial distribution, the execution units can be configured to compute cumulative sums over sets of numerical values, where the numerical values represent probabilities. To then obtain cumulative sums across the sets, the largest value from each set can be accumulated, and the accumulated values added, in parallel, back to the sets. The resulting cumulative sum across all the numerical values can then be used to randomly select a specific index, which identifies a particular numerical value as the selected value.
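
    To make the parallel cumulative-sum scheme concrete, here is a NumPy sketch assuming the probabilities are split evenly across a chosen number of execution units; the function name, array layout, and rng handling are illustrative assumptions rather than the device's programming interface.

    import numpy as np

    def multinomial_index(probs, num_units, rng=None):
        """Pick an index according to the probabilities in `probs`."""
        rng = rng or np.random.default_rng()
        # Split the numerical values into one set (row) per execution unit;
        # len(probs) is assumed to be a multiple of num_units.
        sets = np.asarray(probs, dtype=float).reshape(num_units, -1)

        # Each execution unit computes a cumulative sum over its own set,
        # with all units operating in parallel on their rows.
        local = np.cumsum(sets, axis=1)

        # The largest value of each set (its last cumulative sum) is itself
        # accumulated to produce a per-set offset ...
        offsets = np.concatenate(([0.0], np.cumsum(local[:, -1])[:-1]))

        # ... and the offsets are added back to the sets in parallel, giving
        # the cumulative sum across all the numerical values.
        global_cumsum = (local + offsets[:, None]).reshape(-1)

        # A uniform random draw against the global cumulative sum selects the
        # index that identifies the chosen numerical value.
        draw = rng.uniform(0.0, global_cumsum[-1])
        return int(np.searchsorted(global_cumsum, draw))

    For instance, multinomial_index([0.1, 0.2, 0.3, 0.4], num_units=2) returns index 3 about 40% of the time, matching the last probability.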

    Breakpoints in neural network accelerator

    Publication No.: US11467946B1

    Publication Date: 2022-10-11

    Application No.: US16368351

    Application Date: 2019-03-28

    Abstract: Techniques are disclosed for setting a breakpoint for debugging a neural network. A debugger program executable by a host processor receives user input indicating a target layer of the neural network at which to halt execution. The neural network includes a first set of instructions to be executed by a first execution engine and a second set of instructions to be executed by a second execution engine. A first halt point is set within the first set of instructions and a second halt point is set within the second set of instructions. It is then determined that operation of the first execution engine and the second execution engine has halted and that the first execution engine has reached the first halt point. The second execution engine is then caused to step through instructions until it reaches the second halt point.

    Transpose operations using processing element array

    Publication No.: US11347480B2

    Publication Date: 2022-05-31

    Application No.: US17122136

    Application Date: 2020-12-15

    Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, the elements of a tensor must be transposed before a matrix operation can be performed. The tensor may be decomposed into blocks of data elements whose dimensions match the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into the systolic array, and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of the results buffer can then be mapped to row partitions of a buffer memory for further processing.
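
    The data movement described here can be sketched in NumPy as follows, assuming a square systolic array of size pe and a 2-D tensor whose dimensions are multiples of pe; the explicit partition-copy loop stands in for the hardware's results-buffer bookkeeping.

    import numpy as np

    def transpose_via_pe_array(tensor, pe):
        rows, cols = tensor.shape
        out = np.empty((cols, rows), dtype=tensor.dtype)

        for r in range(0, rows, pe):
            for c in range(0, cols, pe):
                # Decompose the tensor into pe x pe blocks whose dimensions
                # match the systolic array.
                block = tensor[r:r + pe, c:c + pe]

                # Identity multiplication: streaming an identity matrix through
                # the loaded block reproduces its values, with the products
                # accumulating by column partition in the results buffer.
                psum = np.eye(pe, dtype=tensor.dtype) @ block

                # Map each column partition of the results buffer to a row
                # partition of the output buffer; this remapping is what
                # realizes the transpose.
                for j in range(pe):
                    out[c + j, r:r + pe] = psum[:, j]
        return out

    Calling transpose_via_pe_array(x, pe=128) on a 256x512 array returns the same result as x.T; the point of the scheme is that no dedicated transpose unit is needed, only the matrix-multiply datapath and a partition remapping.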

    TRANSPOSE OPERATIONS USING PROCESSING ELEMENT ARRAY

    Publication No.: US20210096823A1

    Publication Date: 2021-04-01

    Application No.: US17122136

    Application Date: 2020-12-15

    Abstract: Provided are integrated circuits and methods for transposing a tensor using processing element array operations. In some cases, the elements of a tensor must be transposed before a matrix operation can be performed. The tensor may be decomposed into blocks of data elements whose dimensions match the dimensions of a systolic array. An identity multiplication may be performed on each block of data elements loaded into the systolic array, and the multiplication products summed in column partitions of a results buffer. The data elements in the column partitions of the results buffer can then be mapped to row partitions of a buffer memory for further processing.

    Assisted indirect memory addressing

    Publication No.: US10929063B1

    Publication Date: 2021-02-23

    Application No.: US16368538

    Application Date: 2019-03-28

    Abstract: Systems and methods for assisted indirect memory addressing are provided. Some computing systems move data between levels of a hierarchical memory system. To accommodate data movement in computing systems that do not natively support indirect addressing between levels of the memory hierarchy, a direct memory access (DMA) engine is used to fetch data. The DMA engine executes a first set of memory instructions that modifies a second set of memory instructions; the modified second set then fetches data stored at one level of the memory hierarchy using dynamically computed indirect addresses that are stored in memory locations at another level of the hierarchy.
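
    The two-pass descriptor idea can be sketched in plain Python; the flat memory list, the descriptor dictionaries, and the dma_copy helper below are hypothetical stand-ins for the DMA engine's real descriptor queues and the levels of the memory hierarchy.

    def dma_copy(memory, src, dst, length):
        # One DMA transfer: copy `length` words from address src to address dst.
        memory[dst:dst + length] = memory[src:src + length]

    def indirect_gather(memory, pointer_addr, count, dest_base):
        """Gather `count` words whose source addresses are stored at pointer_addr."""
        # Second set of memory instructions: gather descriptors whose source
        # addresses are placeholders to be filled in before they execute.
        gather_descs = [{"src": None, "dst": dest_base + i, "len": 1}
                        for i in range(count)]

        # First set of memory instructions: read the dynamically computed
        # addresses out of memory and patch them into the second set.
        for i in range(count):
            gather_descs[i]["src"] = memory[pointer_addr + i]

        # The DMA engine then executes the patched second set, fetching the
        # data from the indirect addresses even though the hardware has no
        # native indirect-addressing mode.
        for d in gather_descs:
            dma_copy(memory, d["src"], d["dst"], d["len"])

    For example, if memory[50:53] holds the addresses [10, 20, 30], then indirect_gather(memory, pointer_addr=50, count=3, dest_base=90) copies memory[10], memory[20], and memory[30] into memory[90:93].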
