Method, product, and apparatus for a multidimensional processing array for hardware acceleration of convolutional neural network inference

    Publication number: US11687831B1

    Publication date: 2023-06-27

    Application number: US16946674

    Filing date: 2020-06-30

    CPC classification number: G06N20/00

    Abstract: An approach includes receiving a machine learning processing job, executing the job by processing multiple output pixels in parallel each cycle (walking data across processing elements, broadcasting weights within regions, and executing parallel multiplication operations), and generating an output indicating whether the job succeeded or failed. In some embodiments, a schedule of actions is generated for each machine learning processing job. The schedule of actions may include any of: a plurality of shift operations in a many-to-many or one-to-many arrangement; shifting data across region boundaries; and fetching data and weights from a memory and distributing them to a plurality of regions (e.g., weights are distributed to respective weight memories, which subsequently broadcast those weights in an order specified by the schedule of actions, and data is distributed to respective processing elements).
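The idea of walking data across processing elements while a weight is broadcast can be illustrated with a one-dimensional toy model. This is only a sketch under stated assumptions, not the patented design: the function name `conv1d_by_shifting`, the single region, and the one-PE-per-output-pixel layout are all hypothetical choices made for illustration.

```python
from typing import List


def conv1d_by_shifting(data: List[int], kernel: List[int]) -> List[int]:
    """Toy 1-D model: one processing element (PE) per output pixel.

    Assumption (not from the source): a single region, one weight
    broadcast per cycle, and data shifted by one PE after each cycle.
    """
    n_out = len(data) - len(kernel) + 1
    acc = [0] * n_out          # one accumulator per PE
    window = list(data)        # window[p] is the value currently held by PE p

    for weight in kernel:      # one broadcast weight per cycle
        # In hardware these multiplications would run in parallel.
        for p in range(n_out):
            acc[p] += weight * window[p]
        # Walk the data one PE to the left, padding the tail with zero.
        window = window[1:] + [0]

    return acc


if __name__ == "__main__":
    print(conv1d_by_shifting([1, 2, 3, 4, 5], [1, 0, -1]))
```

After cycle `k`, PE `p` holds `data[p + k]`, so each accumulator ends up with the sliding-window sum for its own output pixel, and all output pixels advance together every cycle.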

    Method, product, and apparatus for variable precision weight management for neural networks

    Publication number: US11615320B1

    Publication date: 2023-03-28

    Application number: US16946675

    Filing date: 2020-06-30

    Abstract: An approach includes identifying a machine learning model for processing and generating an ordered set of weights with varying precisions, together with metadata that specifies where those values can be found so that the weights needed during processing can be identified. In a first embodiment, the variable precision weights are separated into different memory segments, where each segment holds weights of only a single precision. In a second embodiment, the variable precision weights are provided in a memory where weights of different precisions are intermingled, and those weights are identified using a sequence of pairs of data, each pair giving a number of weights that share a precision and the precision of those weights. In some embodiments, the first and second embodiments are combined: some segments contain weights of only a single precision, and at least one segment stores weights of different precisions.
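The second embodiment's (count, precision) pairs can be sketched as a run-length index over a packed bit stream. The following is a minimal illustration, assuming unsigned weights packed most-significant-bit first into a single Python integer; the names `pack_weights`, `locate_weight`, and `read_weight` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (number of weights, precision in bits)


def pack_weights(weights: List[int], runs: List[Run]) -> Tuple[int, int]:
    """Pack unsigned weights MSB-first; returns (packed value, total bits)."""
    packed, total_bits, idx = 0, 0, 0
    for count, bits in runs:
        for _ in range(count):
            packed = (packed << bits) | (weights[idx] & ((1 << bits) - 1))
            total_bits += bits
            idx += 1
    return packed, total_bits


def locate_weight(runs: List[Run], index: int) -> Tuple[int, int]:
    """Return (bit offset, precision) of the weight at position `index`."""
    first, bit_offset = 0, 0
    for count, bits in runs:
        if index < first + count:
            return bit_offset + (index - first) * bits, bits
        first += count
        bit_offset += count * bits
    raise IndexError("weight index out of range")


def read_weight(packed: int, total_bits: int, runs: List[Run], index: int) -> int:
    """Extract one weight from the intermingled stream using the metadata."""
    bit_offset, bits = locate_weight(runs, index)
    shift = total_bits - bit_offset - bits
    return (packed >> shift) & ((1 << bits) - 1)


if __name__ == "__main__":
    runs = [(2, 8), (3, 4)]            # two 8-bit weights, then three 4-bit weights
    weights = [200, 17, 3, 15, 9]
    packed, total = pack_weights(weights, runs)
    print([read_weight(packed, total, runs, i) for i in range(len(weights))])
```

Because each pair covers a whole run, the metadata stays small when neighbouring weights share a precision, at the cost of a linear scan over the runs to find a given weight.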

    Method, product, and apparatus for a machine learning process leveraging input sparsity on a pixel by pixel basis

    Publication number: US11676068B1

    Publication date: 2023-06-13

    Application number: US16946673

    Filing date: 2020-06-30

    CPC classification number: G06N20/00 G06F17/16

    Abstract: An approach includes a method, product, and apparatus for dynamically removing sparse data on a pixel by pixel basis. In some embodiments, a machine learning processing job is received. The job is then executed on a pixel by pixel basis by selecting non-zero data values for input into a systolic array, wherein sparse data is not selected for input into the systolic array. Subsequently, a message is generated that indicates whether the execution completed successfully. In some embodiments, the machine learning processing job comprises at least a plurality of multiply and accumulate operations. In some embodiments, at least one data value equal to zero for the machine learning processing job is not input into the systolic array. In some embodiments, a plurality of weights is input into a plurality of columns for each cycle.
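Selecting only the non-zero values of a pixel before the multiply and accumulate step can be sketched in software as follows. This is an illustrative model only, assuming a pixel is a list of channel values and each array column holds one weight per input channel; `compress_pixel` and `sparse_mac` are hypothetical names.

```python
from typing import List, Tuple


def compress_pixel(pixel: List[int]) -> List[Tuple[int, int]]:
    """Keep only non-zero values, remembering each value's channel index."""
    return [(i, v) for i, v in enumerate(pixel) if v != 0]


def sparse_mac(pixel: List[int], weight_columns: List[List[int]]) -> Tuple[List[int], int]:
    """Multiply-accumulate one pixel against every column of weights.

    Assumption: weight_columns[c][i] is the weight that column c applies
    to input channel i. Returns (outputs, MAC operations per column).
    """
    selected = compress_pixel(pixel)
    outputs = [sum(v * column[i] for i, v in selected) for column in weight_columns]
    return outputs, len(selected)


if __name__ == "__main__":
    pixel = [0, 3, 0, 2]
    columns = [[1, 2, 3, 4], [5, 6, 7, 8]]
    print(sparse_mac(pixel, columns))
```

The retained channel index is what lets each surviving value meet its matching weight; skipping the zeros changes the amount of work, not the result.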

    Method, product, and apparatus for a machine learning process using dynamic rearrangement of sparse data and corresponding weights

    Publication number: US11651283B1

    Publication date: 2023-05-16

    Application number: US16946672

    Filing date: 2020-06-30

    CPC classification number: G06N20/00 G06F17/16

    Abstract: An approach is described for a method, product, and apparatus for a machine learning process using dynamic rearrangement of sparse data and corresponding weights. The approach dynamically rearranges input data to move sparse data to a location such that computations on the sparse data might be avoided when executing a machine learning processing job. For example, sparse data within each row of the input matrix can be moved to the end of that row. When the input data is folded to fit the array, that sparse data might be at least partially contained within a fold that comprises only sparse data and possibly filler data. In such an event, computations on the fold are unnecessary and are avoided. In some embodiments, the approach includes dynamically rearranging a weight matrix to maintain a correspondence between the input data and the weights.
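The effect of moving zeros to the end of a row and skipping all-zero folds can be sketched for a single input row. This is a simplified illustration, assuming one row multiplied against a weight matrix whose rows correspond to input positions; the names `rearrange` and `folded_matvec`, and the fold width parameter, are hypothetical.

```python
from typing import List, Tuple


def rearrange(row: List[int], weights: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Move zeros to the end of the row and permute weight rows to match."""
    # Stable sort: non-zero positions first, original order otherwise preserved.
    order = sorted(range(len(row)), key=lambda i: row[i] == 0)
    return [row[i] for i in order], [weights[i] for i in order]


def folded_matvec(row: List[int], weights: List[List[int]], width: int) -> Tuple[List[int], int]:
    """Compute row x weights in folds of `width`, skipping all-zero folds.

    Returns (output vector, number of folds actually computed).
    """
    row, weights = rearrange(row, weights)
    n_out = len(weights[0])
    out = [0] * n_out
    folds_run = 0
    for start in range(0, len(row), width):
        fold = row[start:start + width]
        if not any(fold):
            continue  # fold holds only sparse data: no computation needed
        folds_run += 1
        for k, value in enumerate(fold):
            for j in range(n_out):
                out[j] += value * weights[start + k][j]
    return out, folds_run


if __name__ == "__main__":
    row = [0, 4, 0, 0, 1, 0]
    weights = [[1, 0], [2, 1], [3, 0], [4, 1], [5, 2], [6, 3]]
    print(folded_matvec(row, weights, width=2))
```

In this example the unsorted row would touch two of its three folds, while the rearranged row packs both non-zero values into the first fold, so only one fold is computed. Applying the same permutation to the weight rows is what keeps the result unchanged.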
