COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR

    Publication number: US20240095541A1

    Publication date: 2024-03-21

    Application number: US17946409

    Filing date: 2022-09-16

    Applicant: Apple Inc.

    CPC classification number: G06N3/10 G06N3/063

    Abstract: Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated with two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during streaming operation, a subset of the tasks having completion times close in time are assigned to a same portion of memory in the neural processor during a compilation process. After the tasks assigned to the same portion of the memory are finished, the portion of the memory may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and more efficiently perform the operations at the neural processor.
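
The memory-assignment idea in this abstract can be illustrated with a minimal sketch. It is not the patented compiler: the `Task` type, the `window` threshold, the task names, and the greedy grouping rule are all assumptions made for illustration. Tasks are sorted by an estimated completion time, and tasks that finish within `window` time units of the first task in a group share one memory region, which could be flushed once the last of them is done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    completion_time: int  # hypothetical estimated finish time, arbitrary units


def group_by_completion(tasks, window):
    """Greedily group tasks whose completion times lie within `window`
    of the earliest task already in the group (an assumed heuristic)."""
    groups = []
    for task in sorted(tasks, key=lambda t: t.completion_time):
        if groups and task.completion_time - groups[-1][0].completion_time <= window:
            groups[-1].append(task)   # finishes close in time: share the region
        else:
            groups.append([task])     # start a new memory region
    return groups


# Hypothetical tasks from a spatially partitioned (tiled) tensor.
tasks = [
    Task("conv1_tile0", 10),
    Task("conv2_tile0", 12),
    Task("conv1_tile1", 20),
    Task("conv2_tile1", 23),
    Task("conv3_tile0", 40),
]

regions = group_by_completion(tasks, window=5)
region_names = [[t.name for t in group] for group in regions]
# A region can be flushed once its last task has completed.
flush_times = [max(t.completion_time for t in group) for group in regions]
```

With these made-up numbers the tasks fall into three regions, and each region becomes reusable at the completion time of its slowest member.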

    SUBTASK STORAGE FOR STREAMING CONVOLUTIONS IN NEURAL NETWORK PROCESSOR

    Publication number: US20230394276A1

    Publication date: 2023-12-07

    Application number: US17833476

    Filing date: 2022-06-06

    Applicant: Apple Inc.

    CPC classification number: G06N3/04 G06F9/4881 G06F9/5016

    Abstract: Embodiments relate to streaming convolution operations in a neural processor circuit that includes a neural engine circuit and a neural task manager. The neural task manager obtains multiple task descriptors and multiple subtask descriptors. Each task descriptor identifies a respective set of the convolution operations of a respective layer of a set of layers. Each subtask descriptor identifies a corresponding task descriptor and a subset of the convolution operations on a portion of a layer of the set of layers identified by the corresponding task descriptor. The neural processor circuit configures the neural engine circuit for execution of the subset of the convolution operations using the corresponding task descriptor. The neural engine circuit performs the subset of the convolution operations to generate output data that correspond to input data of another subset of the convolution operations identified by another subtask descriptor from the list of subtask descriptors.
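
The relationship between task descriptors and subtask descriptors can be sketched as follows. This is an illustrative model only, not the circuit described in the application: the class fields, the `run` function, and the one-dimensional data are assumptions, and a per-element multiplication stands in for the convolution so that the example stays short. Each subtask names its task descriptor and a portion of the layer, and the output of a layer-0 subtask becomes the input of the matching layer-1 subtask.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDescriptor:
    layer: int    # layer whose operations this task configures
    weight: int   # stand-in for the layer's kernel configuration


@dataclass(frozen=True)
class SubtaskDescriptor:
    task_index: int  # identifies the corresponding task descriptor
    start: int       # first element of the portion to process
    stop: int        # one past the last element of the portion


def run(tasks, subtasks, data):
    """Execute subtasks in list order. buffers[k] is the input of layer k;
    the last buffer holds the final output."""
    buffers = [list(data)] + [[None] * len(data) for _ in tasks]
    for sub in subtasks:
        task = tasks[sub.task_index]          # configure from the task descriptor
        src, dst = buffers[task.layer], buffers[task.layer + 1]
        for i in range(sub.start, sub.stop):  # process only this portion
            dst[i] = src[i] * task.weight     # multiplication replaces convolution
    return buffers[-1]


tasks = [TaskDescriptor(layer=0, weight=2), TaskDescriptor(layer=1, weight=3)]
# Layers are interleaved per portion instead of finishing layer 0 first.
subtasks = [
    SubtaskDescriptor(0, 0, 2),
    SubtaskDescriptor(1, 0, 2),
    SubtaskDescriptor(0, 2, 4),
    SubtaskDescriptor(1, 2, 4),
]
result = run(tasks, subtasks, [1, 2, 3, 4])
```

Because layer 1 starts on the first portion before layer 0 has processed the second, only a portion of the intermediate data needs to be live at a time, which is the point of streaming. A real convolution would also need overlapping input rows between portions, which this sketch omits.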
