COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR

    Publication number: US20240095541A1

    Publication date: 2024-03-21

    Application number: US17946409

    Application date: 2022-09-16

    Applicant: Apple Inc.

    CPC classification number: G06N3/10 G06N3/063

    Abstract: Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated with two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during the streaming operation, a subset of the tasks whose completion times are close to one another is assigned to the same portion of memory in the neural processor during compilation. After the tasks assigned to that portion of the memory have finished, the portion may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and perform the operations more efficiently at the neural processor.
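
The memory assignment idea in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the `Task` class, the `assign_regions` and `flush_times` functions, the `window` parameter, the task names, and the cycle counts are all hypothetical, chosen only to show tasks with nearby completion times sharing one memory portion that is flushed once the last of them finishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    completion_time: int  # hypothetical estimated finish time, in arbitrary cycles


def assign_regions(tasks, window):
    """Group tasks into memory regions (assumed greedy rule, not from the source).

    A task joins the current region if it completes within `window` cycles
    of the earliest-completing task already in that region.
    """
    regions = []
    for task in sorted(tasks, key=lambda t: t.completion_time):
        if regions and task.completion_time - regions[-1][0].completion_time <= window:
            regions[-1].append(task)
        else:
            regions.append([task])
    return regions


def flush_times(regions):
    """A region can be flushed only after its last task has finished."""
    return [max(t.completion_time for t in region) for region in regions]


# Hypothetical tasks from two layers operating on spatial tiles of a tensor.
tasks = [
    Task("conv1_tile0", 10),
    Task("conv2_tile0", 12),
    Task("conv1_tile1", 20),
    Task("conv2_tile1", 23),
    Task("pool_tile0", 40),
]

regions = assign_regions(tasks, window=5)
for index, region in enumerate(regions):
    names = [t.name for t in region]
    print(f"region {index}: {names}, flush after cycle {flush_times(regions)[index]}")
```

Grouping by completion time keeps each memory portion occupied for a short, well-defined interval, so it can be reused as soon as its group finishes. A real compiler would also have to account for buffer sizes and data dependencies, which this sketch ignores.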
