Publication No.: US20240095541A1
Publication Date: 2024-03-21
Application No.: US17946409
Filing Date: 2022-09-16
Applicant: Apple Inc.
Inventor: Sayyed Karen Khatamifard, Thomas G. Anderl, Alexander J. Kirchhoff, Keith Wyss, Dylan H. Rush, Chenfan Sun, Jeffrey D Marker
Abstract: Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated with two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during the streaming operation, a subset of the tasks whose completion times are close to one another is assigned to the same portion of memory in the neural processor during a compilation process. After the tasks assigned to the same portion of the memory are finished, that portion of the memory may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and to perform the operations more efficiently at the neural processor.
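The memory assignment idea in the abstract can be illustrated with a minimal sketch. It is not the claimed compilation method: the `Task` type, the integer completion ticks, the `window` threshold, and the greedy grouping rule are all assumptions made for illustration. Tasks whose estimated completion times fall close together share one memory portion, and that portion can be flushed once its last task finishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical compiled task; completion_time is an assumed estimate in ticks."""
    name: str
    completion_time: int


def group_by_completion(tasks, window):
    """Greedily group tasks so each group spans at most `window` ticks,
    measured from the earliest-finishing task in the group.
    Each group stands for one memory portion (an illustrative assumption)."""
    groups = []
    for task in sorted(tasks, key=lambda t: t.completion_time):
        if groups and task.completion_time - groups[-1][0].completion_time <= window:
            groups[-1].append(task)
        else:
            groups.append([task])
    return groups


def flush_times(groups):
    """A portion can be flushed once the last task assigned to it has finished."""
    return [max(t.completion_time for t in group) for group in groups]


# Tiles of two overlapping layers plus a later pooling task (made-up numbers).
tasks = [
    Task("conv1_tile0", 10),
    Task("conv2_tile0", 12),
    Task("conv1_tile1", 20),
    Task("conv2_tile1", 23),
    Task("pool_tile0", 40),
]

groups = group_by_completion(tasks, window=5)
group_names = [[t.name for t in group] for group in groups]
flushes = flush_times(groups)

for portion, (names, when) in enumerate(zip(group_names, flushes)):
    print(f"portion {portion}: {names} -> flush at tick {when}")
```

With these made-up inputs, the two tile-0 convolution tasks share a portion, the two tile-1 tasks share another, and the pooling task gets its own. A smaller `window` would yield more portions that are each flushed sooner, while a larger one would yield fewer portions that stay occupied longer.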