-
Publication No.: US20240095541A1
Publication Date: 2024-03-21
Application No.: US17946409
Application Date: 2022-09-16
Applicant: Apple Inc.
Inventor: Sayyed Karen Khatamifard , Thomas G. Anderl , Alexander J. Kirchhoff , Keith Wyss , Dylan H. Rush , Chenfan Sun , Jeffrey D. Marker
Abstract: Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated with two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during streaming operation, a subset of the tasks having completion times close in time are assigned to a same portion of memory in the neural processor during a compilation process. After the tasks assigned to the same portion of the memory are finished, the portion of the memory may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and more efficiently perform the operations at the neural processor.
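The compile-time grouping described above can be illustrated with a minimal sketch. All names here (`assign_buffers`, `completes_at`, `window`) are hypothetical stand-ins, not from the patent; the idea is only that tasks whose completion times fall close together share one memory portion, which can be flushed once all of its tasks finish.

```python
def assign_buffers(tasks, window):
    """Group tasks whose completion times lie within `window` of the
    first task in the current group into one memory portion.
    Each portion can be flushed as soon as all its tasks complete."""
    portions = []  # list of (portion_id, [tasks sharing that portion])
    for task in sorted(tasks, key=lambda t: t["completes_at"]):
        if portions and (
            task["completes_at"] - portions[-1][1][0]["completes_at"] <= window
        ):
            portions[-1][1].append(task)   # close in time: reuse portion
        else:
            portions.append((len(portions), [task]))  # open a new portion
    return portions
```

A real compiler would also account for portion capacity and task dependencies; this sketch only shows the completion-time heuristic.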
-
Publication No.: US20230394276A1
Publication Date: 2023-12-07
Application No.: US17833476
Application Date: 2022-06-06
Applicant: Apple Inc.
Inventor: Sayyed Karen Khatamifard , Chenfan Sun , Alon Yaakov , Husam Khashiboun , Jeffrey D. Marker , Saman Naderiparizi , Ramana V. Rachakonda , Rohit K. Gupta
CPC classification number: G06N3/04 , G06F9/4881 , G06F9/5016
Abstract: Embodiments relate to streaming convolution operations in a neural processor circuit that includes a neural engine circuit and a neural task manager. The neural task manager obtains multiple task descriptors and multiple subtask descriptors. Each task descriptor identifies a respective set of the convolution operations of a respective layer of a set of layers. Each subtask descriptor identifies a corresponding task descriptor and a subset of the convolution operations on a portion of a layer of the set of layers identified by the corresponding task descriptor. The neural processor circuit configures the neural engine circuit for execution of the subset of the convolution operations using the corresponding task descriptor. The neural engine circuit performs the subset of the convolution operations to generate output data that correspond to input data of another subset of the convolution operations identified by another subtask descriptor of the multiple subtask descriptors.
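The descriptor relationship in this abstract can be sketched with two small data structures. The class and field names (`TaskDescriptor`, `SubtaskDescriptor`, `conv_range`) are illustrative assumptions, not the hardware descriptor format; they only show that each subtask points back at a parent task descriptor and names a subset of that layer's convolution operations.

```python
from dataclasses import dataclass

@dataclass
class TaskDescriptor:
    """Identifies the full set of convolution operations of one layer."""
    layer: int
    num_convolutions: int

@dataclass
class SubtaskDescriptor:
    """Identifies a parent task descriptor and a subset of its
    convolution operations (here, a contiguous index range)."""
    task: TaskDescriptor
    conv_range: range

def expand(subtasks):
    """Resolve each subtask through its parent task descriptor,
    yielding (layer, convolution indices) pairs to configure."""
    return [(st.task.layer, list(st.conv_range)) for st in subtasks]
```

In the claimed scheme, the output of one subtask's convolutions becomes the input of a later subtask, which is what allows the convolutions of successive layers to be interleaved.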
-
Publication No.: US20230368008A1
Publication Date: 2023-11-16
Application No.: US17745032
Application Date: 2022-05-16
Applicant: Apple Inc.
Inventor: Sayyed Karen Khatamifard , Alexander J. Kirchhoff , Rohit K. Gupta , Jeffrey D. Marker , Thomas G. Anderl , Saman Naderiparizi , Chenfan Sun , Alon Yaakov , Husam Khashiboun , Ramana V. Rachakonda
IPC: G06N3/063
CPC classification number: G06N3/063
Abstract: Embodiments relate to streaming operations in a neural processor circuit that includes a neural engine circuit and a data processor circuit. The neural engine circuit performs first operations on a first input tensor of a first layer to generate a first output tensor, and second operations on a second input tensor of a second layer at a higher hierarchy than the first layer, the second input tensor corresponding to the first output tensor. The data processor circuit stores a portion of the first input tensor for access by the neural engine circuit to perform a subset of the first operations and generate a portion of the first output tensor. The data processor circuit stores the portion of the first output tensor for access by the neural engine circuit as a portion of the second input tensor to perform a subset of the second operations.
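The tile-by-tile flow described above can be sketched as follows. The function and parameter names are hypothetical, and the layers are stand-in callables: the point is only that each portion of the first layer's output tensor is buffered and immediately reused as a portion of the second layer's input tensor, rather than materializing the whole intermediate tensor.

```python
def stream_two_layers(input_tiles, layer1, layer2):
    """Interleave first-layer and second-layer work tile by tile.
    Each first-layer output portion is held in a buffer (the role the
    data processor circuit plays in the abstract) and consumed right
    away as the corresponding second-layer input portion."""
    outputs = []
    for tile in input_tiles:
        buffered = layer1(tile)           # portion of first output tensor
        outputs.append(layer2(buffered))  # same portion, now layer-2 input
    return outputs
```

Only one intermediate portion is live at a time, which is what makes the streaming scheme memory-efficient compared with running the layers to completion one after another.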
-
Publication No.: US11657124B2
Publication Date: 2023-05-23
Application No.: US16215540
Application Date: 2018-12-10
Applicant: Apple Inc.
Inventor: Peter Zatloukal , Matthew Weaver , Alexander Kirchhoff , Dmitry Belenko , Ali Farhadi , Mohammad Rastegari , Andrew Luke Chronister , Keith Patrick Wyss , Chenfan Sun
CPC classification number: G06F21/105 , G06F21/12 , G06N3/08 , G06N3/10 , H04L9/0891 , H04L9/30 , G06F2221/0755
Abstract: In one embodiment, a method includes receiving a user request from a client device associated with a user, accessing an instructional file comprising one or more binary inference engines and one or more encrypted model data corresponding to the one or more binary inference engines, respectively, selecting a binary inference engine from the one or more binary inference engines in the accessed instructional file based on the user request, sending a validation request for a permission to execute the binary inference engine to a licensing server, receiving the permission from the licensing server, decrypting the encrypted model data corresponding to the binary inference engine by a decryption key, executing the binary inference engine based on the user request and the decrypted model data, and sending one or more execution results responsive to the execution of the binary inference engine to the client device.
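The claimed method reads as a linear flow: select an engine from the instructional file, validate permission with the licensing server, decrypt the model data, then execute. A minimal sketch, with `licensing` and `decrypt` as stand-in callables and all names assumed rather than taken from the patent:

```python
def run_inference(request, engines, licensing, decrypt):
    """Hypothetical flow mirroring the claimed method: select a binary
    inference engine based on the user request, send a validation
    request to the licensing server, decrypt the engine's model data,
    and execute the engine against the request input."""
    engine = engines[request["engine_name"]]
    if not licensing(engine["id"]):              # validation request
        raise PermissionError("licensing server denied execution")
    model = decrypt(engine["encrypted_model"])   # per-engine model data
    return engine["run"](request["input"], model)
```

The decryption step sits after the license check so that model data is never exposed for an engine the server has not authorized.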