-
Publication No.: US20250131286A1
Publication Date: 2025-04-24
Application No.: US19000562
Filing Date: 2024-12-23
Applicant: Apple Inc.
Inventor: Gaurav KAPOOR , Cecile M. FORET , Francesco ROSSI , Kit-Man WAN , Umesh S. VAISHAMPAYAN , Etienne BELANGER , Albert ANTONY , Alexey MARINICHEV , Marco ZULIANI , Xiaojin SHI
Abstract: The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, some of which are executable on multiple processors of the target platform. The subject technology sorts the operations from the multiple layers into a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. Based at least in part on the cost of transferring operations between the multiple processors, the subject technology determines an assignment of one of the multiple processors for each sorted operation of each layer in a manner that minimizes the total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation indicating the processor assigned to each of the operations.
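The sort-then-assign scheme described in the abstract can be sketched as a dynamic program over the sorted operation list, trading per-operation execution cost against inter-processor transfer cost. This is a minimal illustrative sketch, not the patented method; the function name, cost tables, and processor labels are all invented for illustration.

```python
def assign_processors(ops, processors, exec_cost, transfer_cost):
    """Pick a processor for each op in `ops` (already sorted) so that total
    execution cost plus transfer cost between consecutive ops is minimal.

    exec_cost[(op, p)]    -> cost of running op on processor p
    transfer_cost[(p, q)] -> cost of moving data from processor p to q
    """
    INF = float("inf")
    first = ops[0]
    # best[p]: minimal cost of the prefix scheduled so far, latest op on p
    best = {p: exec_cost.get((first, p), INF) for p in processors}
    back_ptrs = []
    for op in ops[1:]:
        new_best, back = {}, {}
        for p in processors:
            # cheapest predecessor processor from which to arrive at p
            prev = min(processors, key=lambda q: best[q] + transfer_cost[(q, p)])
            new_best[p] = (best[prev] + transfer_cost[(prev, p)]
                           + exec_cost.get((op, p), INF))
            back[p] = prev
        best = new_best
        back_ptrs.append(back)
    # reconstruct the minimizing assignment from the back-pointers
    last = min(processors, key=lambda p: best[p])
    total = best[last]
    assignment = [last]
    for back in reversed(back_ptrs):
        last = back[last]
        assignment.append(last)
    return list(reversed(assignment)), total
```

The per-layer annotation step in the abstract would then amount to attaching each entry of the returned assignment to its operation.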
-
Publication No.: US20230177350A1
Publication Date: 2023-06-08
Application No.: US17903991
Filing Date: 2022-09-06
Applicant: Apple Inc.
Inventor: Gaurav KAPOOR , Cecile M. FORET , Francesco ROSSI , Kit-Man WAN , Umesh S. VAISHAMPAYAN , Etienne BELANGER , Albert ANTONY , Alexey MARINICHEV , Marco ZULIANI , Xiaojin SHI
CPC classification number: G06N3/10 , G06N3/08 , G06N3/04 , G06F8/443 , G06F8/41 , G06F8/4441 , G06N3/063 , G06F9/50
Abstract: The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, some of which are executable on multiple processors of the target platform. The subject technology sorts the operations from the multiple layers into a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. Based at least in part on the cost of transferring operations between the multiple processors, the subject technology determines an assignment of one of the multiple processors for each sorted operation of each layer in a manner that minimizes the total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation indicating the processor assigned to each of the operations.
-
Publication No.: US20210398021A1
Publication Date: 2021-12-23
Application No.: US17347563
Filing Date: 2021-06-14
Applicant: Apple Inc.
Inventor: Umesh S. VAISHAMPAYAN , Gaurav KAPOOR , Kit-Man WAN
IPC: G06N20/00
Abstract: A device implementing a system to execute machine learning models from memory includes at least one processor configured to receive a request to provide an input to one or more machine learning (ML) models arranged into a graph of connected layers, the one or more ML models being stored in a first type of memory. The at least one processor is further configured to divide the graph of connected layers into a plurality of segments such that at least two of the segments concurrently fit within allocated space of a second type of memory. The at least one processor is further configured to cause the input to be processed through a first segment of the plurality of segments using the second type of memory while a second segment is concurrently loaded from the first type of memory into the second type of memory.
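The segment-wise execution in this abstract resembles double buffering: one slot of fast memory runs the current segment while another is filled with the next segment from slow memory. The sketch below is a hedged illustration, not the patented implementation; the greedy packing, the thread-based prefetch, and all names are invented.

```python
import threading

def split_into_segments(layers, budget, size_of):
    """Greedily pack consecutive layers into segments whose total size
    fits the space budgeted for one slot of fast memory."""
    segments, current, used = [], [], 0
    for layer in layers:
        s = size_of(layer)
        if current and used + s > budget:
            segments.append(current)
            current, used = [], 0
        current.append(layer)
        used += s
    if current:
        segments.append(current)
    return segments

def run_pipelined(segments, load, execute, x):
    """Process input x through the segments, overlapping the load of the
    next segment with execution of the current one."""
    loaded = load(segments[0])
    for i, seg in enumerate(segments):
        nxt = {}
        if i + 1 < len(segments):
            # prefetch the next segment on a background thread
            t = threading.Thread(
                target=lambda s=segments[i + 1]: nxt.update(w=load(s)))
            t.start()
        x = execute(loaded, x)  # run the current segment in fast memory
        if i + 1 < len(segments):
            t.join()
            loaded = nxt["w"]
    return x
```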
-
Publication No.: US20240028890A1
Publication Date: 2024-01-25
Application No.: US18225656
Filing Date: 2023-07-24
Applicant: Apple Inc.
Inventor: Abhishek BHOWMICK , Ryan M. ROGERS , Umesh S. VAISHAMPAYAN , Andrew H. VYRROS
Abstract: Embodiments described herein provide a technique to crowdsource labeling of training data for a machine learning model while maintaining the privacy of the data provided by crowdsourcing participants. Client devices can be used to generate proposed labels for a unit of data to be used in a training dataset. One or more privacy mechanisms are used to protect user data when transmitting the data to a server. The server can aggregate the proposed labels and use the most frequently proposed label for an element as that element's label when generating training data for the machine learning model. The machine learning model is then trained using the crowdsourced labels to improve the accuracy of the model.
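One way to read this abstract: each client privatizes its proposed label locally, and the server takes the most frequent report per element. The sketch below uses randomized response, a classic local differential-privacy mechanism, as a stand-in for the unspecified privacy mechanisms; the label set, epsilon value, and function names are all invented for illustration.

```python
import math
import random
from collections import Counter

def randomize_label(label, labels, epsilon):
    """Report the true label with probability e^eps / (e^eps + k - 1),
    otherwise a uniformly random other label (randomized response)."""
    k = len(labels)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return label
    return random.choice([l for l in labels if l != label])

def aggregate(reports):
    """Server side: keep the most frequently proposed label."""
    return Counter(reports).most_common(1)[0][0]
```

With a sufficiently large pool of participants, the noise added by each client averages out and the majority label matches the true consensus with high probability.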
-
Publication No.: US20210397957A1
Publication Date: 2021-12-23
Application No.: US17349843
Filing Date: 2021-06-16
Applicant: Apple Inc.
Inventor: Umesh S. VAISHAMPAYAN , Kit-Man WAN , Aaftab A. MUNSHI , Cecile M. FORET , Yen-Fu LIU
Abstract: The subject technology provides a framework for multi-processor training of neural networks. Multi-processor training of neural networks can include performing a forward pass of a training iteration using a neural processor, and performing a backward pass of the training iteration using a CPU or a GPU. Additional operations for facilitating the multi-processor training are disclosed.
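A toy illustration of the forward/backward split (not Apple's framework): one executor stands in for the neural processor running the forward pass, another for the CPU/GPU running the backward pass, over a one-parameter linear model. The `Executor` class and the squared-error model are invented for illustration.

```python
class Executor:
    """Stand-in for a device-specific execution context."""
    def __init__(self, name):
        self.name = name

    def run(self, fn, *args):
        # A real system would dispatch to device-specific kernels here.
        return fn(*args)

def train_step(w, x, y, lr, fwd_exec, bwd_exec):
    """One training iteration for the model pred = w * x with squared loss,
    forward on fwd_exec (e.g. neural processor), backward on bwd_exec
    (e.g. CPU or GPU)."""
    # Forward pass: compute the prediction.
    pred = fwd_exec.run(lambda w, x: w * x, w, x)
    # Backward pass: gradient of (pred - y)^2 with respect to w.
    grad = bwd_exec.run(lambda p, x, y: 2 * (p - y) * x, pred, x, y)
    return w - lr * grad
```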
-
Publication No.: US20200382616A1
Publication Date: 2020-12-03
Application No.: US16554518
Filing Date: 2019-08-28
Applicant: Apple Inc.
Inventor: Umesh S. VAISHAMPAYAN , Gaurav KAPOOR , Kit-man WAN
IPC: H04L29/08 , G06N20/00 , H04W4/021 , G06F3/0488
Abstract: In an exemplary process for remote execution of machine-learned models, one or more signals from a second electronic device is detected by a first electronic device. The second electronic device includes a machine-learned model associated with an application implemented on the first electronic device. Based on the one or more signals, a communication connection is established with the second electronic device and a proxy to the machine-learned model is generated. Input data is obtained via a sensor of the first electronic device. A representation of the input data is sent to the second electronic device via the proxy and the established communication connection. The representation of the input data is processed through the machine-learned model to generate an output. A result derived from the output is received via the communication connection and a representation of the result is outputted.
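The proxy pattern in this abstract can be sketched as follows. The transport is reduced to a plain function standing in for the established device-to-device connection, and every name here is invented for illustration.

```python
class ModelProxy:
    """Local stand-in on the first device for a model hosted on a second
    device; forwards input representations and returns derived results."""
    def __init__(self, connection):
        self.connection = connection  # established after discovery signals

    def predict(self, input_data):
        request = {"input": input_data}      # representation of the input
        response = self.connection(request)  # remote model runs here
        return response["result"]            # result derived from the output

def make_fake_connection(model_fn):
    """Test double for the device-to-device link: runs the 'remote' model
    in-process instead of over a real connection."""
    def connection(request):
        return {"result": model_fn(request["input"])}
    return connection
```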
-
Publication No.: US20200082274A1
Publication Date: 2020-03-12
Application No.: US16262809
Filing Date: 2019-01-30
Applicant: Apple Inc.
Inventor: Francesco ROSSI , Cecile M. FORET , Gaurav KAPOOR , Kit-Man WAN , Umesh S. VAISHAMPAYAN , Etienne BELANGER , Albert ANTONY , Alexey MARINICHEV , Marco ZULIANI , Xiaojin SHI
Abstract: The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, some of which are executable on multiple processors of the target platform. The subject technology sorts the operations from the multiple layers into a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. Based at least in part on the cost of transferring operations between the multiple processors, the subject technology determines an assignment of one of the multiple processors for each sorted operation of each layer in a manner that minimizes the total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation indicating the processor assigned to each of the operations.
-
Publication No.: US20200082273A1
Publication Date: 2020-03-12
Application No.: US16262807
Filing Date: 2019-01-30
Applicant: Apple Inc.
Inventor: Francesco ROSSI , Cecile M. FORET , Gaurav KAPOOR , Kit-Man WAN , Umesh S. VAISHAMPAYAN , Etienne BELANGER
Abstract: The subject technology runs a compiled neural network (NN) model on a particular processor with multiple priority queues for executing different processes, the compiled NN model being assigned to a particular priority queue and including context switch instructions that were previously inserted into the NN model from which it was compiled. The subject technology determines that a particular context switch instruction has been executed by the particular processor, and that a different process is waiting to be executed, the different process being assigned to a different priority queue and having a higher priority than the running compiled NN model. In response to executing the particular context switch instruction, the subject technology performs a context switch to the waiting higher-priority process.
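The scheme reads like cooperative preemption via yield points compiled into the model: the runtime checks for higher-priority work only when it reaches an inserted context switch instruction. A minimal sketch, with the instruction format, op representation, and runtime invented for illustration:

```python
CTX_SWITCH = "CTX_SWITCH"  # marker inserted between layers at compile time

def run_compiled(program, higher_priority_waiting, execute):
    """Run the ops of a compiled NN program in order. At each inserted
    CTX_SWITCH instruction, suspend if a higher-priority process is
    waiting and return the remaining ops so the caller can resume later;
    return an empty list if the program ran to completion."""
    for i, op in enumerate(program):
        if op is CTX_SWITCH:
            if higher_priority_waiting():
                return program[i + 1:]  # context switch: remainder saved
        else:
            execute(op)
    return []
```

Because the switch can only happen at the inserted markers, the compiler controls the granularity of preemption, e.g. placing markers at layer boundaries where intermediate state is small.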
-