COMPILING MODELS FOR DEDICATED HARDWARE

    Publication No.: US20250131286A1

    Publication Date: 2025-04-24

    Application No.: US19000562

    Filing Date: 2024-12-23

    Applicant: Apple Inc.

    Abstract: The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers of operations, some of which are executable on multiple processors of the target platform. The subject technology further sorts the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. The subject technology determines, based at least in part on a cost of transferring the operations between the multiple processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation to indicate the processor assigned for each of the operations.
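The cost-minimizing assignment described in the abstract can be illustrated with a small dynamic-programming sketch over a linear chain of operations. All names and costs below are hypothetical, not taken from the patent; the patent's actual method may differ, but the idea of balancing per-processor execution cost against inter-processor transfer cost is the same.

```python
# Hypothetical sketch: assign each op in a linear chain to a processor,
# minimizing execution cost plus the cost of transferring intermediate
# results between processors. (Illustrative only; not the patented method.)

def assign_processors(exec_cost, transfer_cost):
    """exec_cost: list of dicts {processor: cost}, one per operation.
    transfer_cost: dict {(from_proc, to_proc): cost}; missing keys mean 0.
    Returns (minimal total cost, per-op processor assignment)."""
    # dp maps processor -> (best total cost so far, assignment path)
    dp = {p: (c, [p]) for p, c in exec_cost[0].items()}
    for op in exec_cost[1:]:
        nxt = {}
        for p, c in op.items():
            # Best way to arrive at processor p for this op, considering
            # every processor q the previous op could have run on.
            nxt[p] = min(
                (prev_cost + transfer_cost.get((q, p), 0) + c, path + [p])
                for q, (prev_cost, path) in dp.items()
            )
        dp = nxt
    return min(dp.values())
```

For example, with one op that is cheap on a CPU and a second that is cheap on an NPU, the sketch pays the transfer cost when that beats running both ops on one processor.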

    APPLICATION DEPLOYMENT
    3.
    Invention Publication

    Publication No.: US20230376293A1

    Publication Date: 2023-11-23

    Application No.: US18199344

    Filing Date: 2023-05-18

    Applicant: Apple Inc.

    CPC classification number: G06F8/443 G06F8/30 G06F21/52

    Abstract: The present disclosure generally relates to deploying an application. Some techniques described herein occur during compile time, while executable code is being generated from source code. In one example, the executable code causes different operations in an application to be assigned to different compute systems such that particular operations are required to be executed on particular compute systems. The executable code may further include bridges, generated during compile time, that assist in transmitting data between different compute systems. In another example, the executable code causes data to be sent to a recording service during execution of an application. The recording service, though not included in the source code before compile time, is configured to receive copies of data transmitted on a compute system including the recording service. The recording service may also be configured to receive metadata corresponding to operations executed on the compute system.
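A minimal sketch of the two ideas in this abstract, a bridge that carries data between compute systems and a recording service that receives copies of each transmission with metadata. All class and function names here are hypothetical illustrations, not APIs from the patent.

```python
import json

class RecordingService:
    """Hypothetical recording service: receives copies of transmitted
    data along with metadata about the transmission."""
    def __init__(self):
        self.records = []

    def record(self, payload, metadata):
        self.records.append((payload, metadata))

def make_bridge(source, target, recorder=None):
    """Build a bridge function shipping a value from `source` to
    `target`, optionally copying it (plus metadata) to a recorder.
    In the abstract's scheme such bridges are generated at compile time;
    here we just construct one at runtime for illustration."""
    def bridge(value):
        payload = json.dumps(value)        # serialize for transport
        if recorder is not None:
            recorder.record(payload, {"from": source, "to": target})
        return json.loads(payload)         # deserialize on the target side
    return bridge
```

Usage: a bridge built with a recorder returns the value unchanged on the far side while the recorder accumulates a copy and the source/target metadata.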

    COMPILING MODELS FOR DEDICATED HARDWARE
    4.
    Invention Application

    Publication No.: US20200082274A1

    Publication Date: 2020-03-12

    Application No.: US16262809

    Filing Date: 2019-01-30

    Applicant: Apple Inc.

    Abstract: The subject technology provides for receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers of operations, some of which are executable on multiple processors of the target platform. The subject technology further sorts the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. The subject technology determines, based at least in part on a cost of transferring the operations between the multiple processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation to indicate the processor assigned for each of the operations.

    COMPILING MODELS FOR DEDICATED HARDWARE
    5.
    Invention Application

    Publication No.: US20200082273A1

    Publication Date: 2020-03-12

    Application No.: US16262807

    Filing Date: 2019-01-30

    Applicant: Apple Inc.

    Abstract: The subject technology runs a compiled neural network (NN) model on a particular processor with multiple priority queues for executing different processes, the compiled NN model being assigned to a particular priority queue and including context switch instructions that were previously inserted into the NN model from which it was compiled. The subject technology determines that a particular context switch instruction has been executed by the particular processor. The subject technology determines that a different process is waiting to be executed, the different process being assigned to a different priority queue and being a higher priority process than the running compiled NN model. In response to executing the particular context switch instruction, the subject technology performs a context switch to the different process assigned to the different priority queue when the different process is waiting to be executed.
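The cooperative scheme in this abstract, where the compiled model only yields at explicit context-switch points, can be sketched as follows. The instruction names, priority convention (lower number = higher priority), and heap-based waiting queue are all illustrative assumptions, not details from the patent.

```python
import heapq

# Hypothetical sketch: a compiled model's instruction stream contains
# explicit CONTEXT_SWITCH markers; at each one the runtime checks the
# waiting queue and yields only to a strictly higher-priority process.

CONTEXT_SWITCH = "CONTEXT_SWITCH"

def run(instructions, priority, waiting, log):
    """Execute `instructions` at the given `priority` (lower number =
    higher priority). `waiting` is a heap of (priority, name) tuples
    for processes waiting to be executed; `log` records what ran."""
    for ins in instructions:
        if ins == CONTEXT_SWITCH:
            # Switch only if a higher-priority process is waiting.
            if waiting and waiting[0][0] < priority:
                _, name = heapq.heappop(waiting)
                log.append(f"switch to {name}")
        else:
            log.append(ins)
```

Between context-switch points the model runs uninterrupted, which is the point of inserting the instructions at compile time: preemption can only happen where the model's state is safe to suspend.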
