Highly performant pipeline parallel deep neural network training

    Publication number: US12056604B2

    Publication date: 2024-08-06

    Application number: US16024369

    Filing date: 2018-06-29

    IPC classification: G06N3/08 G06N3/04

    CPC classification: G06N3/08 G06N3/04

    Abstract: Layers of a deep neural network (DNN) are partitioned into stages using a profile of the DNN. Each of the stages includes one or more of the layers of the DNN. The partitioning of the layers of the DNN into stages can be optimized in various ways, including to minimize training time, to minimize data communication between the worker computing devices used to train the DNN, or to ensure that the worker computing devices each perform an approximately equal amount of the processing for training the DNN. The stages are assigned to the worker computing devices. The worker computing devices process batches of DNN training data using a scheduling policy that causes the workers to alternate between forward processing and backward processing of the batches. The stages can be configured for model parallel processing or data parallel processing.
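
The two ideas in the abstract, profile-driven partitioning and an alternating forward/backward schedule, can be illustrated with a small sketch. This is not the patented method: it assumes the profile is just a list of per-layer compute times, that stages hold contiguous layers, and that "approximately equal processing" is approximated by minimizing the slowest stage. The function names `partition_layers` and `one_forward_one_backward`, and the warm-up rule (one forward pass per downstream stage before alternation begins), are hypothetical choices made for illustration only.

```python
from itertools import accumulate


def partition_layers(layer_times, num_stages):
    """Split layers into contiguous stages so the slowest stage is as fast as possible.

    layer_times: assumed profile, one compute time per layer (hypothetical input).
    Returns (stages, bottleneck), where stages is a list of lists of layer indices.
    """
    n = len(layer_times)
    if not 1 <= num_stages <= n:
        raise ValueError("num_stages must be between 1 and the number of layers")

    prefix = [0] + list(accumulate(layer_times))  # prefix[i] = time of layers 0..i-1
    inf = float("inf")
    # best[k][i]: smallest achievable bottleneck when the first i layers form k stages
    best = [[inf] * (n + 1) for _ in range(num_stages + 1)]
    cut = [[0] * (n + 1) for _ in range(num_stages + 1)]
    best[0][0] = 0

    for k in range(1, num_stages + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):  # last stage holds layers j..i-1
                cost = max(best[k - 1][j], prefix[i] - prefix[j])
                if cost < best[k][i]:
                    best[k][i] = cost
                    cut[k][i] = j

    # Walk the recorded cut points backwards to recover the stage boundaries.
    stages = []
    i = n
    for k in range(num_stages, 0, -1):
        j = cut[k][i]
        stages.append(list(range(j, i)))
        i = j
    stages.reverse()
    return stages, best[num_stages][n]


def one_forward_one_backward(stage, num_stages, num_batches):
    """List the operations one worker runs, alternating forward and backward passes.

    Assumption (not stated in the abstract): a worker first runs one forward pass
    per stage from itself to the end of the pipeline, then alternates.
    """
    warmup = min(num_stages - stage, num_batches)
    ops = [("F", b) for b in range(warmup)]
    next_forward = warmup
    for b in range(num_batches):
        ops.append(("B", b))  # backward pass for the oldest in-flight batch
        if next_forward < num_batches:
            ops.append(("F", next_forward))  # then admit the next batch
            next_forward += 1
    return ops


if __name__ == "__main__":
    stages, bottleneck = partition_layers([1, 2, 3, 4, 5, 6], num_stages=3)
    print(stages, bottleneck)
    for s in range(3):
        print(s, one_forward_one_backward(s, num_stages=3, num_batches=4))
```

In this sketch the last stage alternates strictly from its first batch, while earlier stages run a few forward passes first so that downstream workers have work available. A fuller treatment would also weigh communication volume between stages and allow a stage to be replicated for data parallel processing, both of which the abstract mentions but this example leaves out.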