-
Publication No.: US20230206075A1
Publication Date: 2023-06-29
Application No.: US17991077
Application Date: 2022-11-21
Inventor: Ji LIU, Zhihua WU, Danlei FENG, Minxu ZHANG, Xinxuan WU, Xuefeng YAO, Beichen MA, Dejing DOU, Dianhai YU, Yanjun MA
Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to the network layers in the to-be-processed neural network model and the computing devices in the computing device set, each distribution scheme including correspondences between the network layers and the computing devices; combining, according to the device types of the computing devices, the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result for each distribution scheme; obtaining an adaptive value for each distribution scheme according to its combination result; and determining a target distribution scheme from the distribution schemes according to their respective adaptive values, and taking the target distribution scheme as the distribution result of the network layers in the to-be-processed neural network model.
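The steps named in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how schemes are generated, which layers count as belonging to one stage, or how the adaptive value is computed, so everything below is an assumption made for illustration: schemes are sampled at random, consecutive layers assigned to the same device type are merged into one stage, and the adaptive value is a hypothetical cost (per-layer compute cost plus a fixed penalty per stage boundary, lower is better). The names `Device`, `generate_schemes`, `combine_into_stages`, `adaptive_value`, and `select_target_scheme`, and all cost numbers, are invented here and do not come from the patent.

```python
import random
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Device:
    name: str
    device_type: str  # e.g. "cpu" or "gpu" (hypothetical types)


def generate_schemes(num_layers, devices, target_number, seed=0):
    """Randomly generate `target_number` schemes (assumed strategy).

    Each scheme is a list whose i-th entry is the device assigned to layer i.
    """
    rng = random.Random(seed)
    return [[rng.choice(devices) for _ in range(num_layers)]
            for _ in range(target_number)]


def combine_into_stages(scheme):
    """Merge consecutive layers with the same device type into one stage.

    Returns a list of (device_type, [layer indices]) tuples.
    """
    stages = []
    indexed = enumerate(scheme)
    for device_type, group in groupby(indexed, key=lambda p: p[1].device_type):
        stages.append((device_type, [i for i, _ in group]))
    return stages


def adaptive_value(stages, layer_cost, transfer_penalty=1.0):
    """Hypothetical adaptive value: compute cost plus a penalty per stage boundary."""
    compute = sum(layer_cost[dt][i] for dt, layers in stages for i in layers)
    return compute + transfer_penalty * (len(stages) - 1)


def select_target_scheme(schemes, layer_cost):
    """Pick the scheme whose combination result has the lowest adaptive value."""
    return min(schemes,
               key=lambda s: adaptive_value(combine_into_stages(s), layer_cost))


if __name__ == "__main__":
    devices = [Device("cpu0", "cpu"), Device("gpu0", "gpu"), Device("gpu1", "gpu")]
    # layer_cost[device_type][layer index]: made-up per-layer costs
    layer_cost = {"cpu": [1.0, 4.0, 4.0, 1.0], "gpu": [2.0, 1.0, 1.0, 2.0]}
    schemes = generate_schemes(4, devices, target_number=20)
    best = select_target_scheme(schemes, layer_cost)
    print(combine_into_stages(best))
```

Random sampling with a minimum over the candidates is only one way to read "generating a target number of distribution schemes" and "determining a target distribution scheme"; an iterative search that refines schemes by their adaptive values would fit the same wording.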
-
Publication No.: US20240394190A1
Publication Date: 2024-11-28
Application No.: US18696757
Application Date: 2022-09-27
Inventor: Minxu ZHANG, Haifeng WANG, Fan ZHANG, Xinxuan WU, Xuefeng YAO, Danlei FENG, Zhihua WU, Zhipeng TAN, Jie DING, Dianhai YU
IPC: G06F12/0873, G06F12/0815, G06F15/80
Abstract: The present application provides a method of training a deep learning model. A specific implementation of the method includes: determining, according to first training data for a current training round, a first target parameter that needs to be written into a target memory, from among first network parameters required for embedding the first training data, wherein the target memory is a memory contained in a target processor; determining the remaining storage slots in the target memory according to a first mapping relationship between the storage slots of the target memory and network parameters; and, in response to the remaining storage slots meeting the storage requirement of the first target parameter, writing the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameters according to the first training data.
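The slot bookkeeping described in this abstract can be sketched as a small Python model. This is an illustration under stated assumptions, not the patented implementation: the target memory is modelled as a fixed-size list, the "first mapping relationship" as a dictionary from parameter id to slot index, and the "first target parameter" as the set of required parameters not yet present in that memory. The class `SlotCache`, its methods, and the `host_table` argument are hypothetical names introduced here. The abstract does not say what happens when the remaining slots are insufficient, so the sketch simply reports failure.

```python
class SlotCache:
    """Toy model of a processor-local memory with a fixed number of storage slots."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slot_of = {}                 # mapping: parameter id -> slot index
        self.values = [None] * num_slots  # slot contents

    def missing(self, required_ids):
        """Required parameters not yet in the target memory, in order, deduplicated."""
        seen = set()
        targets = []
        for pid in required_ids:
            if pid not in self.slot_of and pid not in seen:
                seen.add(pid)
                targets.append(pid)
        return targets

    def remaining_slots(self):
        """Slots that the mapping does not currently assign to any parameter."""
        used = set(self.slot_of.values())
        return [s for s in range(self.num_slots) if s not in used]

    def try_write(self, required_ids, host_table):
        """Write the missing parameters if enough slots remain.

        Returns True on success and False when the storage requirement is not met
        (handling of that case is outside the abstract and left out here).
        """
        targets = self.missing(required_ids)
        free = self.remaining_slots()
        if len(free) < len(targets):
            return False
        for pid, slot in zip(targets, free):
            self.slot_of[pid] = slot
            self.values[slot] = host_table[pid]
        return True


if __name__ == "__main__":
    # Hypothetical host-side embedding table: parameter id -> embedding vector
    host_table = {pid: [0.0, 0.0] for pid in range(10)}
    cache = SlotCache(num_slots=4)
    print(cache.try_write([3, 7, 3], host_table))  # ids needed by round 1
    print(cache.try_write([7, 1, 2], host_table))  # id 7 is already cached
    print(cache.try_write([5], host_table))        # no free slot is left
    print(cache.slot_of)
```

Keeping the mapping separate from the slot contents lets the check for remaining slots run without touching the stored parameters, which matches the abstract's ordering: decide what to write, check the slots, then write.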
-