Neural Network Training Method and Apparatus, Electronic Device, Medium and Program Product

    Publication No.: US20220374704A1

    Publication Date: 2022-11-24

    Application No.: US17558355

    Filing Date: 2021-12-21

    Abstract: The disclosure provides a neural network training method and apparatus, an electronic device, a medium and a program product, and relates to the field of artificial intelligence, in particular to the fields of deep learning and distributed learning. The method includes: acquiring a neural network for deep learning; constructing a deep reinforcement learning model for the neural network; and determining, through the deep reinforcement learning model, a processing unit selection for a plurality of network layers of the neural network, based on the duration for training each network layer on each of a plurality of types of processing units and on the cost of each type of processing unit. The processing unit selection comprises the type of processing unit to be used for each of the network layers, and is used to keep the total cost of the processing units used by the neural network below a cost threshold, provided that the duration of pipeline-parallel computing for training the neural network is shorter than a preset duration.
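
The selection problem in this abstract can be illustrated with a small sketch. It is not the patented method: the deep reinforcement learning model is replaced here by an exhaustive search, and the way durations and costs are aggregated is an assumption (adjacent layers on the same unit type form one pipeline stage, the slowest stage bounds the pipeline duration, and each stage is charged one unit of its type). All names (`select_units`, `max_duration`, `cost_threshold`, the sample numbers) are hypothetical.

```python
from itertools import groupby, product


def stages(selection):
    """Merge adjacent layers that use the same unit type into (type, layers) stages."""
    result, start = [], 0
    for unit, run in groupby(selection):
        size = len(list(run))
        result.append((unit, list(range(start, start + size))))
        start += size
    return result


def pipeline_duration(selection, duration):
    """Assumption: in steady state the slowest stage bounds the pipeline."""
    return max(
        sum(duration[layer][unit] for layer in layers)
        for unit, layers in stages(selection)
    )


def total_cost(selection, unit_cost):
    """Assumption: each stage occupies one processing unit of its type."""
    return sum(unit_cost[unit] for unit, _ in stages(selection))


def select_units(duration, unit_cost, max_duration, cost_threshold):
    """Exhaustive stand-in for the learned policy: return the cheapest
    (cost, duration, selection) that satisfies both limits, or None."""
    best = None
    for selection in product(sorted(unit_cost), repeat=len(duration)):
        d = pipeline_duration(selection, duration)
        c = total_cost(selection, unit_cost)
        if d < max_duration and c < cost_threshold:
            if best is None or (c, d) < best[:2]:
                best = (c, d, selection)
    return best


# Hypothetical per-layer training durations and per-type costs.
duration = [
    {"cpu": 4.0, "gpu": 1.0},
    {"cpu": 2.0, "gpu": 1.5},
    {"cpu": 1.0, "gpu": 1.0},
]
unit_cost = {"cpu": 1.0, "gpu": 5.0}
print(select_units(duration, unit_cost, max_duration=3.5, cost_threshold=10.0))
```

Exhaustive search grows exponentially with the number of layers, which is presumably one reason the disclosure turns to a learned policy; the sketch is only meant to make the two constraints concrete.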

    METHOD AND APPARATUS FOR DISTRIBUTING NETWORK LAYERS IN NEURAL NETWORK MODEL

    Publication No.: US20230206075A1

    Publication Date: 2023-06-29

    Application No.: US17991077

    Filing Date: 2022-11-21

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to network layers in the to-be-processed neural network model and computing devices in the computing device set, the distribution schemes including corresponding relationships between the network layers and the computing devices; according to device types of the computing devices, combining the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result of each distribution scheme; obtaining an adaptive value of each distribution scheme according to the combination result of each distribution scheme; and determining a target distribution scheme from the distribution schemes according to their respective adaptive values, and taking the target distribution scheme as a distribution result of the network layers in the to-be-processed neural network model.
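
The steps listed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: schemes are generated at random, and the adaptive value is assumed to be the reciprocal of the slowest stage's total layer time, since the abstract does not say how it is computed. The names `generate_schemes`, `combine_into_stages`, `adaptive_value`, `choose_scheme` and the `layer_time` table are hypothetical.

```python
import random
from collections import defaultdict


def generate_schemes(num_layers, devices, target_number, rng):
    """Generate `target_number` random layer -> device assignments."""
    return [
        [rng.choice(devices) for _ in range(num_layers)]
        for _ in range(target_number)
    ]


def combine_into_stages(scheme, device_type):
    """Combine the layers assigned to devices of the same type into one stage."""
    stage_layers = defaultdict(list)
    for layer, device in enumerate(scheme):
        stage_layers[device_type[device]].append(layer)
    return dict(stage_layers)


def adaptive_value(stage_layers, layer_time):
    """Assumed fitness: reciprocal of the slowest stage's total time."""
    slowest = max(
        sum(layer_time[layer][dtype] for layer in layers)
        for dtype, layers in stage_layers.items()
    )
    return 1.0 / slowest


def choose_scheme(num_layers, devices, device_type, layer_time,
                  target_number, seed=0):
    """Score every generated scheme and return (adaptive value, scheme) of the best."""
    rng = random.Random(seed)
    schemes = generate_schemes(num_layers, devices, target_number, rng)
    scored = [
        (adaptive_value(combine_into_stages(s, device_type), layer_time), s)
        for s in schemes
    ]
    return max(scored, key=lambda pair: pair[0])


# Hypothetical device set and per-layer times for each device type.
device_type = {"gpu0": "gpu", "gpu1": "gpu", "cpu0": "cpu"}
layer_time = [
    {"cpu": 4.0, "gpu": 1.0},
    {"cpu": 2.0, "gpu": 1.5},
    {"cpu": 1.0, "gpu": 1.0},
]
value, scheme = choose_scheme(3, list(device_type), device_type,
                              layer_time, target_number=8)
print(value, scheme)
```

Selecting by an adaptive value suggests an evolutionary search, in which case the single selection step shown here would be repeated with crossover and mutation; the abstract does not state this, so the sketch stops after one round.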

    METHOD AND SYSTEM OF TRAINING DEEP LEARNING MODEL, DEVICE, AND MEDIUM

    Publication No.: US20240394190A1

    Publication Date: 2024-11-28

    Application No.: US18696757

    Filing Date: 2022-09-27

    Abstract: The present application provides a method of training a deep learning model. A specific implementation solution of the method of training the deep learning model includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory in a first network parameter required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
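
The slot bookkeeping described in this abstract can be pictured with a small sketch. It assumes that the mapping relationship is a plain dictionary from slot index to parameter id and that parameters already resident in the target memory need not be written again; the class `TargetMemory` and its methods are hypothetical names. The abstract does not say what happens when too few slots remain, so the sketch simply reports failure and leaves the memory unchanged.

```python
class TargetMemory:
    """Toy model of a processor memory divided into fixed storage slots."""

    def __init__(self, num_slots):
        # Mapping relationship: slot index -> parameter id (None means free).
        self.slot_to_param = {slot: None for slot in range(num_slots)}

    def remaining_slots(self):
        """Return the slots that currently hold no network parameter."""
        return [s for s, p in self.slot_to_param.items() if p is None]

    def target_params(self, batch_param_ids):
        """Parameters needed by this round's embedding lookup that are
        not yet resident in the target memory (duplicates removed)."""
        resident = {p for p in self.slot_to_param.values() if p is not None}
        targets = []
        for param in batch_param_ids:
            if param not in resident and param not in targets:
                targets.append(param)
        return targets

    def try_write(self, batch_param_ids):
        """Write the target parameters only if the free slots suffice."""
        targets = self.target_params(batch_param_ids)
        free = self.remaining_slots()
        if len(free) < len(targets):
            return False  # Handling of this case is not covered by the abstract.
        for slot, param in zip(free, targets):
            self.slot_to_param[slot] = param
        return True


memory = TargetMemory(num_slots=4)
print(memory.try_write([7, 3, 7]))   # first training round
print(memory.try_write([3, 9, 11]))  # second round reuses parameter 3
print(memory.slot_to_param)
```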
