METHOD AND APPARATUS FOR DISTRIBUTING NETWORK LAYERS IN NEURAL NETWORK MODEL

    Publication (Announcement) Number: US20230206075A1

    Publication (Announcement) Date: 2023-06-29

    Application Number: US17991077

    Application Date: 2022-11-21

    CPC classification number: G06N3/082 G06N3/04

    Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to the network layers in the to-be-processed neural network model and the computing devices in the computing device set, the distribution schemes including corresponding relationships between the network layers and the computing devices; according to the device types of the computing devices, combining the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result of each distribution scheme; obtaining an adaptive value of each distribution scheme according to the combination result of that distribution scheme; and determining a target distribution scheme from the distribution schemes according to the respective adaptive values, and taking the target distribution scheme as the distribution result of the network layers in the to-be-processed neural network model.
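The steps in this abstract can be sketched roughly as follows. All function names are hypothetical, and two details the abstract leaves open are assumed here: schemes are generated by random assignment, layers of the same device type are merged only when consecutive (as in pipeline partitioning), and the adaptive value is taken as the negative cost of the slowest stage; the patent does not specify any of these choices.

```python
import random

def generate_schemes(num_layers, devices, target_number):
    # Each distribution scheme maps every network layer to one computing
    # device (assumption: schemes are sampled randomly).
    return [[random.choice(devices) for _ in range(num_layers)]
            for _ in range(target_number)]

def combine_into_stages(scheme, device_type):
    # Combine consecutive layers whose devices share a device type into
    # one stage; returns the combination result as (type, layers) pairs.
    stages = []
    for layer, dev in enumerate(scheme):
        dtype = device_type[dev]
        if stages and stages[-1][0] == dtype:
            stages[-1][1].append(layer)
        else:
            stages.append((dtype, [layer]))
    return stages

def adaptive_value(stages, layer_cost):
    # Hypothetical fitness: negative cost of the slowest stage, so more
    # balanced combinations receive higher adaptive values.
    return -max(sum(layer_cost[l] for l in layers) for _, layers in stages)

def select_scheme(schemes, device_type, layer_cost):
    # Determine the target distribution scheme: the one whose combination
    # result yields the highest adaptive value.
    scored = [(adaptive_value(combine_into_stages(s, device_type), layer_cost), s)
              for s in schemes]
    return max(scored)[1]
```

For example, a scheme assigning three layers to GPU devices and one to a CPU device combines into two stages, and its adaptive value is determined by the costlier of the two.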

    METHOD AND SYSTEM OF TRAINING DEEP LEARNING MODEL, DEVICE, AND MEDIUM

    Publication (Announcement) Number: US20240394190A1

    Publication (Announcement) Date: 2024-11-28

    Application Number: US18696757

    Application Date: 2022-09-27

    Abstract: The present application provides a method of training a deep learning model. A specific implementation solution of the method of training the deep learning model includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory in a first network parameter required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
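The write path described in this abstract can be sketched as below. The function and variable names are assumptions, as is the eviction behavior when slots run out (the abstract only covers the case where the remaining slots meet the storage requirement); the slot map stands in for the first mapping relationship between storage slots of the target memory and network parameters.

```python
def write_first_target_params(batch_ids, slot_map, capacity):
    """Write uncached embedding parameters for the current training round
    into the target memory.

    slot_map: slot index -> parameter id currently held in the target
    memory (the first mapping relationship). Returns True if the remaining
    storage slots met the storage requirement and the write happened.
    """
    cached = set(slot_map.values())
    # First target parameters: parameters required by the embedding of the
    # first training data that are not yet in the target memory.
    to_write = sorted(set(batch_ids) - cached)
    remaining = capacity - len(slot_map)   # remaining storage slots
    if len(to_write) > remaining:
        return False   # storage requirement not met; eviction not shown
    free = (i for i in range(capacity) if i not in slot_map)
    for pid in to_write:
        slot_map[next(free)] = pid
    return True
```

After a successful write, the computing core contained in the target processor can adjust these parameters in place according to the training data, without a round trip to host memory.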
