-
1.
Publication No.: US20250036920A1
Publication Date: 2025-01-30
Application No.: US18026140
Application Date: 2022-09-20
Inventor: Liang SHEN , Haifeng WANG , Huachao WU , Weibao GONG , Zhihua WU , Dianhai YU
IPC: G06N3/045 , G06N3/0495
Abstract: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to fields of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, tensor-parallelism segmentation being adopted for the sparse parameters of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. By using the solutions of the present disclosure, normal operation of model training can be guaranteed.
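A minimal sketch (not the patented implementation; the function names and the column-wise split are assumptions) of forming tensor-parallelism communication groups and sharding an expert's sparse parameters across the devices of one group:

```python
# Illustrative only: build tensor-parallelism groups and shard one expert's
# weight matrix across the devices of a single group.
import numpy as np

def build_tp_groups(world_size: int, tp_degree: int) -> list[list[int]]:
    """Partition ranks 0..world_size-1 into consecutive tensor-parallel groups."""
    assert world_size % tp_degree == 0
    return [list(range(start, start + tp_degree))
            for start in range(0, world_size, tp_degree)]

def shard_expert_weight(weight: np.ndarray, tp_degree: int) -> list[np.ndarray]:
    """Split an expert weight matrix along its output dimension, one shard per device."""
    return np.array_split(weight, tp_degree, axis=1)

if __name__ == "__main__":
    groups = build_tp_groups(world_size=8, tp_degree=2)   # e.g. [[0, 1], [2, 3], ...]
    expert_w = np.random.randn(1024, 4096)                 # one expert's weight matrix
    shards = shard_expert_weight(expert_w, tp_degree=2)    # each device holds one shard
    print(groups[0], [s.shape for s in shards])
```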
-
2.
Publication No.: US20220374713A1
Publication Date: 2022-11-24
Application No.: US17880070
Application Date: 2022-08-03
Inventor: Zhihua WU , Dianhai YU , Yulong AO , Weibao GONG
IPC: G06N3/08
Abstract: The present disclosure provides a method and apparatus for performing distributed training on a deep learning model. The method may include: generating a distributed computation view based on data information of a to-be-trained deep learning model; generating a cluster resource view based on property information of a cluster hardware resource corresponding to the to-be-trained deep learning model; determining a target segmentation strategy of a distributed training task based on the distributed computation view and the cluster resource view; and performing distributed training on the to-be-trained deep learning model based on the target segmentation strategy.
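An illustrative sketch of the flow the abstract outlines; the data classes, the candidate parallelism degrees, and the toy cost model are invented stand-ins for the two views and the strategy search:

```python
# Hypothetical flow: build a computation view and a cluster resource view,
# then score candidate segmentation strategies and keep the cheapest feasible one.
from dataclasses import dataclass
from itertools import product

@dataclass
class ComputationView:
    num_layers: int
    params_gb: float            # total parameter size of the model

@dataclass
class ResourceView:
    num_devices: int
    mem_per_device_gb: float

def pick_strategy(comp: ComputationView, res: ResourceView):
    best = None
    for dp, pp in product(range(1, res.num_devices + 1), repeat=2):
        if dp * pp != res.num_devices:
            continue
        per_device_gb = comp.params_gb / pp        # pipeline stages split the parameters
        if per_device_gb > res.mem_per_device_gb:
            continue                               # infeasible: exceeds device memory
        cost = per_device_gb + 0.1 * pp            # toy cost: memory pressure + pipeline bubbles
        if best is None or cost < best[0]:
            best = (cost, {"data_parallel": dp, "pipeline_parallel": pp})
    return best[1] if best else None

print(pick_strategy(ComputationView(48, 40.0), ResourceView(8, 16.0)))
```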
-
3.
Publication No.: US20230206080A1
Publication Date: 2023-06-29
Application No.: US18118339
Application Date: 2023-03-07
Inventor: Shuohuan WANG , Weibao GONG , Zhihua WU , Yu SUN , Siyu DING , Yaqian HAN , Yanbin ZHAO , Yuang LIU , Dianhai YU
Abstract: A model training system includes at least one first cluster and a second cluster communicating with the at least one first cluster. The at least one first cluster is configured to acquire a sample data set, generate training data according to the sample data set, and send the training data to the second cluster; and the second cluster is configured to train a pre-trained model according to the training data sent by the at least one first cluster.
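A minimal sketch of the cluster split described above; the queue, the preprocessing step, and the update rule are placeholders for the inter-cluster link and the actual pre-trained-model training:

```python
# Toy producer/consumer split: a "first cluster" turns samples into training
# data and ships it to a "second cluster" that updates a model.
import queue

def first_cluster(sample_data_set, link: queue.Queue):
    for sample in sample_data_set:
        features = [x * 0.5 for x in sample]      # stand-in preprocessing -> training data
        link.put(features)
    link.put(None)                                # end-of-stream marker

def second_cluster(link: queue.Queue, model: list[float]):
    while (batch := link.get()) is not None:
        for i, f in enumerate(batch):             # stand-in "training" update
            model[i % len(model)] += 0.01 * f
    return model

link = queue.Queue()
first_cluster([[1.0, 2.0], [3.0, 4.0]], link)
print(second_cluster(link, model=[0.0, 0.0]))
```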
-
4.
Publication No.: US20230206075A1
Publication Date: 2023-06-29
Application No.: US17991077
Application Date: 2022-11-21
Inventor: Ji LIU , Zhihua WU , Danlei FENG , Minxu ZHANG , Xinxuan WU , Xuefeng YAO , Beichen MA , Dejing DOU , Dianhai YU , Yanjun MA
Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to network layers in the to-be-processed neural network model and computing devices in the computing device set, the distribution schemes including corresponding relationships between the network layers and the computing devices; according to device types of the computing devices, combining the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result of each distribution scheme; obtaining an adaptive value of each distribution scheme according to the combination result of each distribution scheme; and determining a target distribution scheme from the distribution schemes according to the respective adaptive values, and taking the target distribution scheme as a distribution result of the network layers in the to-be-processed neural network model.
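An illustrative sketch of the scheme search; random scheme generation, merging of consecutive same-device-type layers into stages, and a balance-based adaptive value are simplified stand-ins for the patented procedure:

```python
# Toy search: generate layer-to-device assignments, merge runs of layers on the
# same device type into stages, and keep the scheme with the best adaptive value.
import random
from itertools import groupby

def random_schemes(num_layers, devices, target_number, seed=0):
    rng = random.Random(seed)
    return [[rng.choice(devices) for _ in range(num_layers)]
            for _ in range(target_number)]

def combine_into_stages(scheme):
    # consecutive layers assigned to the same device type form one stage
    return [(dev_type, len(list(grp)))
            for dev_type, grp in groupby(d["type"] for d in scheme)]

def adaptive_value(stages):
    sizes = [n for _, n in stages]
    return -(max(sizes) - min(sizes))      # more balanced stages -> higher value

devices = [{"type": "gpu"}, {"type": "cpu"}]
schemes = random_schemes(num_layers=6, devices=devices, target_number=10)
best = max(schemes, key=lambda s: adaptive_value(combine_into_stages(s)))
print(combine_into_stages(best))
```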
-
5.
Publication No.: US20240394190A1
Publication Date: 2024-11-28
Application No.: US18696757
Application Date: 2022-09-27
Inventor: Minxu ZHANG , Haifeng WANG , Fan ZHANG , Xinxuan WU , Xuefeng YAO , Danlei FENG , Zhihua WU , Zhipeng TAN , Jie DING , Dianhai YU
IPC: G06F12/0873 , G06F12/0815 , G06F15/80
Abstract: The present application provides a method of training a deep learning model. A specific implementation solution of the method of training the deep learning model includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory from among first network parameters required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
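A hypothetical sketch of the slot bookkeeping described in the abstract; the class, its fields, and the slot count are assumptions, and the actual parameter transfer to the target processor is elided:

```python
# Toy slot map: decide which embedding parameters for the current batch must
# enter the target processor's memory, check the free slots, write only if they fit.
class TargetMemory:
    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self.slot_of = {}                          # first mapping: parameter id -> slot

    def remaining_slots(self) -> int:
        return self.num_slots - len(self.slot_of)

    def write_if_fits(self, param_ids: list[int]) -> bool:
        missing = [p for p in param_ids if p not in self.slot_of]
        if len(missing) > self.remaining_slots():
            return False                           # storage requirement not met
        free = (s for s in range(self.num_slots) if s not in self.slot_of.values())
        for p in missing:
            self.slot_of[p] = next(free)           # "write" the target parameter
        return True

mem = TargetMemory(num_slots=4)
batch_param_ids = [10, 11, 12]                     # parameters needed by this batch's embedding
print(mem.write_if_fits(batch_param_ids), mem.slot_of)
```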
-
6.
Publication No.: US20220058222A1
Publication Date: 2022-02-24
Application No.: US17517703
Application Date: 2021-11-03
Inventor: Mo CHENG , Dianhai YU , Lin MA , Zhihua WU , Daxiang DONG , Wei TANG
IPC: G06F16/901 , G06F16/28 , G06N5/02
Abstract: The present disclosure provides a method of processing information, an apparatus of processing information, a method of recommending information, an electronic device, and a storage medium. The method includes: obtaining a tree structure parameter of a tree structure, wherein the tree structure is configured to index an object set used for recommendation; obtaining a classifier parameter of a classifier, wherein the classifier is configured to sequentially predict, from a top layer of the tree structure to a bottom layer of the tree structure, a preference node set whose probability of being preferred by a user is ranked higher in each layer, and the preference node set of each layer subsequent to the top layer of the tree structure is determined based on the preference node set of its previous layer; and constructing a recalling model based on the tree structure parameter and the classifier parameter.
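A toy stand-in for the layer-by-layer retrieval the recalling model performs; the scoring function replaces the trained classifier and an implicit complete binary tree replaces the learned tree index:

```python
# Toy layer-wise retrieval: from the root, each layer keeps the top-k children
# of the previous layer's preferred nodes, as ranked by a stand-in classifier.
import math

def score(user_vec, node_id):                      # stand-in classifier
    return math.sin(node_id * 0.7 + sum(user_vec))

def children(node_id, num_leaves):                 # implicit complete binary tree
    kids = [2 * node_id + 1, 2 * node_id + 2]
    return [k for k in kids if k < 2 * num_leaves - 1]

def retrieve(user_vec, num_leaves=16, top_k=2):
    frontier = [0]                                 # top layer: the root
    while True:
        candidates = [c for n in frontier for c in children(n, num_leaves)]
        if not candidates:
            return frontier                        # bottom layer: recalled leaf nodes
        frontier = sorted(candidates, key=lambda c: score(user_vec, c),
                          reverse=True)[:top_k]    # preference node set of this layer

print(retrieve(user_vec=[0.3, 0.9]))
```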
-
7.
Publication No.: US20240275848A1
Publication Date: 2024-08-15
Application No.: US18020618
Application Date: 2022-08-01
Inventor: Guoxia WANG , Long LI , Zhihua WU
IPC: H04L67/1097 , G06F7/58
CPC classification number: H04L67/1097 , G06F7/582 , G06F7/588
Abstract: The present disclosure provides a content initialization method and apparatus, an electronic device and a storage medium, which relates to the field of computer technology, in particular to the fields of deep learning and distributed computing. The content initialization method is applied to any one of a plurality of devices included in a distributed system. A specific implementation scheme of the content initialization method is: determining, according to size information of a resource space for the distributed system and identification information of the any one of the plurality of devices, space information of a first sub-space for the any one of the plurality of devices in the resource space, wherein the space information includes position information of the first sub-space within the resource space; and determining an initialization content for the first sub-space according to a random seed and the position information.
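A minimal sketch, under assumed names, of deriving a device's sub-space position from the global size and its identification, and initializing that sub-space from the random seed combined with the position:

```python
# Toy seed-plus-position initialization: each device computes the offset of its
# own sub-space and generates only that slice, deterministically from the seed
# and the offset, so every device's result is reproducible.
import numpy as np

def init_sub_space(total_size: int, num_devices: int, device_id: int, seed: int):
    per_device = total_size // num_devices
    offset = device_id * per_device                # position of this device's first sub-space
    rng = np.random.default_rng([seed, offset])    # random seed combined with position info
    return rng.standard_normal(per_device)         # initialization content of the sub-space

# device 1 of 4 initializes its quarter of a 16-element resource space
print(init_sub_space(total_size=16, num_devices=4, device_id=1, seed=42))
```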
-
8.
Publication No.: US20220374704A1
Publication Date: 2022-11-24
Application No.: US17558355
Application Date: 2021-12-21
Inventor: Danlei FENG , Long LIAN , Dianhai YU , Xuefeng YAO , Xinxuan WU , Zhihua WU , Yanjun MA
Abstract: The disclosure provides a neural network training method and apparatus, an electronic device, a medium and a program product, and relates to the field of artificial intelligence, in particular to the fields of deep learning and distributed learning. The method includes: acquiring a neural network for deep learning; constructing a deep reinforcement learning model for the neural network; and determining, through the deep reinforcement learning model, a processing unit selection for a plurality of network layers of the neural network based on a duration for training each of the network layers by each of a plurality of types of processing units, and a cost of each type of processing unit, wherein the processing unit selection comprises the type of processing unit to be used for each of the plurality of network layers, and the processing unit selection is used for making a total cost of the processing units used by the neural network fall below a cost threshold, in response to a duration for pipelining parallel computing for training the neural network being shorter than a preset duration.
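A brute-force stand-in for the selection that the abstract solves with a deep reinforcement learning model; all per-layer durations, unit costs, and thresholds below are invented for illustration:

```python
# Toy per-layer unit selection: pick one processing unit type per layer so that
# the training duration stays under a preset limit while the total unit cost is
# minimized and kept below the cost threshold.
from itertools import product

# per-layer training duration (seconds) and cost for each unit type (made up)
duration = {"gpu": [1.0, 2.0, 1.5], "cpu": [4.0, 6.0, 5.0]}
cost = {"gpu": 10, "cpu": 2}

def select_units(preset_duration=8.0, cost_threshold=25):
    num_layers = len(duration["gpu"])
    best = None
    for choice in product(duration.keys(), repeat=num_layers):
        total_time = sum(duration[u][i] for i, u in enumerate(choice))
        total_cost = sum(cost[u] for u in choice)
        if total_time <= preset_duration and total_cost <= cost_threshold:
            if best is None or total_cost < best[0]:
                best = (total_cost, choice)
    return best

print(select_units())   # (total cost, per-layer unit types), or None if infeasible
```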
-