Patent search: ap:("Beijing Baidu Netcom Science Technology Co., Ltd.") AND inv:"Dianhai YU"

1.

Invention application
MIXTURE-OF-EXPERTS MODEL IMPLEMENTATION METHOD AND SYSTEM, ELECTRONIC DEVICE, AND STORAGE MEDIUM [In force]

Publication number: US20250036920A1

Publication date: 2025-01-30

Application number: US18026140

Application date: 2022-09-20

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Liang SHEN, Haifeng WANG, Huachao WU, Weibao GONG, Zhihua WU, Dianhai YU

IPC: G06N3/045, G06N3/0495

Abstract: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to the field of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, tensor-parallelism segmentation being adopted for sparse parameters of each of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. By use of the solutions of the present disclosure, normal operation of model training can be guaranteed.
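
The grouping step described in this abstract can be pictured with a minimal sketch. It assumes devices are identified by integer ranks, that consecutive ranks share one tensor-parallelism communication group, and that a sparse (expert) weight is split evenly by column within a group. The helper names `build_tp_groups` and `shard_columns` are hypothetical and are not taken from the publication.

```python
from typing import List


def build_tp_groups(world_size: int, tp_size: int) -> List[List[int]]:
    """Partition ranks 0..world_size-1 into tensor-parallelism groups.

    Assumption: consecutive ranks form one group. The abstract only
    requires at least two computing devices per group.
    """
    if tp_size < 2:
        raise ValueError("a tensor-parallelism group needs at least two devices")
    if world_size % tp_size != 0:
        raise ValueError("world_size must be divisible by tp_size")
    return [list(range(start, start + tp_size))
            for start in range(0, world_size, tp_size)]


def shard_columns(num_columns: int, group: List[int], rank: int) -> range:
    """Return the column indices of a sparse weight held by `rank`.

    Assumption: an even column-wise split across the group members.
    """
    if num_columns % len(group) != 0:
        raise ValueError("num_columns must be divisible by the group size")
    per_rank = num_columns // len(group)
    local_index = group.index(rank)  # position of this rank inside its group
    return range(local_index * per_rank, (local_index + 1) * per_rank)


groups = build_tp_groups(world_size=8, tp_size=2)
```

With eight devices and a group size of two, each group holds two ranks, and each rank keeps half of the columns of every sparse weight assigned to its group.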

2.

Invention application
METHOD AND APPARATUS FOR PERFORMING DISTRIBUTED TRAINING ON DEEP LEARNING MODEL, DEVICE AND STORAGE MEDIUM [In force]

Publication number: US20220374713A1

Publication date: 2022-11-24

Application number: US17880070

Application date: 2022-08-03

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Zhihua WU, Dianhai YU, Yulong AO, Weibao GONG

IPC: G06N3/08

Abstract: The present disclosure provides a method and apparatus for performing distributed training on a deep learning model. The method may include: generating a distributed computation view based on data information of a to-be-trained deep learning model; generating a cluster resource view based on property information of a cluster hardware resource corresponding to the to-be-trained deep learning model; determining a target segmentation strategy of a distributed training task based on the distributed computation view and the cluster resource view; and performing distributed training on the to-be-trained deep learning model based on the target segmentation strategy.

3.

Invention application
DEEP LEARNING FRAMEWORK SCHEDULING [In force]

Publication number: US20220222111A1

Publication date: 2022-07-14

Application number: US17707895

Application date: 2022-03-29

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Haifeng WANG, Xiaoguang HU, Dianhai YU, Yanjun MA, Tian WU

IPC: G06F9/48, G06F9/50

Abstract: A scheduling method for a deep learning framework, a scheduling apparatus, an electronic device, a storage medium, and a program product are provided, which can be used in the field of artificial intelligence, especially in the fields of machine learning, deep learning, etc. The method includes: receiving a processing request for processing a plurality of tasks by using a dedicated processing unit, the processing request including scheduling requirements for the plurality of tasks, and each of the plurality of tasks being associated with execution of multi-batch data processing; and scheduling, based on the scheduling requirements for the plurality of tasks in batches of data, the dedicated processing unit to process the plurality of tasks.

4.

Invention application
METHOD AND APPARATUS OF TRAINING MODEL, DEVICE, MEDIUM, AND PROGRAM PRODUCT [In force]

Publication number: US20220004811A1

Publication date: 2022-01-06

Application number: US17479061

Application date: 2021-09-20

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Ruoyu GUO, Yuning DU, Weiwei LIU, Xiaoting YIN, Qiao ZHAO, Qiwen LIU, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA

IPC: G06K9/62

Abstract: There is provided a method and apparatus of training a model, a device, and a medium, which relate to artificial intelligence, and in particular to deep learning and image processing technology. The method may include: determining a plurality of augmented sample sets associated with a plurality of original samples; determining a first constraint according to a first model based on the plurality of augmented sample sets; determining a second constraint according to the first model and a second model based on the plurality of augmented sample sets, wherein the second constraint is associated with a difference between outputs of the first model and the second model for one augmented sample, and the first model has a complexity lower than that of the second model; and training the first model based on at least the first constraint and the second constraint, so as to obtain a trained first model.
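
The two constraints in this abstract can be illustrated with a minimal sketch. The abstract does not state the form of the first constraint, so the sketch assumes it measures how much the first (lower-complexity) model disagrees with itself across augmented copies of one original sample. The second constraint follows the abstract: the difference between the outputs of the first and second models on the same augmented sample. The mean squared difference, the `weight` parameter, and the function names are all assumptions.

```python
from typing import List, Sequence


def mean_squared_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two equal-length output vectors."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(first_model_outputs: List[List[float]],
                  second_model_outputs: List[List[float]],
                  weight: float = 1.0) -> float:
    """Combine both constraints for the augmented copies of one original sample.

    first_model_outputs[i] and second_model_outputs[i] are the two models'
    outputs for the i-th augmented copy.
    """
    if not first_model_outputs or len(first_model_outputs) != len(second_model_outputs):
        raise ValueError("need one output per augmented copy from each model")

    # First constraint (assumed form): the first model should give similar
    # outputs for different augmentations of the same original sample.
    first, pairs = 0.0, 0
    for i in range(len(first_model_outputs)):
        for j in range(i + 1, len(first_model_outputs)):
            first += mean_squared_diff(first_model_outputs[i], first_model_outputs[j])
            pairs += 1
    first = first / pairs if pairs else 0.0

    # Second constraint: difference between the two models' outputs
    # on the same augmented sample, averaged over the copies.
    second = sum(mean_squared_diff(s, t)
                 for s, t in zip(first_model_outputs, second_model_outputs))
    second /= len(first_model_outputs)

    return first + weight * second
```

Training the first model would then minimize this value over all original samples; the optimizer and model definitions are outside the scope of the sketch.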

5.

Invention publication
MODEL TRAINING METHOD, SYSTEM, DEVICE, AND MEDIUM [Under examination, published]

Publication number: US20230206080A1

Publication date: 2023-06-29

Application number: US18118339

Application date: 2023-03-07

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Shuohuan WANG, Weibao GONG, Zhihua WU, Yu SUN, Siyu DING, Yaqian HAN, Yanbin ZHAO, Yuang LIU, Dianhai YU

IPC: G06N3/094, G06N3/045

CPC classification number: G06N3/094, G06N3/045

Abstract: A model training system includes at least one first cluster and a second cluster communicating with the at least one first cluster. The at least one first cluster is configured to acquire a sample data set, generate training data according to the sample data set, and send the training data to the second cluster; and the second cluster is configured to train a pre-trained model according to the training data sent by the at least one first cluster.

6.

Invention publication
METHOD AND APPARATUS FOR DISTRIBUTING NETWORK LAYERS IN NEURAL NETWORK MODEL [Under examination, published]

Publication number: US20230206075A1

Publication date: 2023-06-29

Application number: US17991077

Application date: 2022-11-21

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Ji LIU, Zhihua WU, Danlei FENG, Minxu ZHANG, Xinxuan WU, Xuefeng YAO, Beichen MA, Dejing DOU, Dianhai YU, Yanjun MA

IPC: G06N3/08, G06N3/04

CPC classification number: G06N3/082, G06N3/04

Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to network layers in the to-be-processed neural network model and computing devices in the computing device set, the distribution schemes including corresponding relationships between the network layers and the computing devices; according to device types of the computing devices, combining the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result of each distribution scheme; obtaining an adaptive value of each distribution scheme according to the combination result of each distribution scheme; and determining a target distribution scheme from the distribution schemes according to the respective adaptive values, and taking the target distribution scheme as a distribution result of the network layers in the to-be-processed neural network model.
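
The steps in this abstract can be sketched as follows. The sketch assumes that schemes are generated at random, that only consecutive layers on the same device type are merged into a stage, and that the adaptive value is the negative of a made-up cost (per-layer compute cost plus a penalty for each stage boundary). The abstract does not specify any of these details, and all function names are hypothetical.

```python
import random
from itertools import groupby
from typing import Dict, List, Tuple

Scheme = List[str]               # scheme[i] is the device assigned to layer i
Stage = Tuple[str, List[int]]    # (device type, layer indices in the stage)


def generate_schemes(num_layers: int, devices: List[str],
                     count: int, seed: int = 0) -> List[Scheme]:
    """Generate `count` layer-to-device schemes (assumed: uniformly at random)."""
    rng = random.Random(seed)
    return [[rng.choice(devices) for _ in range(num_layers)] for _ in range(count)]


def combine_into_stages(scheme: Scheme, device_type: Dict[str, str]) -> List[Stage]:
    """Merge consecutive layers whose devices share a type into one stage."""
    stages: List[Stage] = []
    for dev_type, run in groupby(enumerate(scheme), key=lambda p: device_type[p[1]]):
        stages.append((dev_type, [index for index, _ in run]))
    return stages


def adaptive_value(stages: List[Stage], cost_per_layer: Dict[str, float],
                   switch_cost: float = 1.0) -> float:
    """Assumed fitness: higher is better, so return the negated total cost."""
    compute = sum(cost_per_layer[t] * len(layers) for t, layers in stages)
    return -(compute + switch_cost * (len(stages) - 1))


def pick_target_scheme(schemes: List[Scheme], device_type: Dict[str, str],
                       cost_per_layer: Dict[str, float]) -> Scheme:
    """Select the scheme with the highest adaptive value."""
    return max(schemes, key=lambda s: adaptive_value(
        combine_into_stages(s, device_type), cost_per_layer))
```

A caller would pass the output of `generate_schemes` to `pick_target_scheme`; a real system would replace the toy cost with measured throughput or monetary cost.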

7.

Invention application
METHOD AND APPARATUS OF PROCESSING IMAGE, DEVICE AND MEDIUM [In force]

Publication number: US20210374490A1

Publication date: 2021-12-02

Application number: US17400693

Application date: 2021-08-12

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Yuning DU, Yehua YANG, Shengyu WEI, Ruoyu GUO, Qiwen LIU, Qiao ZHAO, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA

IPC: G06K9/68, G06T5/50, G06T7/194, G06N20/00, G06K9/20

Abstract: The present disclosure provides a method and apparatus of processing an image, a device and a medium, which relate to the field of artificial intelligence, and in particular to the fields of deep learning and image processing. The method includes: determining a background image of the image, wherein the background image describes a background relative to characters in the image; determining a property of characters corresponding to a selected character section of the image; replacing the selected character section with a corresponding section in the background image, so as to obtain an adjusted image; and combining acquired target characters with the adjusted image based on the property.

8.

Invention application
METHOD AND SYSTEM OF TRAINING DEEP LEARNING MODEL, DEVICE, AND MEDIUM [In force]

Publication number: US20240394190A1

Publication date: 2024-11-28

Application number: US18696757

Application date: 2022-09-27

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Minxu ZHANG, Haifeng WANG, Fan ZHANG, Xinxuan WU, Xuefeng YAO, Danlei FENG, Zhihua WU, Zhipeng TAN, Jie DING, Dianhai YU

IPC: G06F12/0873, G06F12/0815, G06F15/80

Abstract: The present application provides a method of training a deep learning model. A specific implementation solution of the method of training the deep learning model includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory in a first network parameter required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
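
The slot bookkeeping in this abstract can be illustrated with a toy model. The sketch assumes one embedding parameter occupies exactly one storage slot and identifies parameters by integer ID. The class name `SlotCache` and its methods are hypothetical, and the abstract does not say what happens when the remaining slots are insufficient, so the sketch simply reports failure.

```python
from typing import Dict, List


class SlotCache:
    """Toy model of a target memory divided into fixed storage slots."""

    def __init__(self, num_slots: int) -> None:
        self.num_slots = num_slots
        # Mapping between network parameters and storage slots
        # (parameter id -> slot index).
        self.slot_of: Dict[int, int] = {}

    def missing(self, needed: List[int]) -> List[int]:
        """Parameters required by the training data but not yet in memory."""
        result: List[int] = []
        for pid in needed:
            if pid not in self.slot_of and pid not in result:
                result.append(pid)
        return result

    def free_slots(self) -> List[int]:
        """Remaining storage slots, derived from the mapping."""
        used = set(self.slot_of.values())
        return [slot for slot in range(self.num_slots) if slot not in used]

    def try_write(self, needed: List[int]) -> bool:
        """Write the missing parameters only if the remaining slots suffice."""
        to_write = self.missing(needed)
        free = self.free_slots()
        if len(to_write) > len(free):
            return False  # storage requirement not met; nothing is written
        for pid, slot in zip(to_write, free):
            self.slot_of[pid] = slot
        return True
```

Once `try_write` succeeds, the computing core can read and adjust the cached parameters directly from the target memory for the current training round.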

9.

Invention publication
METHOD FOR TRAINING FEATURE EXTRACTION MODEL, METHOD FOR CLASSIFYING IMAGE, AND RELATED APPARATUSES [Under examination, published]

Publication number: US20230215148A1

Publication date: 2023-07-06

Application number: US18183590

Application date: 2023-03-14

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Shuilong DONG, Sensen HE, Shengyu WEI, Cheng CUI, Yuning DU, Tingquan GAO, Shao ZENG, Ying ZHOU, Xueying LYU, Yi LIU, Qiao ZHAO, Qiwen LIU, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA

IPC: G06V10/774, G06V10/40, G06V10/74, G06V10/764, G06V10/776, G06V10/778

CPC classification number: G06V10/774, G06V10/40, G06V10/761, G06V10/764, G06V10/776, G06V10/7784

Abstract: The present disclosure provides a method for training a feature extraction model, a method for classifying an image and related apparatuses, and relates to the field of artificial intelligence technology such as deep learning and image recognition. The scheme comprises: extracting an image feature of each sample image in a sample image set using a basic feature extraction module of an initial feature extraction model, to obtain an initial feature vector set; performing normalization processing on each initial feature vector in the initial feature vector set using a normalization processing module of the initial feature extraction model, to obtain each normalized feature vector; and guiding training for the initial feature extraction model through a preset high discriminative loss function, to obtain a target feature extraction model as a training result.

10.

Invention application
IMAGE PROCESSING [In force]

Publication number: US20230085732A1

Publication date: 2023-03-23

Application number: US18058543

Application date: 2022-11-23

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yuying HAO, Yi LIU, Zewu WU, Baohua LAI, Zeyu CHEN, Dianhai YU, Yanjun MA, Zhiliang YU, Xueying LV

IPC: G06T7/11

Abstract: The present disclosure provides an image processing method and apparatus, and relates to the field of image processing, and in particular to the field of image annotation. An implementation is: obtaining an image to be processed including a target region to be annotated; in response to a first click on the target region, performing a first operation to expand a predicted region for the target region based on a click position of the first click; in response to a second click in a position where the predicted region exceeds the target region, performing a second operation to reduce the predicted region based on a click position of the second click; and in response to determining that a difference between the predicted region and the target region meets a preset condition, obtaining an outline of the predicted region to annotate the target region.
