-
Publication No.: US20250036920A1
Publication date: 2025-01-30
Application No.: US18026140
Filing date: 2022-09-20
Inventor: Liang SHEN, Haifeng WANG, Huachao WU, Weibao GONG, Zhihua WU, Dianhai YU
IPC: G06N3/045, G06N3/0495
Abstract: The present disclosure provides a mixture-of-experts (MoE) model implementation method and system, an electronic device, and a storage medium, and relates to the field of artificial intelligence (AI) such as deep learning and distributed storage. The method includes: constructing a communication group, the communication group including a tensor-parallelism communication group, the tensor-parallelism communication group including at least two computing devices, tensor-parallelism segmentation being adopted for sparse parameters of each of the computing devices in a same tensor-parallelism communication group; and training an MoE model based on the communication group. By use of the solutions of the present disclosure, normal operation of model training can be guaranteed.
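The grouping and segmentation steps in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `build_tp_groups` and `shard_columns` are hypothetical names, device ranks are plain integers, and the sparse (expert) parameter is a nested list that is split column-wise among the devices of one group.

```python
def build_tp_groups(world_size, tp_degree):
    """Partition device ranks into tensor-parallel groups of tp_degree ranks each."""
    if tp_degree < 2:
        raise ValueError("a tensor-parallel group needs at least two devices")
    if world_size % tp_degree != 0:
        raise ValueError("world_size must be divisible by tp_degree")
    return [list(range(start, start + tp_degree))
            for start in range(0, world_size, tp_degree)]


def shard_columns(weight, group, rank):
    """Return the column slice of `weight` owned by `rank` within its group."""
    cols = len(weight[0])
    if cols % len(group) != 0:
        raise ValueError("column count must be divisible by group size")
    width = cols // len(group)
    offset = group.index(rank) * width
    return [row[offset:offset + width] for row in weight]


# Assumed toy setup: 4 devices, 2 devices per tensor-parallel group.
groups = build_tp_groups(world_size=4, tp_degree=2)
expert_weight = [[1, 2, 3, 4], [5, 6, 7, 8]]  # toy 2x4 expert parameter
shard = shard_columns(expert_weight, groups[1], rank=3)
```

Column-wise splitting is only one possible segmentation; the abstract does not say along which dimension the sparse parameters are divided.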
-
Publication No.: US20220374713A1
Publication date: 2022-11-24
Application No.: US17880070
Filing date: 2022-08-03
Inventor: Zhihua WU, Dianhai YU, Yulong AO, Weibao GONG
IPC: G06N3/08
Abstract: The present disclosure provides a method and apparatus for performing distributed training on a deep learning model. The method may include: generating a distributed computation view based on data information of a to-be-trained deep learning model; generating a cluster resource view based on property information of a cluster hardware resource corresponding to the to-be-trained deep learning model; determining a target segmentation strategy of a distributed training task based on the distributed computation view and the cluster resource view; and performing distributed training on the to-be-trained deep learning model based on the target segmentation strategy.
-
Publication No.: US20220222111A1
Publication date: 2022-07-14
Application No.: US17707895
Filing date: 2022-03-29
Inventor: Haifeng WANG, Xiaoguang HU, Dianhai YU, Yanjun MA, Tian WU
Abstract: A scheduling method for a deep learning framework, a scheduling apparatus, an electronic device, a storage medium, and a program product are provided, and can be used in the field of artificial intelligence, especially in the fields of machine learning, deep learning, etc. The method includes: receiving a processing request for processing a plurality of tasks by using a dedicated processing unit, the processing request including scheduling requirements for the plurality of tasks, and each of the plurality of tasks being associated with execution of multi-batch data processing; and scheduling, based on the scheduling requirements for the plurality of tasks in batches of data, the dedicated processing unit to process the plurality of tasks.
-
Publication No.: US20220004811A1
Publication date: 2022-01-06
Application No.: US17479061
Filing date: 2021-09-20
Inventor: Ruoyu GUO, Yuning DU, Weiwei LIU, Xiaoting YIN, Qiao ZHAO, Qiwen LIU, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA
IPC: G06K9/62
Abstract: There is provided a method and apparatus of training a model, a device, and a medium, which relate to artificial intelligence, and in particular to deep learning and image processing technology. The method may include: determining a plurality of augmented sample sets associated with a plurality of original samples; determining a first constraint according to a first model based on the plurality of augmented sample sets; determining a second constraint according to the first model and a second model based on the plurality of augmented sample sets, wherein the second constraint is associated with a difference between outputs of the first model and the second model for one augmented sample, and the first model has a complexity lower than that of the second model; and training the first model based on at least the first constraint and the second constraint, so as to obtain a trained first model.
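One way to read the two constraints is as two loss terms that are summed when training the smaller model. The sketch below is an assumption-laden illustration, not the claimed method: the abstract does not define the first constraint, so it is modeled here as agreement of the first model across two augmentations of the same sample, and `mean_squared_diff`, `total_loss`, and `weight` are hypothetical names.

```python
def mean_squared_diff(a, b):
    """Mean squared difference between two equally long output vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def total_loss(first_model, second_model, aug_a, aug_b, weight=1.0):
    """Combine both constraints into one scalar training objective.

    first_model  : the lower-complexity model being trained
    second_model : the higher-complexity reference model
    aug_a, aug_b : two augmented versions of the same original sample
    """
    # Assumed first constraint: the first model should agree with itself
    # across augmentations of one original sample.
    first = mean_squared_diff(first_model(aug_a), first_model(aug_b))
    # Second constraint (from the abstract): difference between the outputs
    # of the two models for one augmented sample.
    second = mean_squared_diff(first_model(aug_a), second_model(aug_a))
    return first + weight * second
```

A squared difference is used purely for simplicity; the abstract does not name a distance measure.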
-
Publication No.: US20230206080A1
Publication date: 2023-06-29
Application No.: US18118339
Filing date: 2023-03-07
Inventor: Shuohuan WANG, Weibao GONG, Zhihua WU, Yu SUN, Siyu DING, Yaqian HAN, Yanbin ZHAO, Yuang LIU, Dianhai YU
Abstract: A model training system includes at least one first cluster and a second cluster communicating with the at least one first cluster. The at least one first cluster is configured to acquire a sample data set, generate training data according to the sample data set, and send the training data to the second cluster; and the second cluster is configured to train a pre-trained model according to the training data sent by the at least one first cluster.
-
Publication No.: US20230206075A1
Publication date: 2023-06-29
Application No.: US17991077
Filing date: 2022-11-21
Inventor: Ji LIU, Zhihua WU, Danlei FENG, Minxu ZHANG, Xinxuan WU, Xuefeng YAO, Beichen MA, Dejing DOU, Dianhai YU, Yanjun MA
Abstract: A method for distributing network layers in a neural network model includes: acquiring a to-be-processed neural network model and a computing device set; generating a target number of distribution schemes according to network layers in the to-be-processed neural network model and computing devices in the computing device set, the distribution schemes including corresponding relationships between the network layers and the computing devices; according to device types of the computing devices, combining the network layers corresponding to the same device type in each distribution scheme into one stage, to obtain a combination result of each distribution scheme; obtaining an adaptive value of each distribution scheme according to the combination result of each distribution scheme; and determining a target distribution scheme from the distribution schemes according to the respective adaptive values, and taking the target distribution scheme as a distribution result of the network layers in the to-be-processed neural network model.
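The scheme generation, stage combination, and selection steps can be sketched as below. Everything here is a hedged illustration: the abstract does not say how schemes are generated or how the adaptive value is computed, so schemes are drawn at random, only consecutive layers on the same device type are merged, and the adaptive value is an assumed score (the negated time of the slowest stage). All function and parameter names are hypothetical.

```python
import random
from itertools import groupby


def combine_into_stages(scheme, device_types):
    """Merge consecutive layers placed on the same device type into one stage.

    scheme       : list where scheme[i] is the device assigned to layer i
    device_types : dict mapping device name -> device type
    """
    typed = [(layer, device_types[dev]) for layer, dev in enumerate(scheme)]
    return [(dtype, [layer for layer, _ in run])
            for dtype, run in groupby(typed, key=lambda pair: pair[1])]


def adaptive_value(stages, layer_cost, speed):
    """Assumed fitness: negated time of the slowest stage (higher is better)."""
    return -max(sum(layer_cost[l] for l in layers) / speed[dtype]
                for dtype, layers in stages)


def choose_scheme(layer_cost, device_types, speed, target_number, seed=0):
    """Generate target_number random schemes and return the best-scoring one."""
    rng = random.Random(seed)
    devices = list(device_types)
    schemes = [[rng.choice(devices) for _ in layer_cost]
               for _ in range(target_number)]
    return max(schemes, key=lambda s: adaptive_value(
        combine_into_stages(s, device_types), layer_cost, speed))
```

For example, with `device_types = {"gpu0": "gpu", "gpu1": "gpu", "cpu0": "cpu"}`, the scheme `["gpu0", "gpu1", "cpu0", "gpu0"]` collapses into three stages: layers 0 and 1 on the GPU type, layer 2 on the CPU type, and layer 3 on the GPU type again.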
-
Publication No.: US20210374490A1
Publication date: 2021-12-02
Application No.: US17400693
Filing date: 2021-08-12
Inventor: Yuning DU, Yehua YANG, Shengyu WEI, Ruoyu GUO, Qiwen LIU, Qiao ZHAO, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA
Abstract: The present disclosure provides a method and apparatus of processing an image, a device and a medium, which relate to the field of artificial intelligence, and in particular to the fields of deep learning and image processing. The method includes: determining a background image of the image, wherein the background image describes a background relative to characters in the image; determining a property of characters corresponding to a selected character section of the image; replacing the selected character section with a corresponding section in the background image, so as to obtain an adjusted image; and combining acquired target characters with the adjusted image based on the property.
-
Publication No.: US20240394190A1
Publication date: 2024-11-28
Application No.: US18696757
Filing date: 2022-09-27
Inventor: Minxu ZHANG, Haifeng WANG, Fan ZHANG, Xinxuan WU, Xuefeng YAO, Danlei FENG, Zhihua WU, Zhipeng TAN, Jie DING, Dianhai YU
IPC: G06F12/0873, G06F12/0815, G06F15/80
Abstract: The present application provides a method of training a deep learning model. A specific implementation solution of the method of training the deep learning model includes: determining, according to first training data for a current training round, a first target parameter required to be written into a target memory in a first network parameter required by an embedding of the first training data, wherein the target memory is a memory contained in a target processor; determining a remaining storage slot in the target memory according to a first mapping relationship between a storage slot of the target memory and a network parameter; and writing, in response to the remaining storage slot meeting a storage requirement of the first target parameter, the first target parameter into the target memory so that a computing core contained in the target processor adjusts the first network parameter according to the first training data.
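The slot bookkeeping described in this abstract can be sketched as a small cache class. This is a toy model under assumptions, not the patented design: `SlotCache` and its methods are hypothetical names, parameters are identified by integer ids, each parameter occupies exactly one slot, and the behavior when slots run out (returning `False`) is a placeholder because the abstract does not describe that case.

```python
class SlotCache:
    """Toy model of a processor memory with a fixed number of storage slots."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slot_of = {}  # first mapping relationship: parameter id -> slot

    def remaining_slots(self):
        """Slots that the mapping does not yet assign to any parameter."""
        used = set(self.slot_of.values())
        return [s for s in range(self.num_slots) if s not in used]

    def write_missing(self, required_ids):
        """Write parameters needed this round that are not cached yet.

        Returns True if all target parameters fit into the remaining slots.
        """
        # Target parameters: required by this round's embeddings, not cached.
        targets = [p for p in dict.fromkeys(required_ids)
                   if p not in self.slot_of]
        free = self.remaining_slots()
        if len(targets) > len(free):
            return False  # assumed placeholder; not specified in the abstract
        for param_id, slot in zip(targets, free):
            self.slot_of[param_id] = slot
        return True
```

A training round would call `write_missing` with the embedding ids of the current batch before the computing core updates those parameters.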
-
Publication No.: US20230215148A1
Publication date: 2023-07-06
Application No.: US18183590
Filing date: 2023-03-14
Inventor: Shuilong DONG, Sensen HE, Shengyu WEI, Cheng CUI, Yuning DU, Tingquan GAO, Shao ZENG, Ying ZHOU, Xueying LYU, Yi LIU, Qiao ZHAO, Qiwen LIU, Ran BI, Xiaoguang HU, Dianhai YU, Yanjun MA
IPC: G06V10/774, G06V10/40, G06V10/74, G06V10/764, G06V10/776, G06V10/778
CPC classification number: G06V10/774, G06V10/40, G06V10/761, G06V10/764, G06V10/776, G06V10/7784
Abstract: The present disclosure provides a method for training a feature extraction model, a method for classifying an image and related apparatuses, and relates to the field of artificial intelligence technology such as deep learning and image recognition. The scheme comprises: extracting an image feature of each sample image in a sample image set using a basic feature extraction module of an initial feature extraction model, to obtain an initial feature vector set; performing normalization processing on each initial feature vector in the initial feature vector set using a normalization processing module of the initial feature extraction model, to obtain each normalized feature vector; and guiding training for the initial feature extraction model through a preset high discriminative loss function, to obtain a target feature extraction model as a training result.
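The normalization step can be illustrated with a short sketch. The abstract does not say which normalization is used, so L2 (unit-length) normalization is assumed here, and `l2_normalize` and `eps` are hypothetical names. The loss function is left out because the abstract gives no formula for it.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a feature vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in vector))
    # eps guards against division by zero for an all-zero feature vector.
    return [x / max(norm, eps) for x in vector]


def normalize_all(feature_vectors):
    """Apply the normalization to every vector in the initial feature set."""
    return [l2_normalize(v) for v in feature_vectors]
```

After this step every non-zero feature vector has length 1, so comparisons between features depend only on their direction.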
-
Publication No.: US20230085732A1
Publication date: 2023-03-23
Application No.: US18058543
Filing date: 2022-11-23
Inventor: Yuying HAO, Yi LIU, Zewu WU, Baohua LAI, Zeyu CHEN, Dianhai YU, Yanjun MA, Zhiliang YU, Xueying LV
IPC: G06T7/11
Abstract: The present disclosure provides an image processing method and apparatus, and relates to the field of image processing, and in particular to the field of image annotation. An implementation is: obtaining an image to be processed including a target region to be annotated; in response to a first click on the target region, performing a first operation to expand a predicted region for the target region based on a click position of the first click; in response to a second click in a position where the predicted region exceeds the target region, performing a second operation to reduce the predicted region based on a click position of the second click; and in response to determining that a difference between the predicted region and the target region meets a preset condition, obtaining an outline of the predicted region to annotate the target region.