-
Publication No.: US20230252354A1
Publication Date: 2023-08-10
Application No.: US18179627
Filing Date: 2023-03-07
Inventor: Junyuan SHANG , Shuohuan WANG , Siyu DING , Yanbin ZHAO , Chao PANG , Yu SUN , Hao TIAN , Hua WU , Haifeng WANG
IPC: G06N20/00 , G06F40/40 , G06F40/279
CPC classification number: G06N20/00 , G06F40/40 , G06F40/279
Abstract: A method for pre-training a language model includes: constructing a pre-training language data set comprising unsupervised language data and supervised language data; generating a hierarchical multi-template and multi-task language data set based on the pre-training language data set; and pre-training the language model based on the hierarchical multi-template and multi-task language data set.
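The claimed flow maps onto a small data-pipeline sketch. Below is a minimal Python illustration, assuming toy string data and a single-task template hierarchy; every name here (UNSUPERVISED, TEMPLATES, build_multi_template_multi_task_dataset) is hypothetical and not taken from the patent.

```python
# Minimal sketch: mix raw unsupervised text with supervised examples
# rendered through several templates per task, then shuffle for pre-training.
import random

UNSUPERVISED = ["raw web text ...", "more raw text ..."]
SUPERVISED = [{"task": "sentiment", "text": "great movie", "label": "positive"}]

# One template family per task: several phrasings of the same task.
TEMPLATES = {
    "sentiment": [
        "Review: {text} Sentiment: {label}",
        "Is the following positive or negative? {text} Answer: {label}",
    ],
}

def build_multi_template_multi_task_dataset():
    """Keep raw text as-is; expand each supervised example via every template."""
    examples = list(UNSUPERVISED)
    for item in SUPERVISED:
        for template in TEMPLATES[item["task"]]:
            examples.append(template.format(**item))
    random.shuffle(examples)
    return examples

for sample in build_multi_template_multi_task_dataset():
    print(sample)  # each sample would feed a standard LM pre-training objective
```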
-
Publication No.: US20250094713A1
Publication Date: 2025-03-20
Application No.: US18967529
Filing Date: 2024-12-03
Inventor: Shuohuan WANG , Yekun CHAI , Siyu DING , Junyuan SHANG , Zhenyu ZHANG , Yu SUN , Hao TIAN , Hua WU , Haifeng WANG
IPC: G06F40/284 , G06F16/3329
Abstract: A multimodal data generation method is provided. The method includes: inputting a query data sequence into a multimodal model, to obtain a plurality of tokens in a response data sequence, where a current token is generated through the following operations: inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model generates the current token based on the query data sequence and the current response data sequence, in response to determining that the current token belongs to a first data modality; or inputting the query data sequence and a current response data sequence into the multimodal model, so that the multimodal model denoises an initial token sequence based on the query data sequence and the current response data sequence, to generate a result token sequence, in response to determining that the current token belongs to a second data modality.
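The modality-dependent branching in this abstract can be pictured as a decoding loop that either samples one token autoregressively or denoises a whole token sequence. The sketch below is illustrative only: the dummy model, its method names, and the two modality labels are all assumptions, not the patent's API.

```python
import random

class DummyMultimodalModel:
    """Stand-in for the multimodal model; every behavior here is fake."""
    eos_token = "<eos>"
    denoise_steps = 4

    def predict_modality(self, query, response):
        # the real model decides which modality the current token belongs to
        return random.choice(["first", "second"])

    def next_token(self, query, response):
        # first data modality: ordinary autoregressive token prediction
        return random.choice(["hello", "world", self.eos_token])

    def initial_token_sequence(self):
        return ["<noise>"] * 3

    def denoise(self, tokens, query, response, step):
        # second data modality: one denoising step over the whole sequence
        return [f"<img-{step}>" for _ in tokens]

def generate_response(model, query_tokens, max_len=16):
    response = []
    while len(response) < max_len:
        if model.predict_modality(query_tokens, response) == "first":
            token = model.next_token(query_tokens, response)
            response.append(token)
            if token == model.eos_token:
                break
        else:
            seq = model.initial_token_sequence()
            for step in range(model.denoise_steps):
                seq = model.denoise(seq, query_tokens, response, step)
            response.extend(seq)  # result token sequence joins the response
    return response

print(generate_response(DummyMultimodalModel(), ["describe", "a", "cat"]))
```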
-
Publication No.: US20230206080A1
Publication Date: 2023-06-29
Application No.: US18118339
Filing Date: 2023-03-07
Inventor: Shuohuan WANG , Weibao GONG , Zhihua WU , Yu SUN , Siyu DING , Yaqian HAN , Yanbin ZHAO , Yuang LIU , Dianhai YU
Abstract: A model training system includes at least one first cluster and a second cluster communicating with the at least one first cluster. The at least one first cluster is configured to acquire a sample data set, generate training data according to the sample data set, and send the training data to the second cluster; and the second cluster is configured to train a pre-trained model according to the training data sent by the at least one first cluster.
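As a rough analogy for the producer/consumer split between the clusters, the toy sketch below uses an in-process queue to stand in for the network link between them; the function names and data are illustrative, not the patent's design.

```python
# Toy sketch: one "cluster" prepares training data, another consumes it.
from multiprocessing import Process, Queue

def first_cluster(link: Queue):
    """Acquire a sample data set and turn it into training data."""
    samples = ["raw sample %d" % i for i in range(5)]   # acquire
    training_data = [s.upper() for s in samples]        # preprocess
    for item in training_data:
        link.put(item)                                  # send to the trainer
    link.put(None)                                      # end-of-stream marker

def second_cluster(link: Queue):
    """Consume training data and run training steps on the pre-trained model."""
    while (item := link.get()) is not None:
        print("training step on:", item)                # model update would go here

if __name__ == "__main__":
    link = Queue()
    producer = Process(target=first_cluster, args=(link,))
    producer.start()
    second_cluster(link)
    producer.join()
```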
-
Publication No.: US20220327290A1
Publication Date: 2022-10-13
Application No.: US17852413
Filing Date: 2022-06-29
Inventor: Junyuan SHANG , Shuohuan WANG , Siyu DING
Abstract: There is provided a method of training a feature determination model, which relates to the fields of deep learning and natural language processing. The method includes: determining, by a plurality of feature determination layers arranged in stages, a feature vector for each segment in a pre-training text; and pre-training the feature determination model according to the feature vector. A current stage feature vector is determined by a feature determination layer of a current stage according to a preceding segment feature vector determined for a preceding segment, and a preceding stage feature vector determined by a feature determination layer of a preceding stage. A method of training a feature determination model for a target task, a method of performing semantic analysis for a target task, an electronic device, and a computer storage medium are also provided.
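The recurrence the abstract describes is two-dimensional: the feature at stage s for segment t depends on the same stage's feature for the preceding segment and the preceding stage's feature for the current segment. A minimal sketch, assuming each feature determination layer is reduced to a toy combine() function (an assumption, not the patent's layer):

```python
# Sketch of the two-way (segment x stage) recurrence with string features.

def combine(prev_segment_feat, prev_stage_feat):
    """Toy stand-in for one feature determination layer."""
    return f"f({prev_segment_feat},{prev_stage_feat})"

def encode_segments(segments, num_stages=2):
    # feats[s][t]: feature from stage s for segment t (index 0 = initial state)
    feats = [["<init>"] * (len(segments) + 1) for _ in range(num_stages + 1)]
    feats[0] = ["<init>"] + list(segments)   # stage 0 = raw segment inputs
    for s in range(1, num_stages + 1):
        for t in range(1, len(segments) + 1):
            # preceding segment's feature at this stage + preceding stage's feature
            feats[s][t] = combine(feats[s][t - 1], feats[s - 1][t])
    return feats[num_stages][1:]

print(encode_segments(["seg1", "seg2", "seg3"]))
```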
-
Publication No.: US20210312139A1
Publication Date: 2021-10-07
Application No.: US17353884
Filing Date: 2021-06-22
Inventor: Shuohuan WANG , Siyu DING , Junyuan SHANG , Yu SUN
Abstract: A method and apparatus of generating a semantic feature, a method and apparatus of training a model, an electronic device, and a storage medium are provided. The method of generating the semantic feature includes: segmenting a target document to obtain a segment sequence of the target document; generating a semantic feature of each document segment in the segment sequence of the target document by using a pre-trained bidirectional semantic encoding model; and acquiring the semantic feature of the target document based on the semantic feature of each document segment in the segment sequence of the target document. The present disclosure further provides a method of training a bidirectional semantic encoding model.
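The segment-then-pool flow is straightforward to sketch. Below, a hash-based toy encoder stands in for the pre-trained bidirectional semantic encoding model, and mean pooling stands in for the (unspecified) aggregation step; both are assumptions.

```python
# Minimal sketch: segment the document, encode each segment, pool the features.
import hashlib

def segment(document: str, size: int = 20) -> list[str]:
    """Split the target document into a fixed-size segment sequence."""
    return [document[i:i + size] for i in range(0, len(document), size)]

def encode_segment(seg: str) -> list[float]:
    """Toy stand-in for the bidirectional semantic encoder (4-dim feature)."""
    digest = hashlib.sha1(seg.encode()).digest()
    return [b / 255.0 for b in digest[:4]]

def document_feature(document: str) -> list[float]:
    """Average per-segment features to get the document-level semantic feature."""
    seg_feats = [encode_segment(s) for s in segment(document)]
    return [sum(col) / len(seg_feats) for col in zip(*seg_feats)]

print(document_feature("A long target document that gets split into segments."))
```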
-
Publication No.: US20240412002A1
Publication Date: 2024-12-12
Application No.: US18747641
Filing Date: 2024-06-19
Inventor: Yanbin ZHAO , Siyu DING , Shuohuan WANG , Yu SUN , Hao TIAN , Hua WU , Haifeng WANG
IPC: G06F40/35
Abstract: A method is provided. The method includes: obtaining a first sample dataset; inputting at least one first question text corresponding to at least one piece of first sample data into a dialog model separately to obtain at least one first answer prediction result; inputting each second question text into the dialog model to obtain a second answer prediction result output by the dialog model; inputting the second answer prediction result into a reward model to obtain a score of the second answer prediction result output by the reward model; determining a comprehensive loss based on the at least one first answer prediction result, a first answer text of each of the at least one piece of first sample data, and a score corresponding to each of at least one piece of second sample data; and adjusting at least one parameter of the dialog model based on the comprehensive loss.
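The comprehensive loss combines a supervised term (first answer predictions against first answer texts) with a reward-model term (scores on second answer predictions). The sketch below assumes a simple weighted sum with weight alpha; the weighting, the toy per-example loss, and all numbers are illustrative, not the patent's formulation.

```python
# Sketch of the comprehensive loss as a weighted sum of two terms.

def supervised_loss(predictions, answer_texts):
    """Toy 0/1 loss between predicted and reference first answers."""
    return sum(0.0 if p == a else 1.0 for p, a in zip(predictions, answer_texts))

def comprehensive_loss(first_preds, first_answers, reward_scores, alpha=0.5):
    # supervised term: first answer predictions vs. first answer texts
    sup = supervised_loss(first_preds, first_answers)
    # reward term: higher reward-model scores should lower the loss
    rew = -sum(reward_scores) / len(reward_scores)
    return sup + alpha * rew

loss = comprehensive_loss(
    first_preds=["paris", "blue"],
    first_answers=["paris", "red"],
    reward_scores=[0.9, 0.4],  # reward-model scores on second answer predictions
)
print(loss)  # this value would drive the dialog model's parameter update
```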
-
Publication No.: US20230040095A1
Publication Date: 2023-02-09
Application No.: US17889218
Filing Date: 2022-08-16
Inventor: Junyuan SHANG , Shuohuan WANG , Siyu DING , Yanbin ZHAO , Chao PANG , Yu SUN
IPC: G06F40/40 , G06F40/289
Abstract: A method and apparatus for pre-training a model, a device, a storage medium, and a program product are provided. An embodiment of the method includes: acquiring a sample natural language text; generating N types of prompt words based on the sample natural language text, where N is a positive integer; generating sample input data based on the sample natural language text and the N types of prompt words; and training an initial language model based on the sample input data, to obtain a pre-trained language model.
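The prompt-word construction step can be sketched concretely. The example below assumes N = 3 prompt-word types (task marker, keyword, length marker) and trivially derives each from the text; the types and the derivations are hypothetical choices, not the patent's.

```python
# Illustrative sketch: derive N types of prompt words and build sample input.

def generate_prompt_words(text: str) -> dict[str, str]:
    """Build N = 3 types of prompt words from the sample natural language text."""
    return {
        "task": "[CLS-TASK]",
        "keyword": text.split()[0],             # toy keyword extraction
        "length": f"[LEN-{len(text.split())}]",
    }

def build_sample_input(text: str) -> str:
    """Prepend the prompt words to the text to form the model's input."""
    prompts = generate_prompt_words(text)
    return " ".join(prompts.values()) + " " + text

print(build_sample_input("pre-training improves downstream accuracy"))
# strings like this would feed the initial language model's training loop
```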
-
Publication No.: US20220293092A1
Publication Date: 2022-09-15
Application No.: US17828773
Filing Date: 2022-05-31
Inventor: Siyu DING , Chao PANG , Shuohuan WANG , Yanbin ZHAO , Junyuan SHANG , Yu SUN , Shikun FENG , Hao TIAN , Hua WU , Haifeng WANG
Abstract: The present application provides a method of training a natural language processing model, which relates to the field of artificial intelligence, and in particular to natural language processing. A specific implementation scheme includes: performing semantic learning for multi-tasks on an input text, so as to obtain a semantic feature for the multi-tasks, wherein the multi-tasks include a plurality of branch tasks; performing feature learning for each branch task based on the semantic feature, so as to obtain a first output result for each branch task; calculating a loss for each branch task according to the first output result for the branch task; and adjusting a parameter of the natural language processing model according to the loss for each branch task. The present application further provides a method of processing natural language, an electronic device, and a storage medium.
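The shared-semantics-plus-branch-heads structure is the classic multi-task setup. Below is a compact PyTorch sketch, assuming a single linear layer as the shared encoder, one linear head per branch task, and a summed cross-entropy loss; the dimensions, task names, and loss combination are all illustrative assumptions.

```python
# Compact multi-task sketch: shared encoder, one head and one loss per task.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, dim=16, tasks=("classify", "match")):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)           # shared semantic learning
        self.heads = nn.ModuleDict(                  # one branch per task
            {t: nn.Linear(dim, 2) for t in tasks})

    def forward(self, x):
        shared = torch.relu(self.encoder(x))         # semantic feature
        return {t: head(shared) for t, head in self.heads.items()}

model = MultiTaskModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(4, 16)                               # toy input batch
labels = {t: torch.randint(0, 2, (4,)) for t in model.heads}

outputs = model(x)
# one loss per branch task, summed before the parameter update
loss = sum(nn.functional.cross_entropy(outputs[t], labels[t]) for t in model.heads)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```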