-
Publication No.: US20230222344A1
Publication Date: 2023-07-13
Application No.: US18118859
Filing Date: 2023-03-08
Inventors: Yekun Chai, Shuohuan Wang, Yu Sun
IPC: G06N3/082
CPC classification number: G06N3/082
Abstract: A method for determining a prompt vector of a pre-trained model includes: obtaining a first one of prompt vectors and a first vector corresponding to sample data; obtaining N pruned models by performing N different pruning processes on the pre-trained model, where N is any integer greater than 1; obtaining a first score corresponding to the first one of the prompt vectors by fusing the first vector with the first one of the prompt vectors and inputting the fused result into the N pruned models respectively; determining a second one of the prompt vectors by modifying the first one of the prompt vectors based on the first score; and, based on the second one of the prompt vectors, returning to the step of obtaining the first score until a target prompt vector corresponding to the sample data is determined.
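The following is a minimal, runnable sketch of the claimed loop in plain NumPy. Everything concrete here is an assumption made for illustration: the magnitude-based `prune` rule, addition as the fusion step, averaging as the way the N pruned models' scores are combined, and the random-search update of the prompt vector are not the patent's actual procedures.

```python
# A minimal sketch of the claimed loop; prune, fusion, scoring, and the
# modification rule are all hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(0)
DIM, N = 16, 3  # embedding size and number of pruned models (N > 1)

# Stand-in "pre-trained model": a single weight matrix scoring a vector.
pretrained_W = rng.normal(size=(DIM, 1))

def prune(W, drop_ratio):
    """Hypothetical pruning: zero out the smallest-magnitude weights."""
    W = W.copy()
    k = int(W.size * drop_ratio)
    idx = np.argsort(np.abs(W), axis=None)[:k]
    W.flat[idx] = 0.0
    return W

# N pruned models obtained by N different pruning processes.
pruned_models = [prune(pretrained_W, r) for r in (0.2, 0.4, 0.6)]

sample_vec = rng.normal(size=DIM)   # "first vector" for the sample data
prompt_vec = rng.normal(size=DIM)   # "first one of the prompt vectors"

def first_score(prompt, sample):
    """Fuse the two vectors (here: addition) and average the N pruned
    models' outputs; both choices are assumptions."""
    fused = prompt + sample
    return float(np.mean([fused @ W for W in pruned_models]))

score, lr = first_score(prompt_vec, sample_vec), 0.05
for step in range(200):  # iterate until the prompt vector converges
    # Modify the prompt vector based on the score (random-search update
    # used purely for illustration).
    candidate = prompt_vec + lr * rng.normal(size=DIM)
    cand_score = first_score(candidate, sample_vec)
    if cand_score > score:           # keep improvements only
        prompt_vec, score = candidate, cand_score
print("target prompt vector score:", round(score, 4))
```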
-
Publication No.: US20220414474A1
Publication Date: 2022-12-29
Application No.: US17901803
Filing Date: 2022-09-01
Inventors: Hongjian Shi, Xinwei Feng, Feifei Li, Chenyang Guo, Xueqian Wu, Meng Tian, Yu Sun
IPC: G06N3/08, G06F16/953
Abstract: A search method based on a neural network model is provided. The neural network model includes a semantic representation model, a recall model, and a ranking model. The present disclosure relates to the field of artificial intelligence, and in particular to the technical field of search. An implementation of the method comprises: inputting a target search and a plurality of objects to be matched into the semantic representation model to obtain a first output of the semantic representation model; inputting the first output of the semantic representation model into the recall model, and obtaining at least one recall object matching the target search from the plurality of objects to be matched by using the recall model; and inputting a second output of the semantic representation model into the ranking model, and obtaining a matching value of each of the at least one recall object by using the ranking model.
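A minimal sketch of the three-stage pipeline follows, with toy linear stand-ins for the trained networks; `SemanticModel`, `recall_model`, and `rank_model` are hypothetical illustrations of how the two outputs of the semantic representation model could feed recall and ranking.

```python
# A minimal sketch of the semantic -> recall -> ranking pipeline with
# toy linear modules standing in for the trained networks.
import numpy as np

rng = np.random.default_rng(1)
DIM = 8

class SemanticModel:
    """Maps the target search and candidate objects to embeddings; its
    'first output' feeds recall and its 'second output' feeds ranking."""
    def __init__(self):
        self.W1 = rng.normal(size=(DIM, DIM))
        self.W2 = rng.normal(size=(DIM, DIM))
    def __call__(self, query, objects):
        first = (query @ self.W1, objects @ self.W1)   # for recall
        second = (query @ self.W2, objects @ self.W2)  # for ranking
        return first, second

def recall_model(first, top_k=3):
    """Pick the top-k objects whose embeddings best match the query."""
    q, objs = first
    sims = objs @ q
    return np.argsort(-sims)[:top_k]

def rank_model(second, recalled):
    """Produce a matching value for each recalled object."""
    q, objs = second
    return {int(i): float(objs[i] @ q) for i in recalled}

query = rng.normal(size=DIM)              # target search
objects = rng.normal(size=(10, DIM))      # objects to be matched
first_out, second_out = SemanticModel()(query, objects)
recalled = recall_model(first_out)        # at least one recall object
print(rank_model(second_out, recalled))   # matching value per object
```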
-
Publication No.: US20250094806A1
Publication Date: 2025-03-20
Application No.: US18967167
Filing Date: 2024-12-03
Inventors: Junyuan Shang, Yilong Chen, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu
IPC: G06N3/082, G06N3/0475
Abstract: Provided are a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular to the fields of deep learning, natural language processing and large models. The method includes: performing dimension-reduction parameter fusion on the two-dimensional parameter matrix of each channel in each network layer of a first large language model, respectively, to obtain a second large language model; performing layer-reduction parameter fusion on the network layers of the second large language model based on the three-dimensional parameter matrix of each network layer in the second large language model, to obtain a third large language model; and training the third large language model to obtain a target large language model under the condition that a target loss function determined based on the first and third large language models meets a preset first function condition.
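The sketch below illustrates the two fusion stages on toy parameter tensors. The SVD truncation used for dimension-reduction fusion and the adjacent-layer averaging used for layer-reduction fusion are assumptions chosen for clarity, not the patented procedures, and the training step against the target loss is omitted.

```python
# A minimal sketch of the two fusion stages; SVD truncation and
# adjacent-layer averaging are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)
LAYERS, CHANNELS, DIM, RANK = 6, 4, 8, 3

# "First large language model": per layer, per channel, a 2-D matrix.
model1 = rng.normal(size=(LAYERS, CHANNELS, DIM, DIM))

def reduce_dim(mat, rank):
    """Dimension-reduction parameter fusion for one channel's 2-D matrix:
    keep only its top-`rank` singular directions (assumed rule)."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Stage 1: second model, every channel's matrix reduced independently.
model2 = np.stack([
    np.stack([reduce_dim(model1[l, c], RANK) for c in range(CHANNELS)])
    for l in range(LAYERS)
])

# Stage 2: layer-reduction fusion on the 3-D per-layer parameter tensor
# (channels x dim x dim): merge each pair of adjacent layers by averaging.
model3 = np.stack([
    (model2[l] + model2[l + 1]) / 2.0 for l in range(0, LAYERS, 2)
])
print("layers: %d -> %d, rank kept per matrix: %d"
      % (LAYERS, model3.shape[0], RANK))
# A training loop on model3 against the target loss would follow here.
```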
-
Publication No.: US12223279B2
Publication Date: 2025-02-11
Application No.: US18054608
Filing Date: 2022-11-11
Inventors: Yaqian Han, Shuohuan Wang, Yu Sun
Abstract: A method for generating a cross-lingual textual semantic model includes: acquiring a set of training data that includes pieces of monolingual non-parallel text and pieces of bilingual parallel text; determining a semantic vector of each piece of text in the set of training data by inputting each piece of text into an initial textual semantic model; determining a distance between semantic vectors of each two pieces of text in the set of training data based on the semantic vector of each piece of text in the set of training data; determining a gradient modification based on a parallel relationship between each two pieces of text in the set of training data and the distance between the semantic vectors of each two pieces of text in the set of training data; and acquiring a modified textual semantic model by modifying the initial textual semantic model based on the gradient modification.
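Below is a minimal sketch of the training signal with a linear encoder: parallel pairs are pulled together and non-parallel pairs inside a margin are pushed apart, with the gradient modification derived by the chain rule. The linear encoder, the margin, and the update rule are illustrative assumptions, not the patent's actual model or loss.

```python
# A minimal sketch of gradient modification driven by the parallel
# relationship between text pairs; encoder and margin are assumptions.
import numpy as np

rng = np.random.default_rng(3)
DIM, MARGIN, LR = 8, 1.0, 0.1

W = rng.normal(size=(DIM, DIM))          # initial textual semantic model
texts = rng.normal(size=(4, DIM))        # features of 4 pieces of text
# parallel[i, j] = True if texts i and j are bilingual parallel text.
parallel = np.zeros((4, 4), dtype=bool)
parallel[0, 1] = parallel[1, 0] = True   # texts 0/1 form a parallel pair

for _ in range(100):
    vecs = texts @ W                     # semantic vector of each text
    grad = np.zeros_like(W)
    for i in range(4):
        for j in range(i + 1, 4):
            diff = vecs[i] - vecs[j]
            dist = np.linalg.norm(diff) + 1e-9
            if parallel[i, j]:           # pull parallel pair together
                g = diff / dist
            elif dist < MARGIN:          # push non-parallel pair apart
                g = -diff / dist
            else:
                continue
            # Gradient modification w.r.t. W via the chain rule.
            grad += np.outer(texts[i] - texts[j], g)
    W -= LR * grad                       # modify the semantic model
vecs = texts @ W
print("parallel dist:", round(float(np.linalg.norm(vecs[0] - vecs[1])), 3))
```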
-
Publication No.: US12131728B2
Publication Date: 2024-10-29
Application No.: US17828773
Filing Date: 2022-05-31
Inventors: Siyu Ding, Chao Pang, Shuohuan Wang, Yanbin Zhao, Junyuan Shang, Yu Sun, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang
CPC classification number: G10L15/063, G10L15/02, G10L15/18
Abstract: The present application provides a method of training a natural language processing model, which relates to the field of artificial intelligence, and in particular to the field of natural language processing. A specific implementation scheme includes: performing semantic learning for multiple tasks on an input text, so as to obtain a semantic feature for the multiple tasks, wherein the multiple tasks include a plurality of branch tasks; performing feature learning for each branch task based on the semantic feature, so as to obtain a first output result for each branch task; calculating a loss for each branch task according to the first output result for that branch task; and adjusting the parameters of the natural language processing model according to the loss for each branch task. The present application further provides a method of processing a natural language, an electronic device, and a storage medium.
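A minimal sketch of this scheme in PyTorch (the framework is an assumed choice; the application does not name one): a shared encoder performs the semantic learning, each branch head produces a first output result, and the summed branch losses drive the parameter update.

```python
# A minimal multi-task training sketch: shared encoder + per-branch heads.
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM, N_TASKS, N_CLASSES = 16, 3, 4

encoder = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU())  # semantic learning
heads = nn.ModuleList([nn.Linear(DIM, N_CLASSES) for _ in range(N_TASKS)])
params = list(encoder.parameters()) + list(heads.parameters())
optim = torch.optim.Adam(params, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, DIM)                          # stand-in for input text
labels = [torch.randint(0, N_CLASSES, (8,)) for _ in range(N_TASKS)]

for step in range(5):
    feature = encoder(x)                         # shared semantic feature
    # First output result and loss for each branch task.
    losses = [loss_fn(head(feature), y) for head, y in zip(heads, labels)]
    total = torch.stack(losses).sum()            # combine branch losses
    optim.zero_grad()
    total.backward()                             # adjust model parameters
    optim.step()
print([round(l.item(), 3) for l in losses])      # per-branch losses
```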
-
Publication No.: US20230040095A1
Publication Date: 2023-02-09
Application No.: US17889218
Filing Date: 2022-08-16
Inventors: Junyuan Shang, Shuohuan Wang, Siyu Ding, Yanbin Zhao, Chao Pang, Yu Sun
IPC: G06F40/40, G06F40/289
Abstract: A method and apparatus for pre-training a model, a device, a storage medium, and a program product are provided. An embodiment of the method includes: acquiring a sample natural language text; generating N types of prompt words based on the sample natural language text, where N is a positive integer; generating sample input data based on the sample natural language text and the N types of prompt words; and training an initial language model based on the sample input data to obtain a pre-trained language model.
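The sketch below shows one way sample input data could be assembled from N types of prompt words; the three prompt types and the concatenation recipe are invented examples, not the patent's actual prompt taxonomy.

```python
# A minimal sketch of building sample input data from N = 3 hypothetical
# types of prompt words; the types and templates are invented examples.
def generate_prompt_words(text: str) -> dict:
    """Return N = 3 assumed types of prompt words for one sample."""
    return {
        "task":    ["summarize:"],                  # task-style prompts
        "keyword": text.split()[:2],                # keyword-style prompts
        "format":  ["answer in one sentence:"],     # format-style prompts
    }

def build_sample_input(text: str) -> str:
    """Concatenate prompt words with the sample text (assumed recipe)."""
    prompts = generate_prompt_words(text)
    prefix = " ".join(w for words in prompts.values() for w in words)
    return f"{prefix} {text}"

sample = "Large models benefit from prompt based pre training"
print(build_sample_input(sample))
# The resulting strings would then be fed to the initial language model
# to train it into the pre-trained language model.
```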
-