-
Publication No.: US20250094806A1
Publication Date: 2025-03-20
Application No.: US18967167
Filing Date: 2024-12-03
Inventors: Junyuan Shang, Yilong Chen, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu
IPC: G06N3/082, G06N3/0475
Abstract: Provided are a large language model training method, an electronic device and a storage medium, relating to the field of artificial intelligence technologies, and in particular to the fields of deep learning, natural language processing and large models. The method includes: performing dimension reduction parameter fusion on the two-dimensional parameter matrix of each channel in each network layer of a first large language model, respectively, to obtain a second large language model; performing layer reduction parameter fusion on the network layers of the second large language model based on the three-dimensional parameter matrix of each network layer, to obtain a third large language model; and training the third large language model to obtain a target large language model, under the condition that a target loss function determined based on the first and third large language models meets a preset first function condition.
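For intuition, here is a minimal, hypothetical Python sketch of the two-stage fusion shape the abstract describes. The function names (fuse_dims, fuse_layers), the truncated-SVD factorization and the layer-averaging rule are all illustrative assumptions, not the patented procedure.

```python
# Illustrative sketch only: (1) per-layer dimension reduction of each
# 2-D weight matrix, here via truncated SVD; (2) layer reduction by
# fusing groups of consecutive layers, here via a simple mean.
import torch

def fuse_dims(weight: torch.Tensor, rank: int):
    """Dimension-reduction fusion of one 2-D parameter matrix,
    illustrated with a truncated-SVD factorization W ~ P @ Q."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    P = U[:, :rank] * S[:rank].sqrt()              # shape (m, rank)
    Q = S[:rank].sqrt().unsqueeze(1) * Vh[:rank]   # shape (rank, n)
    return P, Q

def fuse_layers(factored, group: int):
    """Layer-reduction fusion: collapse each run of `group`
    consecutive layers into one, illustrated with a mean."""
    fused = []
    for i in range(0, len(factored), group):
        Ps, Qs = zip(*factored[i:i + group])
        fused.append((torch.stack(Ps).mean(0), torch.stack(Qs).mean(0)))
    return fused

# Toy "first model": 8 layers, each holding one 64x64 parameter matrix.
first = [torch.randn(64, 64) for _ in range(8)]
second = [fuse_dims(w, rank=16) for w in first]   # second model (dim-reduced)
third = fuse_layers(second, group=2)              # third model: 4 fused layers
print(len(third), third[0][0].shape, third[0][1].shape)
# The patent then trains `third` against `first` until a target loss
# condition is met (not shown here).
```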
-
Publication No.: US12223279B2
Publication Date: 2025-02-11
Application No.: US18054608
Filing Date: 2022-11-11
Inventors: Yaqian Han, Shuohuan Wang, Yu Sun
Abstract: A method for generating a cross-lingual textual semantic model includes: acquiring a set of training data that includes pieces of monolingual non-parallel text and pieces of bilingual parallel text; determining a semantic vector for each piece of text in the set by inputting it into an initial textual semantic model; determining, based on these semantic vectors, the distance between the semantic vectors of each two pieces of text in the set; determining a gradient modification based on the parallel relationship between each two pieces of text and the distance between their semantic vectors; and acquiring a modified textual semantic model by modifying the initial textual semantic model based on the gradient modification.
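A minimal sketch of the kind of training signal the abstract describes: parallel (translation) pairs are pulled together in embedding space and non-parallel pairs pushed apart, and the resulting gradient plays the role of the "gradient modification". The tiny encoder, the margin and the exact loss form are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

class TinyEncoder(torch.nn.Module):
    """Stand-in for the textual semantic model (assumption)."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(vocab, dim)
    def forward(self, token_ids):
        return F.normalize(self.emb(token_ids), dim=-1)  # semantic vectors

def pairwise_loss(vecs, parallel, margin=0.5):
    """vecs: (N, d) semantic vectors; parallel[i][j] is True when
    texts i and j form a bilingual parallel pair."""
    dist = torch.cdist(vecs, vecs)                  # all pairwise distances
    loss = vecs.new_zeros(())
    n = vecs.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            if parallel[i][j]:
                loss = loss + dist[i, j]            # pull parallel pairs together
            else:
                loss = loss + F.relu(margin - dist[i, j])  # push others apart
    return loss

enc = TinyEncoder()
opt = torch.optim.SGD(enc.parameters(), lr=0.1)
batch = torch.randint(0, 1000, (4, 12))             # 4 texts, 12 tokens each
parallel = [[False] * 4 for _ in range(4)]
parallel[0][1] = parallel[1][0] = True              # texts 0 and 1 are translations
loss = pairwise_loss(enc(batch), parallel)
loss.backward()                                     # the "gradient modification"
opt.step()                                          # modify the semantic model
```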
-
Publication No.: US12131728B2
Publication Date: 2024-10-29
Application No.: US17828773
Filing Date: 2022-05-31
Inventors: Siyu Ding, Chao Pang, Shuohuan Wang, Yanbin Zhao, Junyuan Shang, Yu Sun, Shikun Feng, Hao Tian, Hua Wu, Haifeng Wang
CPC classification number: G10L15/063, G10L15/02, G10L15/18
Abstract: The present application provides a method of training a natural language processing model, which relates to the field of artificial intelligence, and in particular to the field of natural language processing. A specific implementation scheme includes: performing semantic learning for multi-tasks on an input text, so as to obtain a semantic feature for the multi-tasks, wherein the multi-tasks include a plurality of branch tasks; performing feature learning for each branch task based on the semantic feature, so as to obtain a first output result for each branch task; calculating a loss for each branch task according to the first output result for that branch task; and adjusting a parameter of the natural language processing model according to the loss for each branch task. The present application further provides a method of processing a natural language, an electronic device, and a storage medium.
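A minimal sketch of the shared-backbone, multi-branch structure the abstract describes. The backbone, the two example branch heads (classification and regression) and the per-branch losses are illustrative assumptions, not the patented model.

```python
import torch

class MultiTaskModel(torch.nn.Module):
    def __init__(self, vocab=1000, dim=64, n_cls=3):
        super().__init__()
        self.backbone = torch.nn.EmbeddingBag(vocab, dim)  # shared semantic learning
        self.cls_head = torch.nn.Linear(dim, n_cls)        # branch task 1: classification
        self.reg_head = torch.nn.Linear(dim, 1)            # branch task 2: regression
    def forward(self, token_ids):
        feat = self.backbone(token_ids)                    # semantic feature for all tasks
        return self.cls_head(feat), self.reg_head(feat)    # first output results

model = MultiTaskModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randint(0, 1000, (8, 16))                        # 8 input texts
y_cls = torch.randint(0, 3, (8,))
y_reg = torch.randn(8, 1)

logits, preds = model(x)
loss_cls = torch.nn.functional.cross_entropy(logits, y_cls)  # loss for branch task 1
loss_reg = torch.nn.functional.mse_loss(preds, y_reg)        # loss for branch task 2
(loss_cls + loss_reg).backward()          # adjust parameters from every branch loss
opt.step()
```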
-
Publication No.: US20230222344A1
Publication Date: 2023-07-13
Application No.: US18118859
Filing Date: 2023-03-08
Inventors: Yekun Chai, Shuohuan Wang, Yu Sun
IPC: G06N3/082
CPC classification number: G06N3/082
Abstract: A method for determining a prompt vector of a pre-trained model includes: obtaining a first one of the prompt vectors and a first vector corresponding to sample data; obtaining N pruned models by applying N different pruning processes to the pre-trained model, where N is any integer greater than 1; obtaining a first score corresponding to the first one of the prompt vectors by fusing the first vector with the first one of the prompt vectors and inputting the fused result into each of the N pruned models; determining a second one of the prompt vectors by modifying the first one of the prompt vectors based on the first score; and, based on the second one of the prompt vectors, returning to the step of obtaining the first score, until a target prompt vector corresponding to the sample data is determined.
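A hypothetical sketch of the loop the abstract describes: a prompt vector is fused (here simply added) to the sample vector, scored by N randomly pruned copies of a frozen model, and updated from that score until it converges. The pruning-by-masking, the fusion rule and the gradient-based update are all illustrative assumptions.

```python
import torch

torch.manual_seed(0)
dim, N = 32, 4
frozen = torch.nn.Linear(dim, 1)                   # stand-in pre-trained model
for p in frozen.parameters():
    p.requires_grad_(False)

# N pruned "models": random binary masks over the frozen model's input.
masks = [(torch.rand(dim) > 0.3).float() for _ in range(N)]

sample_vec = torch.randn(dim)                      # first vector for the sample data
prompt = torch.zeros(dim, requires_grad=True)      # first one of the prompt vectors
opt = torch.optim.SGD([prompt], lr=0.1)

for step in range(100):
    fused = sample_vec + prompt                    # fuse sample and prompt vectors
    # First score: average output of the N pruned models.
    score = torch.stack([frozen(fused * m) for m in masks]).mean()
    loss = (score - 1.0) ** 2                      # drive the score toward a target
    opt.zero_grad()
    loss.backward()
    opt.step()                                     # second (modified) prompt vector
    if loss.item() < 1e-4:                         # stop: target prompt vector found
        break
print(step, loss.item())
```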