LARGE LANGUAGE MODEL TRAINING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM

    Publication number: US20250094806A1

    Publication date: 2025-03-20

    Application number: US18967167

    Filing date: 2024-12-03

Abstract: Provided is a large language model training method, an electronic device, and a storage medium, relating to the field of artificial intelligence technologies, and in particular to deep learning, natural language processing, and large models. The method includes: performing dimension reduction parameter fusion on the two-dimensional parameter matrix of each channel in each network layer of a first large language model to obtain a second large language model; performing layer reduction parameter fusion on the network layers of the second large language model, based on the three-dimensional parameter matrix of each of its network layers, to obtain a third large language model; and training the third large language model to obtain a target large language model, under the condition that a target loss function determined from the first and third large language models meets a preset first function condition.
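The abstract does not disclose the concrete fusion operators, so the following is only a minimal sketch of the two-stage compression pipeline it describes, under stated assumptions: per-channel "dimension reduction parameter fusion" is stood in by a truncated-SVD low-rank approximation of each 2-D weight matrix, and "layer reduction parameter fusion" is stood in by fusing adjacent layers via matrix product. All function names and the toy model shapes are hypothetical, not taken from the patent.

```python
import numpy as np

def dimension_reduce(weight: np.ndarray, rank: int) -> np.ndarray:
    """Hypothetical stand-in for 'dimension reduction parameter fusion':
    replace a 2-D parameter matrix with its rank-truncated SVD
    reconstruction, keeping the original shape."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

def layer_reduce(layers: list) -> list:
    """Hypothetical stand-in for 'layer reduction parameter fusion':
    fuse each adjacent pair of (square) layer matrices into one by
    composing their linear maps; an odd trailing layer is kept as-is."""
    fused = []
    for i in range(0, len(layers) - 1, 2):
        fused.append(layers[i + 1] @ layers[i])
    if len(layers) % 2:
        fused.append(layers[-1])
    return fused

# Toy pipeline mirroring the abstract's three models:
# first model -> per-channel dimension reduction -> second model
# second model -> layer reduction -> third model (then trained until a
# target loss, e.g. a distillation loss against the first model, falls
# below a preset threshold -- the training step is omitted here).
rng = np.random.default_rng(0)
first_model = [rng.standard_normal((8, 8)) for _ in range(4)]
second_model = [dimension_reduce(w, rank=4) for w in first_model]
third_model = layer_reduce(second_model)
```

The sketch keeps matrix shapes square so that layer fusion by matrix product is well defined; in a real transformer the fusion would have to account for non-linearities and residual connections, which the abstract does not specify.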
