Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Linhao ZHANG"

1.

Invention application
TASK EXECUTION METHOD AND APPARATUS FOR LARGE MODEL, ELECTRONIC DEVICE, AND STORAGE MEDIUM (Legal status: Active)

Publication No.: US20250094534A1

Publication date: 2025-03-20

Application No.: US18968798

Filing date: 2024-12-04

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Linhao ZHANG, Yilong CHEN, Junyuan SHANG, Yinqi YANG, Shuohuan WANG, Yu SUN

IPC: G06F17/16

Abstract: A task execution method for a large model relates to fields of artificial intelligence, deep learning and large model technologies, and includes executing attention tasks in a task group to be fused using a target computing unit to obtain attention features, where each attention task corresponds to a weighted matrix to be fused, and the weighted matrix to be fused is obtained by weighting a matrix to be fused using a weight; obtaining a processing result according to the attention features; determining loss information according to the processing result; and, if the loss information converges, weighting and fusing the matrices to be fused using the target computing unit according to the weights for the task group to be fused, to obtain a fusion matrix for a target task group, where a target task in the target task group is executed by the target computing unit according to the fusion matrix.
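
The weighting and fusion steps named in this abstract can be illustrated with a minimal sketch. The abstract does not state how the matrices are combined, so the weighted sum below is an assumption, and the function names `weight_matrix` and `fuse` are hypothetical. The attention computation, loss, and convergence check are omitted.

```python
# Toy sketch only: the weighted-sum fusion rule and all names here are
# illustrative assumptions, not taken from the patent.

def weight_matrix(matrix, weight):
    """Scale a matrix to be fused by its scalar weight."""
    return [[weight * value for value in row] for row in matrix]


def fuse(matrices, weights):
    """Combine equally shaped matrices into one matrix by weighted sum."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(matrices, weights):
        weighted = weight_matrix(matrix, weight)
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weighted[r][c]
    return fused


# Two matrices from one task group and their (already trained) weights.
group = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
fusion_matrix = fuse(group, [0.5, 2.0])
```

Under this assumption, a single target task using `fusion_matrix` would replace the separate tasks of the group once training has converged.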

2.

Invention application
TRAINING METHOD FOR A DEEP LEARNING MODEL (Legal status: Active)

Publication No.: US20250061305A1

Publication date: 2025-02-20

Application No.: US18936686

Filing date: 2024-11-04

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Shuohuan WANG, Junyuan SHANG, Yinqi YANG, Guoxia WANG, Linhao ZHANG, Yu SUN, Hua WU, Haifeng WANG

IPC: G06N3/043, G06N3/045, G06N3/0985

Abstract: A training method, an inference method, a device, an apparatus, and a medium for a deep learning model are provided. A first model includes a plurality of first parameters, and a second model comprises a plurality of second parameters, which are initialized to the parameter values of a plurality of target parameters selected from the plurality of first parameters. The training method includes: determining a target loss for both the first model and the second model; and adjusting parameter values, including: in response to determining that the target loss indicates that the parameter values of at least part of the target parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding second parameters; and in response to determining that the target loss indicates that the parameter values of at least part of the second parameters need to be adjusted, synchronously adjusting the parameter values of the corresponding target parameters.
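
The synchronized adjustment described in this abstract resembles parameter tying between a larger and a smaller model. The sketch below is a toy illustration under stated assumptions: both models are linear, the target loss is the sum of their squared errors, and updates use plain gradient descent. None of these choices, nor the names `predict` and `train_step`, come from the patent.

```python
# Toy sketch only: linear models, squared-error loss, and gradient descent
# are illustrative assumptions, not taken from the patent.

def predict(weights, inputs):
    """Linear model: dot product of weights and inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def train_step(first, target_idx, x, y, lr=0.05):
    """One joint update of the first model and the tied second model."""
    # The second model's parameters mirror the selected target parameters.
    second = [first[i] for i in target_idx]
    x_small = [x[i] for i in target_idx]

    # Target loss covering both models.
    err1 = predict(first, x) - y
    err2 = predict(second, x_small) - y
    loss = err1 ** 2 + err2 ** 2

    # Gradient of the joint loss with respect to each first parameter.
    grad = [2 * err1 * xi for xi in x]
    for i in target_idx:
        grad[i] += 2 * err2 * x[i]  # tied parameters also get the second model's gradient

    first = [w - lr * g for w, g in zip(first, grad)]
    second = [first[i] for i in target_idx]  # keep the tied copies in sync
    return first, second, loss


first, target_idx = [0.0, 0.0, 0.0, 0.0], [0, 2]
losses = []
for _ in range(50):
    first, second, loss = train_step(first, target_idx, [1.0, 0.5, -1.0, 2.0], 1.0)
    losses.append(loss)
```

Because the second parameters are re-read from the first model after every step, any adjustment to a target parameter is reflected in the second model and vice versa, which is the behaviour the sketch is meant to show.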
