-
Publication No.: US20230274144A1
Publication date: 2023-08-31
Application No.: US18192211
Application date: 2023-03-29
Applicant: HUAWEI TECHNOLOGIES CO., LTD.
Inventor: Xiaozhe REN, Yichun YIN, Xin JIANG
Abstract: This application relates to the field of artificial intelligence, and provides a model training method. The method includes: obtaining a to-be-trained first neural network model, where the first neural network model includes a first operator, and the first operator is used to perform a product operation on input data and a target weight matrix; replacing the first operator in the first neural network model with a second operator, to obtain a second neural network model, where the second operator is used to perform a product operation on input data and a plurality of sub-weight matrices, and the plurality of sub-weight matrices are obtained by performing matrix factorization on the target weight matrix; and performing model training on the second neural network model to obtain a target neural network model.
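
The operator replacement described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the helper names (`matmul`, `first_operator`, `second_operator`), the choice of two sub-weight matrices, and the toy values are not from the application, and the factorization is taken as already computed rather than derived by any particular method. The training step is omitted.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def first_operator(x, weight):
    """Original operator: input data times the full target weight matrix."""
    return matmul(x, weight)


def second_operator(x, sub_weights):
    """Replacement operator: input data times each sub-weight matrix in turn."""
    out = x
    for w in sub_weights:
        out = matmul(out, w)
    return out


# Hypothetical sub-weight matrices: 4x2 and 2x4 (inner dimension 2).
w1 = [[1, 0], [0, 1], [1, 1], [2, -1]]
w2 = [[1, 2, 0, 1], [0, 1, 3, -1]]

# Toy target weight matrix, built so that target == w1 @ w2 exactly.
# In practice the sub-weight matrices would come from factorizing the target.
target = matmul(w1, w2)

x = [[1, 2, 3, 4]]
y_full = first_operator(x, target)
y_fact = second_operator(x, [w1, w2])
print(y_full, y_fact)
```

With an exact factorization, both operators produce the same output for the same input; the second neural network model would then be trained with the sub-weight matrices as its parameters.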
-
Publication No.: US20220147715A1
Publication date: 2022-05-12
Application No.: US17526832
Application date: 2021-11-15
Applicant: Huawei Technologies Co., Ltd., TSINGHUA UNIVERSITY
Inventor: Yasheng WANG, Xin JIANG, Xiao CHEN, Qun LIU, Zhengyan ZHANG, Fanchao QI, Zhiyuan LIU
IPC: G06F40/295
Abstract: This application relates to the field of artificial intelligence, and provides a text processing method, a model training method, and an apparatus. The method includes: obtaining target knowledge data; processing the target knowledge data to obtain a target knowledge vector; processing to-be-processed text to obtain a target text vector; fusing the target text vector and the target knowledge vector based on a target fusion model, to obtain a fused target text vector and a fused target knowledge vector; and processing the fused target text vector and/or the fused target knowledge vector based on a target processing model, to obtain a processing result corresponding to a target task. The foregoing technical solution can improve accuracy of a result of processing a target task by the target processing model.
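
The data flow in this abstract (two vectors in, two fused vectors out, then a task result) can be sketched as follows. The abstract does not specify the fusion model or the processing model, so both functions below are hypothetical stand-ins: `fuse` is a simple weighted mix with an assumed parameter `alpha`, and `process` is a plain sum. They show only the shape of the pipeline, not the claimed method.

```python
def fuse(text_vec, knowledge_vec, alpha=0.25):
    """Stand-in fusion model: each output mixes in a share of the other input."""
    fused_text = [(1 - alpha) * t + alpha * k
                  for t, k in zip(text_vec, knowledge_vec)]
    fused_knowledge = [(1 - alpha) * k + alpha * t
                       for t, k in zip(text_vec, knowledge_vec)]
    return fused_text, fused_knowledge


def process(fused_text, fused_knowledge):
    """Stand-in processing model: reduces both fused vectors to one score."""
    return sum(fused_text) + sum(fused_knowledge)


# Toy vectors standing in for the target text vector and target knowledge vector.
text_vec = [1.0, 2.0]
knowledge_vec = [3.0, 4.0]

fused_text, fused_knowledge = fuse(text_vec, knowledge_vec)
result = process(fused_text, fused_knowledge)
print(fused_text, fused_knowledge, result)
```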
-