-
Publication No.: US11556761B2
Publication Date: 2023-01-17
Application No.: US16828277
Filing Date: 2020-03-24
Inventors: Xiang Li, Yuhui Sun, Jingwei Li, Jialiang Jiang
Abstract: A method for compressing a neural network model includes: obtaining a first trained teacher model and a second trained teacher model based on N training samples, N being a positive integer greater than 1; for each of the N training samples, determining a first guide component of the first teacher model and a second guide component of the second teacher model respectively, determining a sub-optimization target corresponding to the training sample and configured to optimize a student model according to the first guide component and the second guide component, and determining a joint optimization target based on each of the N training samples and the sub-optimization target corresponding to the training sample; and training the student model based on the joint optimization target.
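The dual-teacher distillation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the blending weight `alpha`, the softmax temperature, and all function names are hypothetical choices for the sketch.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution at a given temperature."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, student_probs):
    """Cross-entropy between a (soft) target distribution and the student's output."""
    return -sum(t * math.log(s) for t, s in zip(target_probs, student_probs))

def joint_optimization_target(teacher1_logits, teacher2_logits, student_logits,
                              alpha=0.5, temperature=2.0):
    """Per-sample sub-optimization targets from two teachers, summed into a joint target.

    Each sample's sub-target blends the guide components (soft labels) of the
    first and second teacher; the joint target is the sum over all N samples.
    """
    joint = 0.0
    for t1, t2, s in zip(teacher1_logits, teacher2_logits, student_logits):
        guide1 = softmax(t1, temperature)   # first guide component
        guide2 = softmax(t2, temperature)   # second guide component
        student = softmax(s, temperature)
        sub_target = (alpha * cross_entropy(guide1, student)
                      + (1 - alpha) * cross_entropy(guide2, student))
        joint += sub_target
    return joint
```

A student whose outputs match both teachers yields a lower joint target than one that disagrees, which is the signal the training step would minimize.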
-
Publication No.: US11507888B2
Publication Date: 2022-11-22
Application No.: US16840054
Filing Date: 2020-04-03
Inventors: Yuhui Sun, Xiang Li, Jingwei Li
Abstract: A training method for a machine translation model includes: obtaining a multi-domain mixed training data set; performing data domain classification on a plurality of training data pairs in the training data set to obtain at least two domain data subsets; based on each domain data subset, determining at least two candidate optimization targets for the domain data subset, and training at least two candidate single-domain models corresponding to each domain data subset based on the at least two candidate optimization targets, respectively; testing the at least two candidate single-domain models corresponding to each domain data subset separately, and selecting the candidate optimization target with the highest test accuracy as the designated optimization target for the domain data subset; and training a hybrid domain model based on each domain data subset in the training data set and the designated optimization target corresponding to each domain data subset.
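The per-domain selection step above amounts to a small search loop: train one candidate model per candidate target, test each, and keep the best. A minimal sketch, assuming hypothetical `train_fn`/`eval_fn` callbacks standing in for the actual training and testing procedures:

```python
def select_optimization_target(domain_subsets, candidate_targets, train_fn, eval_fn):
    """For each domain subset, train one candidate single-domain model per
    candidate optimization target, test each, and designate the target whose
    model achieves the highest test accuracy."""
    designated = {}
    for domain, subset in domain_subsets.items():
        best_target, best_acc = None, -1.0
        for target in candidate_targets:
            model = train_fn(subset, target)   # candidate single-domain model
            acc = eval_fn(model, subset)       # test accuracy on this domain
            if acc > best_acc:
                best_target, best_acc = target, acc
        designated[domain] = best_target
    return designated
```

The hybrid-domain model would then be trained over all subsets, each with its designated target.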
-
Publication No.: US11461561B2
Publication Date: 2022-10-04
Application No.: US16744768
Filing Date: 2020-01-16
Inventors: Xiang Li, Yuhui Sun, Xiaolin Wu, Jianwei Cui
Abstract: A method for information processing includes: obtaining a bilingual vocabulary containing N original bilingual word pairs, N being a positive integer; obtaining an original bilingual training set containing multiple original bilingual training sentence pairs; selecting at least one original bilingual training sentence pair matching any original bilingual word from the original bilingual training set as a bilingual sentence pair candidate; constructing a generalized bilingual sentence pattern based on at least one bilingual sentence pair candidate; and obtaining an augmented bilingual training set containing multiple augmented bilingual training sentence pairs, based on the bilingual vocabulary and the generalized bilingual sentence pattern.
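One way to read the generalization-and-augmentation steps: replace the matched word pair in a candidate sentence pair with a slot to form the generalized pattern, then fill that slot with every pair from the bilingual vocabulary. A minimal sketch with a hypothetical `<X>` slot token; the actual pattern construction in the patent may differ:

```python
def generalize(sentence_pair, word_pair, slot="<X>"):
    """Replace the matched bilingual word pair with a slot on both sides,
    yielding a generalized bilingual sentence pattern."""
    src, tgt = sentence_pair
    w_src, w_tgt = word_pair
    return src.replace(w_src, slot), tgt.replace(w_tgt, slot)

def augment(pattern, vocabulary, slot="<X>"):
    """Fill the generalized pattern with each bilingual word pair to produce
    augmented bilingual training sentence pairs."""
    p_src, p_tgt = pattern
    return [(p_src.replace(slot, w_src), p_tgt.replace(slot, w_tgt))
            for w_src, w_tgt in vocabulary]
```

From a single candidate pair, this yields one augmented sentence pair per vocabulary entry, growing the training set roughly N-fold per pattern.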
-
Publication No.: US20210124880A1
Publication Date: 2021-04-29
Application No.: US16744768
Filing Date: 2020-01-16
Inventors: Xiang Li, Yuhui Sun, Xiaolin Wu, Jianwei Cui
Abstract: A method for information processing includes: obtaining a bilingual vocabulary containing N original bilingual word pairs, N being a positive integer; obtaining an original bilingual training set containing multiple original bilingual training sentence pairs; selecting at least one original bilingual training sentence pair matching any original bilingual word from the original bilingual training set as a bilingual sentence pair candidate; constructing a generalized bilingual sentence pattern based on at least one bilingual sentence pair candidate; and obtaining an augmented bilingual training set containing multiple augmented bilingual training sentence pairs, based on the bilingual vocabulary and the generalized bilingual sentence pattern.
-
Publication No.: US11556723B2
Publication Date: 2023-01-17
Application No.: US16785062
Filing Date: 2020-02-07
Inventors: Xiang Li, Yuhui Sun, Jialiang Jiang, Jianwei Cui
Abstract: A method for compressing a neural network model includes: obtaining a set of training samples including a plurality of pairs of training samples, each pair of the training samples including source data and target data corresponding to the source data; training an original teacher model by using the source data as an input and using the target data as verification data; training intermediate teacher models based on the set of training samples and the original teacher model, one or more intermediate teacher models forming a set of teacher models; training multiple candidate student models based on the set of training samples, the original teacher model, and the set of teacher models, the multiple candidate student models forming a set of student models; and selecting a candidate student model of the multiple candidate student models as a target student model according to training results of the multiple candidate student models.
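The pipeline in this abstract (original teacher, then intermediate teachers, then candidate students, then selection) can be sketched as a short chain of callbacks. All function names here are hypothetical placeholders for the actual training, distillation, and evaluation procedures; this is an illustration of the control flow, not the patented method itself:

```python
def select_student(samples, train_teacher, train_intermediates, distill, evaluate):
    """Train the original teacher, derive intermediate teachers from it,
    distill one candidate student per teacher, and select the candidate
    with the best evaluation result as the target student model."""
    teacher = train_teacher(samples)                    # original teacher model
    assistants = train_intermediates(samples, teacher)  # set of teacher models
    # Each teacher (original or intermediate) guides one candidate student.
    candidates = [distill(samples, t) for t in [teacher] + assistants]
    # Selection according to the candidates' training/evaluation results.
    return max(candidates, key=lambda model: evaluate(model, samples))
```

The intermediate teachers act as stepping stones between the large original teacher and the small student, so each distillation step bridges a smaller capacity gap.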