Method and device for compressing a neural network model for machine translation and storage medium

    Publication No.: US11556761B2

    Publication Date: 2023-01-17

    Application No.: US16828277

    Filing Date: 2020-03-24

    Abstract: A method for compressing a neural network model includes: obtaining a first trained teacher model and a second trained teacher model based on N training samples, N being a positive integer greater than 1; for each of the N training samples, determining a first guide component of the first teacher model and a second guide component of the second teacher model, and determining, according to the first guide component and the second guide component, a sub-optimization target that corresponds to the training sample and is configured to optimize a student model; determining a joint optimization target based on the N training samples and their corresponding sub-optimization targets; and training the student model based on the joint optimization target.
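    The abstract does not specify how the guide components are computed; the following is a minimal sketch assuming each guide component is a soft-label (KL-divergence) term between the student's output distribution and the corresponding teacher's, and that the joint optimization target simply sums the per-sample sub-optimization targets. All function and parameter names (sub_optimization_target, joint_optimization_target, temperature) are illustrative, not taken from the patent.

        import torch
        import torch.nn.functional as F

        def sub_optimization_target(student_logits, teacher1_logits, teacher2_logits,
                                    temperature=2.0):
            # Hypothetical per-sample target: one guide component per teacher, modelled
            # here as KL divergence between temperature-softened distributions.
            log_s = F.log_softmax(student_logits / temperature, dim=-1)
            t1 = F.softmax(teacher1_logits / temperature, dim=-1)
            t2 = F.softmax(teacher2_logits / temperature, dim=-1)
            guide1 = F.kl_div(log_s, t1, reduction="batchmean")  # first guide component
            guide2 = F.kl_div(log_s, t2, reduction="batchmean")  # second guide component
            return guide1 + guide2

        def joint_optimization_target(student, teacher1, teacher2, samples):
            # Joint target assumed to be the sum of the N per-sample sub targets.
            total = 0.0
            for source, _target in samples:
                with torch.no_grad():
                    t1_logits = teacher1(source)
                    t2_logits = teacher2(source)
                s_logits = student(source)
                total = total + sub_optimization_target(s_logits, t1_logits, t2_logits)
            return total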

    Training method and device for machine translation model and storage medium

    Publication No.: US11507888B2

    Publication Date: 2022-11-22

    Application No.: US16840054

    Filing Date: 2020-04-03

    IPC Classification: G06N20/00 G06F40/58

    Abstract: A training method for a machine translation model includes: obtaining a multi-domain mixed training data set; performing data-domain classification on a plurality of training data pairs in the training data set to obtain at least two domain data subsets; for each domain data subset, determining at least two candidate optimization targets and training at least two corresponding candidate single-domain models based on the candidate optimization targets, respectively; testing the at least two candidate single-domain models of each domain data subset separately, and selecting the candidate optimization target with the highest test accuracy as the designated optimization target for the domain data subset; and training a hybrid-domain model based on each domain data subset in the training data set and its designated optimization target.
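    As a rough illustration, the sketch below mirrors the selection step described above: one candidate single-domain model is trained per candidate optimization target for each domain subset, and the target whose model scores highest on a test pass is kept as that domain's designated optimization target. The helper names (train_fn, test_fn, select_designated_targets) are hypothetical and not taken from the patent.

        def select_designated_targets(domain_subsets, candidate_targets, train_fn, test_fn):
            # domain_subsets: {domain_name: subset of training data pairs}
            # candidate_targets: candidate optimization targets (e.g. loss configurations)
            designated = {}
            for domain, subset in domain_subsets.items():
                scored = []
                for target in candidate_targets:
                    model = train_fn(subset, target)            # candidate single-domain model
                    scored.append((test_fn(model, subset), target))
                # keep the candidate optimization target with the highest test accuracy
                designated[domain] = max(scored, key=lambda pair: pair[0])[1]
            return designated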

    Method and device for information processing, and storage medium

    Publication No.: US11461561B2

    Publication Date: 2022-10-04

    Application No.: US16744768

    Filing Date: 2020-01-16

    IPC Classification: G06F40/58 G06N20/00 G06F40/51

    Abstract: A method for information processing includes: obtaining a bilingual vocabulary containing N original bilingual word pairs, N being a positive integer; obtaining an original bilingual training set containing multiple original bilingual training sentence pairs; selecting, from the original bilingual training set, at least one original bilingual training sentence pair that matches any of the original bilingual word pairs as a bilingual sentence pair candidate; constructing a generalized bilingual sentence pattern based on the at least one bilingual sentence pair candidate; and obtaining an augmented bilingual training set containing multiple augmented bilingual training sentence pairs based on the bilingual vocabulary and the generalized bilingual sentence pattern.
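    One plausible reading of "generalized bilingual sentence pattern" is a sentence pair in which a matched bilingual word pair is replaced by a shared slot token, which can then be refilled with other vocabulary pairs to produce augmented sentence pairs. The sketch below follows that assumption only; the slot token and helper names are illustrative, not from the patent.

        def build_pattern(src_sent, tgt_sent, src_word, tgt_word, slot="<X>"):
            # Generalize a candidate sentence pair by replacing the matched
            # bilingual word pair with a shared slot token.
            return src_sent.replace(src_word, slot), tgt_sent.replace(tgt_word, slot)

        def augment(bilingual_vocab, patterns, slot="<X>"):
            # Fill each generalized pattern with every bilingual word pair from the
            # vocabulary to obtain augmented bilingual training sentence pairs.
            augmented = []
            for src_pat, tgt_pat in patterns:
                for src_word, tgt_word in bilingual_vocab:
                    augmented.append((src_pat.replace(slot, src_word),
                                      tgt_pat.replace(slot, tgt_word)))
            return augmented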

    METHOD AND DEVICE FOR INFORMATION PROCESSING, AND STORAGE MEDIUM

    Publication No.: US20210124880A1

    Publication Date: 2021-04-29

    Application No.: US16744768

    Filing Date: 2020-01-16

    IPC Classification: G06F40/58 G06F40/51 G06N20/00

    Abstract: A method for information processing includes: obtaining a bilingual vocabulary containing N original bilingual word pairs, N being a positive integer; obtaining an original bilingual training set containing multiple original bilingual training sentence pairs; selecting, from the original bilingual training set, at least one original bilingual training sentence pair that matches any of the original bilingual word pairs as a bilingual sentence pair candidate; constructing a generalized bilingual sentence pattern based on the at least one bilingual sentence pair candidate; and obtaining an augmented bilingual training set containing multiple augmented bilingual training sentence pairs based on the bilingual vocabulary and the generalized bilingual sentence pattern.

    Neural network model compression method, corpus translation method and device

    Publication No.: US11556723B2

    Publication Date: 2023-01-17

    Application No.: US16785062

    Filing Date: 2020-02-07

    Abstract: A method for compressing a neural network model includes: obtaining a set of training samples including a plurality of pairs of training samples, each pair including source data and target data corresponding to the source data; training an original teacher model by using the source data as an input and the target data as verification data; training intermediate teacher models based on the set of training samples and the original teacher model, the one or more intermediate teacher models forming a set of teacher models; training multiple candidate student models based on the set of training samples, the original teacher model, and the set of teacher models, the multiple candidate student models forming a set of student models; and selecting one of the multiple candidate student models as a target student model according to training results of the multiple candidate student models.
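    The following sketch lays out the staged pipeline the abstract describes, with every training and evaluation routine passed in as a parameter, since the abstract does not fix them; the counts of intermediate teachers and candidate students are placeholders, and all names are illustrative rather than from the patent.

        def compress_model(samples, train_teacher, train_intermediate, train_student,
                           evaluate, num_intermediate=2, num_candidates=3):
            # 1. Train the original teacher on (source, target) pairs.
            teacher = train_teacher(samples)
            # 2. Train intermediate teacher models from the samples and the original teacher.
            teacher_set = [train_intermediate(samples, teacher)
                           for _ in range(num_intermediate)]
            # 3. Train multiple candidate student models from the samples, the original
            #    teacher, and the set of intermediate teachers.
            student_set = [train_student(samples, teacher, teacher_set)
                           for _ in range(num_candidates)]
            # 4. Select the best-scoring candidate as the target student model.
            scored = [(evaluate(student, samples), student) for student in student_set]
            return max(scored, key=lambda pair: pair[0])[1]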