-
Publication No.: US20230140997A1
Publication date: 2023-05-11
Application No.: US18089392
Application date: 2022-12-27
Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/58
CPC classification number: G06F40/58
Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, an electronic device, a computer readable storage medium, and a computer program product are provided. The method includes: after acquiring a first corpus, translating the first corpus by using a to-be-optimized translation model to acquire a second corpus in a different language; translating the second corpus by using the to-be-optimized translation model to acquire a third corpus; determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus; and, in response to the difficulty level satisfying a difficulty level threshold, determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model.
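The round-trip selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `forward` and `backward` callables stand in for the to-be-optimized translation model, the character-level `difflib.SequenceMatcher` ratio is an assumed similarity measure, and treating low similarity as high difficulty (with a hypothetical `max_similarity` cutoff) is one plausible reading of the difficulty level threshold.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List


def round_trip_similarity(text: str,
                          forward: Callable[[str], str],
                          backward: Callable[[str], str]) -> float:
    """Translate to the other language and back, then compare with the original."""
    second = forward(text)     # second corpus (other language)
    third = backward(second)   # third corpus (back in the source language)
    # Assumed similarity measure: character-level matching ratio in [0, 1].
    return SequenceMatcher(None, text, third).ratio()


def select_hard_samples(corpus: Iterable[str],
                        forward: Callable[[str], str],
                        backward: Callable[[str], str],
                        max_similarity: float = 0.8) -> List[str]:
    """Keep texts whose round trip drifts from the original (assumed: low similarity = hard)."""
    return [text for text in corpus
            if round_trip_similarity(text, forward, backward) <= max_similarity]


if __name__ == "__main__":
    # Toy stand-ins for a real model: the "back-translation" garbles one word.
    toy_forward = lambda s: s.upper()
    toy_backward = lambda s: s.lower().replace("bank", "shore")
    print(select_hard_samples(["hello world", "river bank"], toy_forward, toy_backward))
```

With these toy stand-ins, only the sentence that fails to survive the round trip is kept as a training sample.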
-
Publication No.: US20230196026A1
Publication date: 2023-06-22
Application No.: US18109813
Application date: 2023-02-14
Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/30, G06F40/289
CPC classification number: G06F40/30, G06F40/289
Abstract: A method for evaluating text content is provided. The method may include: splitting a to-be-evaluated text into a plurality of clauses arranged in sequence according to punctuation information of the to-be-evaluated text, and determining the first clause of the plurality of clauses as an actual tune name; then, in response to the number of clauses, among the third clause through the last clause, whose Chinese character counts satisfy the character count requirements of the clauses corresponding to the actual tune name exceeding a number threshold, determining actual prosodic information based on a Chinese phonetic alphabet text of the third clause through the last clause; and finally, in response to the actual prosodic information being consistent with the standard prosodic information of the actual tune name, evaluating the to-be-evaluated text as a Ci-poetry text.
-
Publication No.: US20230153548A1
Publication date: 2023-05-18
Application No.: US17885152
Application date: 2022-08-10
Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/58
CPC classification number: G06F40/58
Abstract: A translation method, an electronic device and a storage medium, which relate to the field of artificial intelligence technologies, such as machine learning technologies and information processing technologies, are disclosed. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
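The abstract does not say how the co-occurrence word is derived from the intermediate results. One possible reading, sketched below under that assumption, is that each model proposes a few candidate words in the same iteration and the word proposed by the most models is kept. The function name and the tie-breaking rule (first appearance wins) are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def co_occurrence_word(candidates_per_model: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the word proposed by the most models in one iteration.

    Ties are broken by first appearance (an assumption made for this sketch).
    """
    counts: Counter = Counter()
    for candidates in candidates_per_model:
        counts.update(set(candidates))  # each model votes at most once per word
    if not counts:
        return None
    best = max(counts.values())
    for candidates in candidates_per_model:
        for word in candidates:
            if counts[word] == best:
                return word
    return None


if __name__ == "__main__":
    # Three hypothetical models, each proposing two candidates for the next word.
    step = [["cat", "kitty"], ["feline", "cat"], ["cat", "dog"]]
    print(co_occurrence_word(step))
```

A full decoder would repeat this choice at every iteration and append the chosen word to the shared target translation.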
-
Publication No.: US20230076471A1
Publication date: 2023-03-09
Application No.: US17982965
Application date: 2022-11-08
Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
Abstract: A training method, a text translation method, an electronic device, and a storage medium are provided, which relate to the field of artificial intelligence, in particular to natural language processing and deep learning technologies. A specific implementation solution includes: performing a feature extraction on source sample text data to obtain a sample feature vector sequence; obtaining a target sample feature vector according to the sample feature vector sequence; performing an autoregressive decoding and a non-autoregressive decoding on the sample feature vector sequence, respectively, to obtain an autoregressive text translation result and a non-autoregressive text translation result; performing a length prediction on the target sample feature vector to obtain a first predicted length value and a second predicted length value; and training a predetermined model by using translation sample data, the autoregressive text translation result, the non-autoregressive text translation result, a true length value of the source sample text, the first predicted length value, a true length value of the translation sample text, and the second predicted length value to obtain the text translation model.
-
Publication No.: US20230153543A1
Publication date: 2023-05-18
Application No.: US17951216
Application date: 2022-09-23
Inventor: Ruiqing ZHANG, Xiyang WANG, Hui LIU, Zhongjun HE, Zhi LI, Hua WU
Abstract: A translation method, a model training method, apparatuses, electronic devices and storage media, which relate to the field of artificial intelligence technologies, such as machine learning technologies and information processing technologies, are disclosed. In an implementation, a weight for each translation model in at least two pre-trained translation models translating a to-be-translated specified sentence is acquired based on the specified sentence and a pre-trained weighting model; and the specified sentence is translated using the at least two translation models based on the weight for each translation model translating the specified sentence.
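As a rough illustration of weighting several translation models, the sketch below mixes per-model next-word distributions using per-sentence weights. It assumes the weights have already been produced by some weighting model. The abstract does not specify how the weights are applied, so the linear mixture and the name `combine_distributions` are assumptions made for this example.

```python
from typing import Dict, Sequence


def combine_distributions(distributions: Sequence[Dict[str, float]],
                          weights: Sequence[float]) -> Dict[str, float]:
    """Linearly mix per-model word distributions with normalized weights."""
    if len(distributions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined: Dict[str, float] = {}
    for dist, weight in zip(distributions, weights):
        for word, prob in dist.items():
            combined[word] = combined.get(word, 0.0) + (weight / total) * prob
    return combined


if __name__ == "__main__":
    # Two hypothetical models disagree; the weighting favors the second one.
    model_outputs = [{"cat": 0.6, "dog": 0.4}, {"cat": 0.2, "dog": 0.8}]
    mixed = combine_distributions(model_outputs, weights=[1.0, 3.0])
    print(max(mixed, key=mixed.get), mixed)
```

Because the weights depend on the input sentence, a model that handles a given sentence well can dominate the mixture for that sentence only.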
-
Publication No.: US20230051373A1
Publication date: 2023-02-16
Application No.: US17974317
Application date: 2022-10-26
Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/47
Abstract: A method for training a non-autoregressive translation (NAT) model includes: acquiring a source language text, a target language text corresponding to the source language text and a target length of the target language text; generating a target language prediction text and a prediction length by inputting the source language text into the NAT model, in which initialization parameters of the NAT model are determined based on parameters of a pre-trained translation model; and obtaining a target NAT model by training the NAT model based on the target language text, the target language prediction text, the target length and the prediction length.
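A training objective that uses both the text pair and the length pair might look like the sketch below. The abstract does not state the loss, so everything here is an assumption: a per-position negative log-likelihood for the predicted text, a squared error for the predicted length, and a hypothetical `length_weight` balancing the two. Initialization from a pre-trained translation model is not shown.

```python
import math
from typing import Dict, List


def token_nll(position_probs: List[Dict[str, float]], target: List[str]) -> float:
    """Average negative log-likelihood of the reference tokens.

    A NAT model predicts all positions in parallel, so each position has its
    own distribution and no position conditions on earlier outputs.
    """
    eps = 1e-12  # avoids log(0) when a reference token received no probability
    total = sum(math.log(max(probs.get(token, 0.0), eps))
                for probs, token in zip(position_probs, target))
    return -total / len(target)


def length_loss(predicted_length: float, target_length: int) -> float:
    """Squared error between predicted and true target length (assumed form)."""
    return (predicted_length - target_length) ** 2


def nat_training_loss(position_probs: List[Dict[str, float]],
                      target: List[str],
                      predicted_length: float,
                      length_weight: float = 0.1) -> float:
    """Text loss plus weighted length loss (the weighting is an assumption)."""
    return (token_nll(position_probs, target)
            + length_weight * length_loss(predicted_length, len(target)))


if __name__ == "__main__":
    probs = [{"a": 0.5, "b": 0.5}, {"a": 0.75, "b": 0.25}]
    print(nat_training_loss(probs, ["a", "b"], predicted_length=3.0))
```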