-
Publication number: US20230140997A1
Publication date: 2023-05-11
Application number: US18089392
Application date: 2022-12-27
Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/58
CPC classification number: G06F40/58
Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, together with an electronic device, a computer readable storage medium, and a computer program product, are provided. The method includes: acquiring a first corpus; translating the first corpus by using a to-be-optimized translation model to acquire a second corpus in a different language; translating the second corpus by using the to-be-optimized translation model to acquire a third corpus; determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus; and, in response to the difficulty level satisfying the requirements of a difficulty level threshold, determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model.
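The round-trip selection described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `forward` and `backward` callables are hypothetical stand-ins for the to-be-optimized model, the similarity measure (`difflib.SequenceMatcher`) is swapped in for illustration, and treating low similarity as high difficulty with a `min_difficulty` cutoff is an assumption, since the abstract does not fix the metric or the direction of the threshold.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List


def select_hard_samples(
    corpus: Iterable[str],
    forward: Callable[[str], str],   # hypothetical: source -> other language
    backward: Callable[[str], str],  # hypothetical: other language -> source
    min_difficulty: float = 0.5,     # assumed threshold, not from the abstract
) -> List[str]:
    """Keep sentences whose round-trip translation drifts far from the original."""
    selected = []
    for first in corpus:
        second = forward(first)    # second corpus (different language)
        third = backward(second)   # third corpus (translated back)
        similarity = SequenceMatcher(None, first, third).ratio()
        difficulty = 1.0 - similarity  # assumption: less similar means harder
        if difficulty >= min_difficulty:
            selected.append(first)
    return selected


# Toy dictionary "model" standing in for a real translation system.
to_fr = {"hello world": "bonjour monde", "good morning": "bonjour"}
to_en = {"bonjour monde": "hello world", "bonjour": "hello"}
hard = select_hard_samples(to_fr, to_fr.__getitem__, to_en.__getitem__)
```

In this toy run, "hello world" survives the round trip unchanged and is skipped, while "good morning" comes back as "hello" and is kept as a training sample.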
-
Publication number: US20230196026A1
Publication date: 2023-06-22
Application number: US18109813
Application date: 2023-02-14
Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/30, G06F40/289
CPC classification number: G06F40/30, G06F40/289
Abstract: A method for evaluating text content, which may include: splitting a to-be-evaluated text into a plurality of clauses arranged in sequence according to punctuation information of the to-be-evaluated text, and determining a first clause of the plurality of clauses as an actual tune name; then, in response to the number of clauses from the third clause to the last clause whose Chinese character counts satisfy the character count requirements of the clauses corresponding to the actual tune name exceeding a number threshold, determining actual prosodic information based on a Chinese phonetic alphabet text of the third clause to the last clause; and finally, in response to the actual prosodic information being consistent with standard prosodic information of the actual tune name, evaluating the to-be-evaluated text as a Ci-poetry text.
-
Publication number: US20230153548A1
Publication date: 2023-05-18
Application number: US17885152
Application date: 2022-08-10
Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/58
CPC classification number: G06F40/58
Abstract: A translation method, an electronic device and a storage medium, which relate to the field of artificial intelligence technologies, such as machine learning and information processing technologies, are disclosed. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
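One way to picture the co-occurrence step is a simple vote across models at a single decoding iteration. The Python sketch below is an assumption-laden illustration, not the disclosed method: it supposes that each model's intermediate result is a list of candidate words, and it defines the co-occurrence word as the word proposed by the most models, with ties broken by first appearance. The function name `co_occurrence_word` is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def co_occurrence_word(intermediate_results: Sequence[List[str]]) -> str:
    """Return the candidate word shared by the largest number of models."""
    counts: Counter = Counter()
    for words in intermediate_results:
        counts.update(set(words))  # count each word once per model
    if not counts:
        raise ValueError("no candidate words supplied")
    best = max(counts.values())
    # Tie-break deterministically: first word, in input order, with the top count.
    for words in intermediate_results:
        for word in words:
            if counts[word] == best:
                return word
    raise AssertionError("unreachable: counts is non-empty")


# Three hypothetical models propose candidates for the same iteration.
step_candidates = [["cat", "kitty"], ["feline", "cat"], ["cat", "dog"]]
chosen = co_occurrence_word(step_candidates)
```

A full decoder would append the chosen word to the partial translation and repeat the vote at the next iteration until an end-of-sentence condition is met.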
-
Publication number: US20230076471A1
Publication date: 2023-03-09
Application number: US17982965
Application date: 2022-11-08
Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
Abstract: A training method, a text translation method, an electronic device, and a storage medium, which relate to a field of artificial intelligence, in particular to fields of natural language processing and deep learning technologies. A specific implementation solution includes: performing a feature extraction on source sample text data to obtain a sample feature vector sequence; obtaining a target sample feature vector according to the sample feature vector sequence; performing an autoregressive decoding and a non-autoregressive decoding on the sample feature vector sequence, respectively; performing a length prediction on the target sample feature vector; training a predetermined model by using translation sample data, the autoregressive text translation result, the non-autoregressive text translation result, a true length value of the source sample text, the first predicted length value, a true length value of the translation sample text, and the second predicted length value to obtain the text translation model.
-
Publication number: US20230101401A1
Publication date: 2023-03-30
Application number: US18056197
Application date: 2022-11-16
Inventor: Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/232, G06F40/279, G06F40/53
Abstract: A text processing method is provided. The method includes: determining a first probability value of each candidate character of a plurality of candidate characters corresponding to a target position based on character feature information corresponding to the target position in a text fragment to be processed, wherein the character feature information is determined based on a context at the target position in the text fragment to be processed; determining a second probability value of each candidate character of the plurality of candidate characters based on a character string including the candidate character and at least one character in at least one position in the text fragment to be processed adjacent to the target position; and determining a correction character at the target position based on the first probability value and the second probability value of each candidate character of the plurality of candidate characters.
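The abstract does not say how the two probability values are combined, so the following Python sketch assumes one common choice: a weighted sum of log-probabilities. The function name `pick_correction`, the `weight` parameter, and the `floor` guard against zero probabilities are all hypothetical.

```python
import math
from typing import Dict


def pick_correction(
    first_probs: Dict[str, float],   # context-based probability per candidate
    second_probs: Dict[str, float],  # string-based probability per candidate
    weight: float = 0.5,             # assumed interpolation weight
    floor: float = 1e-12,            # avoids log(0) for unseen candidates
) -> str:
    """Pick the candidate character with the best combined log-score."""

    def score(ch: str) -> float:
        p1 = max(first_probs[ch], floor)
        p2 = max(second_probs.get(ch, 0.0), floor)
        return weight * math.log(p1) + (1.0 - weight) * math.log(p2)

    return max(first_probs, key=score)


# Two homophone candidates for one target position (illustrative numbers).
context_scores = {"在": 0.6, "再": 0.4}
string_scores = {"在": 0.1, "再": 0.9}
corrected = pick_correction(context_scores, string_scores)
```

With equal weights the string-based evidence outweighs the context-based preference here; setting `weight=1.0` would ignore the second probability value entirely.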
-
Publication number: US20230095352A1
Publication date: 2023-03-30
Application number: US18074853
Application date: 2022-02-05
Inventor: Ruiqing ZHANG, Hui LIU, Zhongjun HE, Zhi LI, Hua WU
IPC: G06N3/0455, G06F40/44, G06F40/58, G06N3/08, G06N3/042
Abstract: The present disclosure provides a translation method and apparatus, an electronic device, and a non-transitory storage medium. An implementation includes: determining an encoded feature of a sentence to be translated by an encoding module; determining, by a graph network module, a knowledge fusion feature of the sentence to be translated based on a preset graph network, wherein the preset graph network is constructed based on a polysemous word in a source language corresponding to the sentence to be translated and a plurality of translated words corresponding to the polysemous word in a target language; and determining, by a decoding network, a translated sentence corresponding to the sentence to be translated based on the encoded feature and the knowledge fusion feature.
-
Publication number: US20230090625A1
Publication date: 2023-03-23
Application number: US18053034
Application date: 2022-11-07
Inventor: Ruiqing ZHANG, Zhongjun HE, Hua WU
IPC: G06F40/279, G06F40/166
Abstract: Disclosed are a method for correcting a text, an electronic device and a storage medium. The method includes: acquiring a text to be corrected; acquiring a phonetic symbol sequence of the text to be corrected; and obtaining a corrected text by inputting the text to be corrected and the phonetic symbol sequence into a text correction model, in which, the text correction model obtains the corrected text by: detecting an error word in the text to be corrected, determining a phonetic symbol corresponding to the error word in the phonetic symbol sequence, and adding the phonetic feature corresponding to the phonetic symbol behind the error word to obtain a phonetic symbol text, and correcting the error word and the phonetic feature in the phonetic symbol text to obtain the corrected text.
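The construction of the "phonetic symbol text" in this abstract can be sketched in a few lines of Python. This is only an illustration of the interleaving step: error detection is assumed to have been done already (the `error_positions` argument), the phonetic sequence is assumed to be aligned one-to-one with the tokens, and the function name `build_phonetic_text` is hypothetical.

```python
from typing import List, Set


def build_phonetic_text(
    tokens: List[str],
    phonetics: List[str],
    error_positions: Set[int],
) -> List[str]:
    """Insert each flagged token's phonetic symbol directly behind that token."""
    if len(tokens) != len(phonetics):
        raise ValueError("tokens and phonetics must be aligned one-to-one")
    phonetic_text: List[str] = []
    for index, token in enumerate(tokens):
        phonetic_text.append(token)
        if index in error_positions:
            phonetic_text.append(phonetics[index])
    return phonetic_text


# "门" is flagged as a suspected error, so its pinyin "men" is added behind it.
mixed = build_phonetic_text(["我", "门", "好"], ["wo", "men", "hao"], {1})
```

A correction model would then consume the mixed sequence and rewrite the flagged token together with its phonetic symbol into the intended word.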
-
Publication number: US20230088360A1
Publication date: 2023-03-23
Application number: US18059389
Application date: 2022-11-28
Inventor: Pengzhi GAO, Zhongjun HE, Zhi LI, Hua WU
IPC: G06F40/40, G06F40/166, G06N20/00
Abstract: A method of training a deep learning model is provided, which relates to a field of artificial intelligence, in particular to a field of a natural language processing technology and a field of a machine translation technology. A specific implementation solution includes: processing sample source data and corresponding sample target data respectively by using the deep learning model, so as to obtain a first output value and a second output value; determining a regularization function value according to the first output value and the second output value; and adjusting a parameter of the deep learning model according to the regularization function value, so as to obtain a pre-trained deep learning model. A method of processing text data, an electronic device, and a storage medium are further provided.
-
Publication number: US20230342561A1
Publication date: 2023-10-26
Application number: US18122316
Application date: 2023-03-16
Inventor: Ruiqing ZHANG, Hui LIU, Zhongjun HE, Zhi LI, Hua WU
Abstract: A machine translation method includes: obtaining first target language text by performing first translation on source language text using an initial NMT model; identifying an untranslated part in the source language text based on the source language text and the first target language text; obtaining an adjusted NMT model by increasing an attention weight corresponding to the untranslated part in the initial NMT model; and obtaining second target language text by performing second translation on the source language text using the adjusted NMT model.
-
Publication number: US20230153550A1
Publication date: 2023-05-18
Application number: US18096297
Application date: 2023-01-12
Inventor: Liwen ZHANG, Meng SUN, Zhi LI, Zhongjun HE
Abstract: A machine translation method can include: acquiring a to-be-translated source text; generating an intervention text corresponding to the to-be-translated source text by using intervention symbols, the intervention text including a term vocabulary part and an other text part; translating the intervention text to obtain a first translation result of the intervention text, where the first translation result includes a translation result of the other text part and the term vocabulary part; and generating a target translated text of the to-be-translated source text based on the first translation result and preset translated content of the term vocabulary part.
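The intervention flow in this abstract resembles placeholder-based terminology handling, which the Python sketch below illustrates under several assumptions: the intervention symbols are taken to be `<t>` and `</t>` tags, the translation engine is assumed to pass tagged spans through untouched, and the names `mark_terms` and `apply_glossary` are hypothetical. The abstract itself does not specify any of these details.

```python
import re
from typing import Dict


def mark_terms(source: str, glossary: Dict[str, str]) -> str:
    """Wrap every glossary term in the source with intervention symbols."""
    if not glossary:
        return source
    # Longest terms first so a longer term wins over a shorter one it contains.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = "|".join(re.escape(term) for term in terms)
    return re.sub(pattern, lambda m: f"<t>{m.group(0)}</t>", source)


def apply_glossary(first_translation: str, glossary: Dict[str, str]) -> str:
    """Replace each marked term with its preset translated content."""
    return re.sub(
        r"<t>(.*?)</t>",
        lambda m: glossary.get(m.group(1), m.group(1)),
        first_translation,
    )


glossary = {"BERT": "modèle BERT"}
intervention_text = mark_terms("the BERT is fast", glossary)
# Assumed output of a translation engine that leaves marked spans untouched:
first_result = "le <t>BERT</t> est rapide"
target_text = apply_glossary(first_result, glossary)
```

Keeping the term inside symbols lets the engine translate the surrounding text freely while the preset translation is substituted afterwards, which is the division of labor the abstract describes.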