Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Xiyang WANG"

1.

Invention publication
Method and Apparatus for Selecting Sample Corpus Used to Optimize Translation Model (status: under examination, published)

Publication (announcement) number: US20230140997A1

Publication (announcement) date: 2023-05-11

Application number: US18089392

Application date: 2022-12-27

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/58

CPC classification number: G06F40/58

Abstract: A method and apparatus for selecting a sample corpus used to optimize a translation model, an electronic device, a computer readable storage medium, and a computer program product are provided. The method includes: after acquiring a first corpus, translating the first corpus by using a to-be-optimized translation model to acquire a second corpus in a language different from that of the first corpus, then translating the second corpus by using the to-be-optimized translation model to acquire a third corpus, then determining a difficulty level of the first corpus based on a similarity between the first corpus and the third corpus, and finally, in response to the difficulty level satisfying the requirements of a difficulty level threshold, determining the first corpus as a sample corpus used to perform optimization training on the to-be-optimized translation model.
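
The selection loop in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `forward` and `backward` callables stand in for the to-be-optimized translation model, the third corpus is assumed to be back in the language of the first, difficulty is assumed to be one minus a character-level similarity from Python's `difflib`, and the threshold value is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical type: one translation direction of the to-be-optimized model.
Translate = Callable[[str], str]


def difficulty(first: str, forward: Translate, backward: Translate) -> float:
    """Return 1 - similarity(first, third); higher means harder (assumed definition)."""
    second = forward(first)    # first corpus -> second corpus (other language)
    third = backward(second)   # second corpus -> third corpus
    similarity = SequenceMatcher(None, first, third).ratio()
    return 1.0 - similarity


def select_samples(corpus: List[str], forward: Translate, backward: Translate,
                   threshold: float = 0.3) -> List[str]:
    """Keep sentences whose difficulty meets the (assumed) threshold."""
    return [s for s in corpus if difficulty(s, forward, backward) >= threshold]
```

Sentences that survive the round trip unchanged get a difficulty of zero and are dropped; sentences the model garbles are kept as training samples.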

2.

Invention publication
Method for Evaluating Text Content, and Related Apparatus (status: under examination, published)

Publication (announcement) number: US20230196026A1

Publication (announcement) date: 2023-06-22

Application number: US18109813

Application date: 2023-02-14

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/30, G06F40/289

CPC classification number: G06F40/30, G06F40/289

Abstract: A method for evaluating a text content, which may include: splitting a to-be-evaluated text into a plurality of clauses arranged in sequence according to punctuation information of the to-be-evaluated text, and determining a first clause of the plurality of clauses as an actual tune name; then, in response to the number of clauses, from the third clause to the last clause, whose Chinese character counts satisfy the character count requirements of the clauses corresponding to the actual tune name exceeding a number threshold, determining actual prosodic information based on a Chinese phonetic alphabet text of the third clause to the last clause; and finally, in response to the actual prosodic information being consistent with standard prosodic information of the actual tune name, evaluating the to-be-evaluated text as a Ci-poetry text.
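
The first two stages of this abstract (clause splitting and the character count check) can be sketched as follows. This is an illustration only: the punctuation set, the function names, and the idea of passing the expected per-clause counts in as a list are assumptions, and the prosodic comparison that follows in the abstract is not covered.

```python
import re
from typing import List

# Assumed punctuation set; the abstract only says "punctuation information".
_PUNCT = re.compile(r"[，。！？；、：,.!?;:\s]+")
_HAN = re.compile(r"[\u4e00-\u9fff]")  # basic CJK Unified Ideographs block


def split_clauses(text: str) -> List[str]:
    """Split the text into non-empty clauses, in order."""
    return [c for c in _PUNCT.split(text) if c]


def tune_name(text: str) -> str:
    """The first clause is taken as the tune name."""
    return split_clauses(text)[0]


def passes_length_check(text: str, expected_counts: List[int],
                        min_matches: int) -> bool:
    """Count clauses, from the third onward, whose number of Chinese
    characters equals the expected count, and compare with the threshold."""
    body = split_clauses(text)[2:]
    matches = sum(1 for clause, expected in zip(body, expected_counts)
                  if len(_HAN.findall(clause)) == expected)
    return matches > min_matches
```

In a fuller version, `expected_counts` would be looked up from a table keyed by the tune name; that table is hypothetical and is left to the caller here.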

3.

Invention publication
TRANSLATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM (status: under examination, published)

Publication (announcement) number: US20230153548A1

Publication (announcement) date: 2023-05-18

Application number: US17885152

Application date: 2022-08-10

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Ruiqing ZHANG, Xiyang WANG, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/58

CPC classification number: G06F40/58

Abstract: A translation method, an electronic device and a storage medium, which relate to the field of artificial intelligence technologies, such as machine learning technologies and information processing technologies, are disclosed. An implementation includes: acquiring an intermediate translation result generated by each of multiple pre-trained translation models for a to-be-translated specified sentence in a same iteration of a translation process, so as to obtain multiple intermediate translation results; acquiring a co-occurrence word based on the multiple intermediate translation results; and acquiring a target translation result of the specified sentence based on the co-occurrence word.
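
One way to picture the co-occurrence step is sketched below. The abstract does not define the form of an intermediate translation result, so this sketch assumes each model contributes a ranked list of candidate words for the current iteration, and treats a word proposed by every model as the co-occurrence word. The function name and the tie-breaking rule (the first model's ranking) are hypothetical.

```python
from typing import List, Optional


def co_occurrence_word(candidates_per_model: List[List[str]]) -> Optional[str]:
    """Return a word proposed by every model in this iteration, or None.

    Ties are broken by the ranking of the first model (an assumption).
    """
    if not candidates_per_model:
        return None
    shared = set(candidates_per_model[0])
    for candidates in candidates_per_model[1:]:
        shared &= set(candidates)
    for word in candidates_per_model[0]:
        if word in shared:
            return word
    return None
```

A decoder built on this would append the returned word to the target translation and move on to the next iteration; what to do when no word is shared is left open here.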

4.

Invention application
TRAINING METHOD, TEXT TRANSLATION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM (status: granted)

Publication (announcement) number: US20230076471A1

Publication (announcement) date: 2023-03-09

Application number: US17982965

Application date: 2022-11-08

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/58, G06F40/51

Abstract: A training method, a text translation method, an electronic device, and a storage medium are provided, which relate to a field of artificial intelligence, in particular to fields of natural language processing and deep learning technologies. A specific implementation solution includes: performing a feature extraction on source sample text data to obtain a sample feature vector sequence; obtaining a target sample feature vector according to the sample feature vector sequence; performing an autoregressive decoding and a non-autoregressive decoding on the sample feature vector sequence, respectively; performing a length prediction on the target sample feature vector; training a predetermined model by using translation sample data, the autoregressive text translation result, the non-autoregressive text translation result, a true length value of the source sample text, the first predicted length value, a true length value of the translation sample text, and the second predicted length value to obtain the text translation model.

5.

Invention publication
TRANSLATION METHOD, MODEL TRAINING METHOD, ELECTRONIC DEVICES AND STORAGE MEDIUMS (status: under examination, published)

Publication (announcement) number: US20230153543A1

Publication (announcement) date: 2023-05-18

Application number: US17951216

Application date: 2022-09-23

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Ruiqing ZHANG, Xiyang WANG, Hui LIU, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/51, G06F40/42

CPC classification number: G06F40/51, G06F40/42

Abstract: A translation method, a model training method, apparatuses, electronic devices and storage mediums, which relate to the field of artificial intelligence technologies, such as machine learning technologies and information processing technologies, are disclosed. In an implementation, a weight for each translation model in at least two pre-trained translation models translating a to-be-translated specified sentence is acquired based on the specified sentence and a pre-trained weighting model; and the specified sentence is translated using the at least two translation models based on the weight for each translation model translating the specified sentence.
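
The weighted use of several models can be sketched as a weighted average of their next-word distributions. This is only one plausible reading: the abstract does not say how the weights are applied, the weights here are passed in directly rather than produced by a weighting model, and all names are hypothetical.

```python
from typing import Dict, List


def combine_predictions(distributions: List[Dict[str, float]],
                        weights: List[float]) -> Dict[str, float]:
    """Weighted average of per-model word probabilities (assumed scheme)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined: Dict[str, float] = {}
    for dist, weight in zip(distributions, weights):
        for word, prob in dist.items():
            combined[word] = combined.get(word, 0.0) + (weight / total) * prob
    return combined


def best_word(distributions: List[Dict[str, float]],
              weights: List[float]) -> str:
    """Pick the word with the highest combined probability."""
    combined = combine_predictions(distributions, weights)
    return max(combined, key=combined.get)
```

With per-sentence weights, a model that suits the sentence's domain can dominate the choice for that sentence without being favored everywhere.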

6.

Invention application
METHOD FOR TRAINING NON-AUTOREGRESSIVE TRANSLATION MODEL (status: granted)

Publication (announcement) number: US20230051373A1

Publication (announcement) date: 2023-02-16

Application number: US17974317

Application date: 2022-10-26

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Xiyang WANG, Ruiqing ZHANG, Zhongjun HE, Zhi LI, Hua WU

IPC: G06F40/47

Abstract: A method for training a non-autoregressive translation (NAT) model includes: acquiring a source language text, a target language text corresponding to the source language text and a target length of the target language text; generating a target language prediction text and a prediction length by inputting the source language text into the NAT model, in which initialization parameters of the NAT model are determined based on parameters of a pre-trained translation model; and obtaining a target NAT model by training the NAT model based on the target language text, the target language prediction text, the target length and the prediction length.
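
The training signal described here combines a text term and a length term. The sketch below shows one possible form of such an objective; the abstract does not specify the loss, so the mean negative log-likelihood, the squared-error length term, and the `length_weight` parameter are all assumptions. Initialization from the pre-trained translation model is not shown.

```python
import math
from typing import List


def token_nll(target_probs: List[float]) -> float:
    """Mean negative log-likelihood the model assigns to the reference tokens."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)


def nat_loss(target_probs: List[float], target_length: int,
             predicted_length: float, length_weight: float = 0.1) -> float:
    """Text loss plus a weighted squared length error (assumed combination)."""
    length_term = (predicted_length - target_length) ** 2
    return token_nll(target_probs) + length_weight * length_term
```

Here `target_probs` holds the probability the NAT model gives to each token of the target language text, and `predicted_length` is its length prediction for the same source sentence.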
