Patent search ap:("BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.") AND inv:"Zhongjun He"

1.

Granted invention patent
Text translation method, device, and storage medium (Active)

Publication number: US11314946B2

Publication date: 2022-04-26

Application number: US16701382

Application date: 2019-12-03

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Xiong, Zhongjun He, Zhi Li, Zhou Xin, Haifeng Wang

IPC: G06F40/30, G06F40/51, G06F40/58, G06N3/04, G06N3/08

Abstract: Embodiments of the present disclosure disclose a text translation method, a text translation apparatus, a device and a storage medium. The method includes: obtaining a source language text; and translating the source language text with a modified translation model to obtain a target language text corresponding to the source language text, the modified translation model being obtained by modifying an original translation model based on a text evaluation result of one or more translated texts for training, the translated text for training being an output result after translating through the original translation model, and the text evaluation result for evaluating a contextual semantic relation in the translated text for training.

2.

Granted invention patent
Language conversion method and apparatus based on syntactic linearity, and non-transitory computer-readable storage medium (Active)

Publication number: US11409968B2

Publication date: 2022-08-09

Application number: US16926197

Application date: 2020-07-10

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang

IPC: G06F40/55, G06F40/211, G06F40/58

Abstract: Embodiments of the present disclosure provide a language conversion method and apparatus based on syntactic linearity and a non-transitory computer-readable storage medium. The method includes: encoding a source sentence to be converted by using a preset encoder to determine a first vector and a second vector corresponding to the source sentence; determining a current mask vector according to a preset rule, in which the mask vector is configured to modify vectors output by the preset encoder; determining a third vector according to target language characters corresponding to source characters located before a first source character; and decoding the first vector, the second vector, the mask vector, and the third vector by using a preset decoder to generate a target character corresponding to the first source character.

3.

Invention patent application
METHOD AND APPARATUS FOR TRAINING MODELS IN MACHINE TRANSLATION, ELECTRONIC DEVICE AND STORAGE MEDIUM (Active)

Publication number: US20210390266A1

Publication date: 2021-12-16

Application number: US17200551

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu

IPC: G06F40/51, G06F40/49, G06K9/62, G06F40/30, G06F40/44

Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relate to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set. With the above-mentioned technical solution of the present application, the two models are trained jointly: while the semantic similarity model is trained, the machine translation model may be optimized and may in turn nurture the semantic similarity model, thus further improving the accuracy of the semantic similarity model.

4.

Granted invention patent
Method and apparatus for training models in machine translation, electronic device and storage medium (Active)

Publication number: US11704498B2

Publication date: 2023-07-18

Application number: US17200551

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu

IPC: G06F40/30, G06F40/51, G06F40/44, G06F40/49, G06F18/214

CPC classification number: G06F40/30, G06F18/214, G06F40/44, G06F40/49, G06F40/51

Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relate to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.

5.

Granted invention patent
Translation processing method, translation processing device, and device (Active)

Publication number: US11328133B2

Publication date: 2022-05-10

Application number: US16585269

Application date: 2019-09-27

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu, Zhi Li, Zhou Xin, Tian Wu, Haifeng Wang

IPC: G06F40/58, G06N20/00, G10L15/22, G10L25/24, G10L13/00, G10L15/26

Abstract: The present disclosure provides a translation processing method, a translation processing device, and a device. A first speech signal in a first language is obtained, and a speech feature vector of the first speech signal is extracted based on a preset algorithm. The speech feature vector is then input into a pre-trained end-to-end translation model that converts first-language speech into second-language text, and text information in the second language corresponding to the first speech signal is obtained. Speech synthesis is then performed on the text information in the second language, and the corresponding second speech signal is obtained and played.

6.

Granted invention patent
Method and apparatus for translating speech (Active)

Publication number: US11132518B2

Publication date: 2021-09-28

Application number: US16691111

Application date: 2019-11-21

Applicant: Beijing Baidu Netcom Science And Technology Co., LTD.

Inventor: Chuanqiang Zhang, Tianchi Bi, Hao Xiong, Zhi Li, Zhongjun He, Haifeng Wang

IPC: G06F40/58, G06N3/04, G06N3/08, G10L15/06, G10L15/16, G10L15/22, G10L15/30

Abstract: A method and apparatus for translating speech are provided. The method may include: recognizing received to-be-recognized speech of a source language to obtain a recognized text; concatenating the obtained recognized text after a to-be-translated text, to form a concatenated to-be-translated text; inputting the concatenated to-be-translated text into a pre-trained discriminant model to obtain a discrimination result for characterizing whether the concatenated to-be-translated text is to be translated, where the discriminant model is used to characterize a corresponding relationship between a text and a discrimination result corresponding to the text; in response to the positive discrimination result being obtained, translating the concatenated to-be-translated text to obtain a translation result of a target language, and outputting the translation result.
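
The gating loop described in this abstract can be sketched roughly as follows. This is only an illustration, not the patented implementation: the function names are hypothetical, the discriminant model is replaced by a trivial punctuation check, the translator is a stub, and clearing the buffer after each translation is an assumption the abstract does not state.

```python
from typing import Callable, List


def stream_translate(
    recognized_segments: List[str],
    should_translate: Callable[[str], bool],
    translate: Callable[[str], str],
) -> List[str]:
    """Buffer recognized text and translate only when the discriminator says yes."""
    pending = ""  # the "to-be-translated text" carried between segments
    outputs: List[str] = []
    for segment in recognized_segments:
        # Concatenate the newly recognized text after the pending text.
        pending = f"{pending} {segment}".strip()
        if should_translate(pending):
            outputs.append(translate(pending))
            pending = ""  # assumption: the buffer is cleared after translating
    return outputs


# Hypothetical stand-in for the pre-trained discriminant model.
def ends_with_sentence_punctuation(text: str) -> bool:
    return text.endswith((".", "?", "!"))


# Hypothetical stand-in for the translation model.
def fake_translate(text: str) -> str:
    return f"<translated: {text}>"
```

Calling `stream_translate(["hello", "world.", "how are", "you?"], ends_with_sentence_punctuation, fake_translate)` holds back the incomplete fragments and emits one translation per completed sentence.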

7.

Granted invention patent
Method and apparatus for translating polysemy, and medium (Active)

Publication number: US11275904B2

Publication date: 2022-03-15

Application number: US16868426

Application date: 2020-05-06

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang

IPC: G06F17/00, G06F40/40

Abstract: Embodiments of the present disclosure provide a method and an apparatus for translating a polysemy, and a medium. The method includes: obtaining a source language text; identifying and obtaining the polysemy from the source language text; inquiring related words corresponding to each interpretation of the polysemy; determining a target interpretation corresponding to the related words contained in the source language text; and translating the polysemy into the target interpretation.
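
As a rough illustration of the lookup this abstract describes, the sketch below picks the interpretation whose related words overlap most with the source text. The sense inventory `SENSES`, the function name, and the overlap-count scoring are all assumptions made for the example; the abstract does not say how related words are stored or compared.

```python
from typing import Dict, List, Optional

# Hypothetical sense inventory: polysemous word -> {interpretation: related words}.
SENSES: Dict[str, Dict[str, List[str]]] = {
    "bank": {
        "financial institution": ["money", "loan", "account"],
        "river edge": ["river", "water", "fishing"],
    }
}


def pick_interpretation(
    text: str, senses: Dict[str, Dict[str, List[str]]]
) -> Dict[str, Optional[str]]:
    """Map each polysemous word found in the text to its best-supported interpretation."""
    tokens = set(text.lower().split())
    chosen: Dict[str, Optional[str]] = {}
    # Identify the polysemous words that occur in the source text.
    for word in tokens.intersection(senses):
        best, best_hits = None, 0
        for interpretation, related in senses[word].items():
            # Count related words of this interpretation present in the text.
            hits = len(tokens.intersection(related))
            if hits > best_hits:
                best, best_hits = interpretation, hits
        chosen[word] = best  # None when no related word appears at all
    return chosen
```

For example, `pick_interpretation("he opened an account at the bank", SENSES)` selects the financial sense because "account" is among its related words.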

8.

Granted invention patent
Method and apparatus for translating based on artificial intelligence (Active)

Publication number: US10467349B2

Publication date: 2019-11-05

Application number: US15832013

Application date: 2017-12-05

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Zhongjun He, Hongyu Liu, Shiqi Zhao, Hua Wu

IPC: G06F17/00, G06F3/00, G06F17/28, G06N3/02, G06N3/04, G06N3/08

Abstract: The present disclosure provides a method and an apparatus for translating based on artificial intelligence. With the method, a text to be translated from a source language to a target language is acquired, in which the text includes a target language term and a source language term. Candidate terms for translating the source language term and confidences of the candidate terms are determined. The candidate terms are used to replace the corresponding source language term, and each candidate term is combined with the target language term, so as to obtain each candidate translation. A language probability of forming a smooth text when the candidate term is used in the candidate translation is predicted. Then a target term is chosen to be recommended according to the language probabilities of the candidate translations and the confidences of the candidate terms.
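
The selection step in this abstract could look something like the sketch below. It is an assumption-laden illustration: the abstract does not say how the language probability and the term confidence are combined, so a simple product is used here, and `toy_fluency` with its bigram table is a hypothetical stand-in for a real language model.

```python
from typing import Callable, Dict, Tuple


def recommend_term(
    text: str,
    source_term: str,
    candidates: Dict[str, float],
    fluency: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the candidate term with the best combined score, and that score."""
    best_term, best_score = "", float("-inf")
    for term, confidence in candidates.items():
        # Replace the source language term to form a candidate translation.
        candidate_translation = text.replace(source_term, term)
        # Assumption: the two signals are combined by multiplication.
        score = confidence * fluency(candidate_translation)
        if score > best_score:
            best_term, best_score = term, score
    return best_term, best_score


# Hypothetical fluency model: a tiny table of plausible bigrams.
KNOWN_BIGRAMS = {("strong", "tea"), ("heavy", "rain")}


def toy_fluency(sentence: str) -> float:
    words = sentence.lower().split()
    bigrams = list(zip(words, words[1:]))
    hits = sum(1 for bigram in bigrams if bigram in KNOWN_BIGRAMS)
    # Add-one smoothing keeps the score above zero for unseen text.
    return (hits + 1) / (len(bigrams) + 1)
```

With the mixed-language input `"I drink 浓 tea"` and candidates `{"strong": 0.6, "thick": 0.7}`, the sketch recommends "strong": its confidence is lower, but the resulting sentence scores higher under the toy fluency model.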

9.

Granted invention patent
Method and device for expanding data of bilingual corpus, and storage medium (Active)

Publication number: US09953024B2

Publication date: 2018-04-24

Application number: US14892933

Application date: 2014-09-04

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xiaoning Zhu, Zhongjun He, Hua Wu, Haifeng Wang

IPC: G06F17/21, G06F17/27, G06F17/28, G06F17/20, G10L21/00, G06F17/30

CPC classification number: G06F17/2735, G06F17/2827, G06F17/2845, G06F17/3043, G06F17/30489, G06F17/30654, G06F17/30669

Abstract: Disclosed are a method and a device for expanding data of a bilingual corpus. The method for expanding data of a bilingual corpus includes: searching, in a source language-pivot language corpus, for at least one first pivot language phrase semantically matching a first source language phrase; searching, in the source language-pivot language corpus, for at least one second source language phrase semantically matching each of the first pivot language phrases to form a source language phrase set by the second source language phrases; searching, in a pivot language-target language corpus, for at least one first target language phrase semantically matching each of the first pivot language phrases to form a target language phrase set by the first target language phrases; combining the second source language phrases in the source language phrase set with the first target language phrases in the target language phrase set, so as to form at least one phrase pair in which a source language phrase and a target language phrase semantically match; and storing the formed at least one phrase pair in which the source language phrase and the target language phrase semantically match into a source language-target language corpus. Data in a bilingual corpus is expanded, so that the problem of data sparseness in the bilingual corpus is solved.
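
The pivot-based expansion steps in this abstract can be sketched as set operations over phrase pairs. This is a minimal illustration under stated assumptions: "semantic matching" is reduced to exact membership in an aligned phrase-pair set, and the function and variable names are hypothetical.

```python
from typing import Set, Tuple

PhrasePairs = Set[Tuple[str, str]]


def expand_corpus(
    first_source_phrase: str,
    source_pivot: PhrasePairs,
    pivot_target: PhrasePairs,
) -> PhrasePairs:
    """Derive new source-target phrase pairs by going through a pivot language."""
    # Step 1: pivot phrases matching the first source phrase.
    pivots = {p for s, p in source_pivot if s == first_source_phrase}
    # Step 2: source phrases matching any of those pivot phrases.
    source_set = {s for s, p in source_pivot if p in pivots}
    # Step 3: target phrases matching any of those pivot phrases.
    target_set = {t for p, t in pivot_target if p in pivots}
    # Step 4: combine the two sets into source-target phrase pairs.
    return {(s, t) for s in source_set for t in target_set}


# Toy corpora: Chinese as source, English as pivot, French as target.
source_pivot = {("你好", "hello"), ("您好", "hello"), ("再见", "goodbye")}
pivot_target = {("hello", "bonjour"), ("hello", "salut"), ("goodbye", "au revoir")}

# Step 5: store the new pairs in the source-target corpus.
source_target: PhrasePairs = set()
source_target.update(expand_corpus("你好", source_pivot, pivot_target))
```

Starting from the single phrase "你好", the toy corpus gains four pairs, including ones for "您好", which was reached only through the shared pivot phrase "hello".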

10.

Invention patent application
METHOD AND DEVICE FOR EXPANDING DATA OF BILINGUAL CORPUS, AND STORAGE MEDIUM (Active)

Publication number: US20160239481A1

Publication date: 2016-08-18

Application number: US14892933

Application date: 2014-09-04

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xiaoning Zhu, Zhongjun He, Hua Wu, Haifeng Wang

IPC: G06F17/27, G06F17/30, G06F17/28

CPC classification number: G06F17/2735, G06F17/2827, G06F17/2845, G06F17/3043, G06F17/30489, G06F17/30654, G06F17/30669

Abstract: Disclosed are a method and a device for expanding data of a bilingual corpus. The method for expanding data of a bilingual corpus includes: searching, in a source language-pivot language corpus, for at least one first pivot language phrase semantically matching a first source language phrase; searching, in the source language-pivot language corpus, for at least one second source language phrase semantically matching each of the first pivot language phrases to form a source language phrase set by the second source language phrases; searching, in a pivot language-target language corpus, for at least one first target language phrase semantically matching each of the first pivot language phrases to form a target language phrase set by the first target language phrases; combining the second source language phrases in the source language phrase set with the first target language phrases in the target language phrase set, so as to form at least one phrase pair in which a source language phrase and a target language phrase semantically match; and storing the formed at least one phrase pair in which the source language phrase and the target language phrase semantically match into a source language-target language corpus. Data in a bilingual corpus is expanded, so that the problem of data sparseness in the bilingual corpus is solved.