Patent search: ap:("Beijing Baidu Netcom Science and Technology Co., Ltd.") AND inv:"Zhi Li"

1.

Granted invention patent
Method and apparatus for training models in machine translation, electronic device and storage medium [Active]

Publication number: US11704498B2

Publication date: 2023-07-18

Application number: US17200551

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu

IPC: G06F40/30, G06F40/51, G06F40/44, G06F40/49, G06F18/214

CPC classification number: G06F40/30, G06F18/214, G06F40/44, G06F40/49, G06F40/51

Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relate to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.

2.

Granted invention patent
Translation processing method, translation processing device, and device [Active]

Publication number: US11328133B2

Publication date: 2022-05-10

Application number: US16585269

Application date: 2019-09-27

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu, Zhi Li, Zhou Xin, Tian Wu, Haifeng Wang

IPC: G06F40/58, G06N20/00, G10L15/22, G10L25/24, G10L13/00, G10L15/26

Abstract: The present disclosure provides a translation processing method, a translation processing device, and a device. The first speech signal of the first language is obtained, and the speech feature vector of the first speech signal is extracted based on the preset algorithm. Further, the speech feature vector is input into the pre-trained end-to-end translation model for conversion from the first language speech to the second language text for processing, and the text information of the second language corresponding to the first speech signal is obtained. Moreover, speech synthesis is performed on the text information of the second language, and the corresponding second speech signal is obtained and played.

3.

Granted invention patent
Method and apparatus for translating speech [Active]

Publication number: US11132518B2

Publication date: 2021-09-28

Application number: US16691111

Application date: 2019-11-21

Applicant: Beijing Baidu Netcom Science And Technology Co., LTD.

Inventor: Chuanqiang Zhang, Tianchi Bi, Hao Xiong, Zhi Li, Zhongjun He, Haifeng Wang

IPC: G06F40/58, G06N3/04, G06N3/08, G10L15/06, G10L15/16, G10L15/22, G10L15/30

Abstract: A method and apparatus for translating speech are provided. The method may include: recognizing received to-be-recognized speech of a source language to obtain a recognized text; concatenating the obtained recognized text after a to-be-translated text, to form a concatenated to-be-translated text; inputting the concatenated to-be-translated text into a pre-trained discriminant model to obtain a discrimination result for characterizing whether the concatenated to-be-translated text is to be translated, where the discriminant model is used to characterize a corresponding relationship between a text and a discrimination result corresponding to the text; in response to the positive discrimination result being obtained, translating the concatenated to-be-translated text to obtain a translation result of a target language, and outputting the translation result.
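
The flow in this abstract (concatenate newly recognized text after the pending text, ask a discriminant model whether to translate, translate only on a positive result) can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `StreamingTranslator`, the two callables, and the choice to clear the pending text after a translation are all assumptions made for illustration.

```python
from typing import Callable, Optional


class StreamingTranslator:
    """Illustrative sketch only; names and buffering policy are assumptions."""

    def __init__(self,
                 should_translate: Callable[[str], bool],
                 translate: Callable[[str], str]) -> None:
        self.should_translate = should_translate  # stand-in for the discriminant model
        self.translate = translate                # stand-in for the translation model
        self.pending = ""                         # to-be-translated text so far

    def feed(self, recognized_text: str) -> Optional[str]:
        # Concatenate the recognized text after the to-be-translated text.
        self.pending = (self.pending + " " + recognized_text).strip()
        # Translate only when the discriminator gives a positive result.
        if self.should_translate(self.pending):
            result = self.translate(self.pending)
            self.pending = ""  # assumption: start a fresh segment afterwards
            return result
        return None


# Toy stand-ins: "translate" when the text ends a sentence; upper-casing
# plays the role of translation.
translator = StreamingTranslator(lambda text: text.endswith("."), str.upper)
first = translator.feed("hello")
second = translator.feed("world.")
```

Here `first` is `None` because the toy discriminator rejects the unfinished segment, and `second` holds the result for the whole concatenated segment.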

4.

Invention application
METHOD AND APPARATUS FOR TRAINING MODELS IN MACHINE TRANSLATION, ELECTRONIC DEVICE AND STORAGE MEDIUM [Active]

Publication number: US20210390266A1

Publication date: 2021-12-16

Application number: US17200551

Application date: 2021-03-12

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu

IPC: G06F40/51, G06F40/49, G06K9/62, G06F40/30, G06F40/44

Abstract: A method and apparatus for training models in machine translation, an electronic device and a storage medium are disclosed, which relate to the field of natural language processing technologies and the field of deep learning technologies. An implementation includes mining similar target sentences of a group of samples based on a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample of each sample in the group of samples based on the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set. With the above-mentioned technical solution of the present application, the two models are trained jointly: while the semantic similarity model is trained, the machine translation model may be optimized and in turn benefits the semantic similarity model, thus further improving the accuracy of the semantic similarity model.

5.

Invention application
METHOD AND APPARATUS FOR DETERMINING A TOPIC [Under examination, published]

Publication number: US20200210522A1

Publication date: 2020-07-02

Application number: US16691104

Application date: 2019-11-21

Applicant: Beijing Baidu Netcom Science And Technology Co., LTD.

Inventor: Jingwei Wang, Ao Zhang, Jiaxiang Liu, Yu Sun, Zhi Li

IPC: G06F17/27, G06F17/24

Abstract: Embodiments of the present disclosure disclose a method and apparatus for determining a topic. A specific embodiment of the method comprises: determining a to-be-recognized sentence sequence; calculating similarities between the to-be-recognized sentence sequence and each of the topic templates in a topic template set in a target area, each of the topic templates in the topic template set corresponding to a topic in at least one topic in the target area, the topic template including a topic section sequence, and a topic section including a topic sentence sequence; and determining a topic of the to-be-recognized sentence sequence according to an associated parameter, the associated parameter including the similarities between the to-be-recognized sentence sequence and each of the topic templates in the topic template set. This embodiment reduces labor costs during topic segmentation.

6.

Granted invention patent
Text translation method, device, and storage medium [Active]

Publication number: US11314946B2

Publication date: 2022-04-26

Application number: US16701382

Application date: 2019-12-03

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Xiong, Zhongjun He, Zhi Li, Zhou Xin, Haifeng Wang

IPC: G06F40/30, G06F40/51, G06F40/58, G06N3/04, G06N3/08

Abstract: Embodiments of the present disclosure disclose a text translation method, a text translation apparatus, a device and a storage medium. The method includes: obtaining a source language text; and translating the source language text with a modified translation model to obtain a target language text corresponding to the source language text, the modified translation model being obtained by modifying an original translation model based on a text evaluation result of one or more translated texts for training, the translated text for training being an output result after translating through the original translation model, and the text evaluation result for evaluating a contextual semantic relation in the translated text for training.

7.

Granted invention patent
Method and apparatus for determining a topic [Active]

Publication number: US11366973B2

Publication date: 2022-06-21

Application number: US16691104

Application date: 2019-11-21

Applicant: Beijing Baidu Netcom Science And Technology Co., LTD.

Inventor: Jingwei Wang, Ao Zhang, Jiaxiang Liu, Yu Sun, Zhi Li

IPC: G06F40/35, G06F40/186, G06F40/289

Abstract: Embodiments of the present disclosure disclose a method and apparatus for determining a topic. A specific embodiment of the method comprises: determining a to-be-recognized sentence sequence; calculating similarities between the to-be-recognized sentence sequence and each of the topic templates in a topic template set in a target area, each of the topic templates in the topic template set corresponding to a topic in at least one topic in the target area, the topic template including a topic section sequence, and a topic section including a topic sentence sequence; and determining a topic of the to-be-recognized sentence sequence according to an associated parameter, the associated parameter including the similarities between the to-be-recognized sentence sequence and each of the topic templates in the topic template set. This embodiment reduces labor costs during topic segmentation.
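
As a rough illustration of template-based topic determination, the Python sketch below scores a sentence sequence against each topic template and picks the best match. The abstract does not name a similarity measure, so Jaccard overlap of word sets is used here purely as an assumption; one template per topic, the function names, and the sample templates are likewise hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Set

# Assumed layout: topic -> template, where a template is a sequence of
# sections and each section is a sequence of sentences.
Template = List[List[str]]


def word_set(sentences: Iterable[str]) -> Set[str]:
    return {word for sentence in sentences for word in sentence.lower().split()}


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def determine_topic(sentences: List[str],
                    templates: Dict[str, Template]) -> Optional[str]:
    """Return the topic whose template is most similar to the input."""
    if not templates:
        return None
    query = word_set(sentences)
    scores = {
        topic: jaccard(query, word_set(s for section in sections for s in section))
        for topic, sections in templates.items()
    }
    return max(scores, key=scores.get)


# Hypothetical templates for a customer-service target area.
templates = {
    "billing": [["how do i pay my invoice"], ["refund my payment"]],
    "shipping": [["where is my package"], ["track my delivery"]],
}
topic = determine_topic(["i want a refund for my invoice"], templates)
```

In this toy data the input shares four words with the billing template and one with the shipping template, so `topic` is `"billing"`.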

8.

Granted invention patent
Method and apparatus for translating polysemy, and medium [Active]

Publication number: US11275904B2

Publication date: 2022-03-15

Application number: US16868426

Application date: 2020-05-06

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang

IPC: G06F17/00, G06F40/40

Abstract: Embodiments of the present disclosure provide a method and an apparatus for translating a polysemy, and a medium. The method includes: obtaining a source language text; identifying and obtaining the polysemy from the source language text; inquiring related words corresponding to each interpretation of the polysemy; determining a target interpretation corresponding to the related words contained in the source language text; and translating the polysemy into the target interpretation.
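
The steps in this abstract (look up related words for each interpretation of a polysemous word, then pick the interpretation whose related words appear in the source text) can be sketched in Python as follows. The sense table `SENSES`, the function name, and the tie-breaking by hit count are assumptions for illustration, and English glosses stand in for target-language interpretations.

```python
from typing import Dict, List, Optional

# Hypothetical sense inventory: word -> {interpretation: related words}.
SENSES: Dict[str, Dict[str, List[str]]] = {
    "bank": {
        "financial institution": ["money", "loan", "account"],
        "river edge": ["river", "water", "shore"],
    },
}


def choose_interpretation(word: str, source_text: str) -> Optional[str]:
    """Pick the interpretation with the most related words in the text."""
    tokens = set(source_text.lower().split())
    best, best_hits = None, 0
    for interpretation, related in SENSES.get(word, {}).items():
        hits = sum(1 for r in related if r in tokens)
        if hits > best_hits:  # assumption: more related words wins
            best, best_hits = interpretation, hits
    return best  # None when no related word occurs in the text


sense = choose_interpretation("bank", "she opened an account at the bank")
```

With the sample table, `sense` is `"financial institution"` because only "account" from that interpretation's related words occurs in the sentence.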

9.

Granted invention patent
Method and apparatus for text error correction, electronic device and storage medium [Active]

Publication number: US11423222B2

Publication date: 2022-08-23

Application number: US17243097

Application date: 2021-04-28

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu

IPC: G06F40/232, G06N20/00, G06F40/279, G06F40/166

Abstract: A method for text error correction includes: obtaining a text to be corrected; obtaining a pinyin sequence of the text to be corrected; and inputting the text to be corrected and the pinyin sequence to a text error correction model, to obtain a corrected text.

10.

Granted invention patent
End-to-end model training method and apparatus, and non-transitory computer-readable medium [Active]

Publication number: US11182648B2

Publication date: 2021-11-23

Application number: US16901940

Application date: 2020-06-15

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Xiong, Zhongjun He, Zhi Li, Hua Wu, Haifeng Wang

IPC: G06K9/62, G06F40/117, G06N20/00

Abstract: The present disclosure provides an end-to-end model training method and apparatus, which relates to a field of artificial intelligence technologies. The method includes: obtaining training data containing a plurality of training samples, in which the plurality of training samples include an original sequence, a target sequence and a corresponding tag list, the tag list includes importance tags in the target sequence and avoidance tags corresponding to the importance tags, and the avoidance tags are irrelevant tags corresponding to the importance tags; and adopting the training data to train a preset end-to-end model until a value of a preset optimization target function is smaller than a preset threshold.
