-
Publication No.: US11704498B2
Publication date: 2023-07-18
Application No.: US17200551
Application date: 2021-03-12
Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
IPC classification: G06F40/30, G06F40/51, G06F40/44, G06F40/49, G06F18/214
CPC classification: G06F40/30, G06F18/214, G06F40/44, G06F40/49, G06F40/51
Abstract: A method and apparatus for training models in machine translation, an electronic device, and a storage medium are disclosed, relating to the fields of natural language processing and deep learning. An implementation includes: mining similar target sentences of a group of samples from a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample for each sample in the group from the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set.
-
Publication No.: US11328133B2
Publication date: 2022-05-10
Application No.: US16585269
Application date: 2019-09-27
Inventors: Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu, Zhi Li, Zhou Xin, Tian Wu, Haifeng Wang
Abstract: The present disclosure provides a translation processing method, a translation processing apparatus, and a device. A first speech signal in a first language is obtained, and a speech feature vector of the first speech signal is extracted using a preset algorithm. The speech feature vector is then input into a pre-trained end-to-end translation model that converts first-language speech into second-language text, yielding the second-language text information corresponding to the first speech signal. Finally, speech synthesis is performed on the second-language text information, and the resulting second speech signal is played.
-
Publication No.: US11132518B2
Publication date: 2021-09-28
Application No.: US16691111
Application date: 2019-11-21
Inventors: Chuanqiang Zhang, Tianchi Bi, Hao Xiong, Zhi Li, Zhongjun He, Haifeng Wang
Abstract: A method and apparatus for translating speech are provided. The method may include: recognizing received speech in a source language to obtain a recognized text; appending the recognized text to a to-be-translated text to form a concatenated to-be-translated text; inputting the concatenated to-be-translated text into a pre-trained discriminant model to obtain a discrimination result indicating whether the concatenated to-be-translated text is to be translated, where the discriminant model characterizes the correspondence between a text and its discrimination result; and, in response to a positive discrimination result, translating the concatenated to-be-translated text to obtain a translation result in a target language, and outputting the translation result.
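
The buffering loop described in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: `stream_translate`, the punctuation-based `ends_sentence` check standing in for the pre-trained discriminant model, and the upper-casing `fake_translate` standing in for a translation system. Clearing the buffer after each translation is also an assumption; the abstract does not say what happens to the text afterwards.

```python
from typing import Callable, Iterable, List, Tuple


def stream_translate(
    recognized_chunks: Iterable[str],
    should_translate: Callable[[str], bool],
    translate: Callable[[str], str],
) -> Tuple[List[str], str]:
    """Append each recognized chunk to the pending text and translate it
    only when the discriminator returns a positive result."""
    pending = ""              # the "to-be-translated text"
    outputs: List[str] = []
    for chunk in recognized_chunks:
        pending += chunk      # concatenate the recognized text after it
        if should_translate(pending):
            outputs.append(translate(pending))
            pending = ""      # assumption: start a fresh buffer
    return outputs, pending


# Toy stand-in for the discriminant model: positive once the text ends a sentence.
def ends_sentence(text: str) -> bool:
    return text.rstrip().endswith((".", "?", "!"))


# Toy stand-in for the translation step.
def fake_translate(text: str) -> str:
    return text.upper()


chunks = ["hello ", "world.", " how are", " you?", " and"]
translated, leftover = stream_translate(chunks, ends_sentence, fake_translate)
```

Passing the discriminator and translator in as callables keeps the gating logic separate from whichever models are actually used.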
-
Publication No.: US11366973B2
Publication date: 2022-06-21
Application No.: US16691104
Application date: 2019-11-21
Inventors: Jingwei Wang, Ao Zhang, Jiaxiang Liu, Yu Sun, Zhi Li
IPC classification: G06F40/35, G06F40/186, G06F40/289
Abstract: Embodiments of the present disclosure disclose a method and apparatus for determining a topic. A specific embodiment of the method comprises: determining a to-be-recognized sentence sequence; calculating the similarity between the to-be-recognized sentence sequence and each topic template in a topic template set for a target area, where each topic template corresponds to one of at least one topic in the target area, a topic template includes a topic section sequence, and a topic section includes a topic sentence sequence; and determining a topic of the to-be-recognized sentence sequence according to an associated parameter that includes those similarities. This embodiment reduces labor costs in topic segmentation.
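
As a rough illustration of the template-matching step, the sketch below scores a sentence sequence against each topic template and picks the highest score. The token-overlap (Jaccard) similarity, the function names, and the sample templates are all assumptions made for the example; the abstract does not specify how similarity is computed, and the associated parameter may include more than the similarities.

```python
from typing import Dict, List, Set, Tuple

# A template is a sequence of sections; each section is a sequence of sentences.
Template = List[List[str]]


def tokens(sentences: List[str]) -> Set[str]:
    """Lower-cased whitespace tokens of all sentences."""
    return {word.lower() for s in sentences for word in s.split()}


def similarity(sentences: List[str], template: Template) -> float:
    """Jaccard overlap between the query tokens and the template tokens
    (an assumed similarity measure, chosen only for illustration)."""
    template_tokens = tokens([s for section in template for s in section])
    query_tokens = tokens(sentences)
    union = query_tokens | template_tokens
    return len(query_tokens & template_tokens) / len(union) if union else 0.0


def determine_topic(
    sentences: List[str], templates: Dict[str, Template]
) -> Tuple[str, Dict[str, float]]:
    """Return the topic whose template is most similar, plus all scores."""
    scores = {topic: similarity(sentences, tpl) for topic, tpl in templates.items()}
    return max(scores, key=scores.get), scores


templates = {
    "billing": [["how much does it cost", "send the invoice"], ["payment failed"]],
    "shipping": [["where is my package"], ["delivery date changed"]],
}
topic, scores = determine_topic(
    ["my package is late", "where is the delivery"], templates
)
```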
-
Publication No.: US11275904B2
Publication date: 2022-03-15
Application No.: US16868426
Application date: 2020-05-06
Inventors: Ruiqing Zhang, Chuanqiang Zhang, Hao Xiong, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang
Abstract: Embodiments of the present disclosure provide a method and an apparatus for translating a polysemous word, and a medium. The method includes: obtaining a source language text; identifying the polysemous word in the source language text; querying the related words corresponding to each interpretation of the polysemous word; determining the target interpretation whose related words are contained in the source language text; and translating the polysemous word into the target interpretation.
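
The steps in this abstract can be sketched as a dictionary lookup. The `SENSES` table, the function name, and the rule of picking the interpretation with the most related words present in the text are assumptions for illustration; the abstract only says the target interpretation is the one whose related words appear in the source text.

```python
from typing import Dict, Set

# Hypothetical sense inventory: polysemous word -> {target interpretation: related words}
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "bank": {
        "banque": {"money", "loan", "account"},
        "rive": {"river", "water", "shore"},
    },
}


def translate_polysemous(
    text: str, senses: Dict[str, Dict[str, Set[str]]]
) -> Dict[str, str]:
    """Map each polysemous word found in the text to the interpretation
    whose related words occur in the same text."""
    words = text.lower().split()
    context = set(words)
    chosen: Dict[str, str] = {}
    for word in words:
        if word not in senses:
            continue  # not a known polysemous word
        # Pick the interpretation sharing the most related words with the text.
        interpretation, related = max(
            senses[word].items(), key=lambda item: len(item[1] & context)
        )
        if related & context:  # leave the word unresolved if nothing matches
            chosen[word] = interpretation
    return chosen


finance = translate_polysemous("she opened an account at the bank", SENSES)
nature = translate_polysemous("they sat on the river bank", SENSES)
unresolved = translate_polysemous("the bank", SENSES)
```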
-
Publication No.: US20210390266A1
Publication date: 2021-12-16
Application No.: US17200551
Application date: 2021-03-12
Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
Abstract: A method and apparatus for training models in machine translation, an electronic device, and a storage medium are disclosed, relating to the fields of natural language processing and deep learning. An implementation includes: mining similar target sentences of a group of samples from a parallel corpus using a machine translation model and a semantic similarity model, and creating a first training sample set; training the machine translation model with the first training sample set; mining a negative sample for each sample in the group from the parallel corpus using the machine translation model and the semantic similarity model, and creating a second training sample set; and training the semantic similarity model with the second training sample set. With this technical solution, the two models are trained jointly: while the semantic similarity model is trained, the machine translation model may also be optimized and in turn feed back into the semantic similarity model, further improving the accuracy of the semantic similarity model.
-
Publication No.: US20200210522A1
Publication date: 2020-07-02
Application No.: US16691104
Application date: 2019-11-21
Inventors: Jingwei Wang, Ao Zhang, Jiaxiang Liu, Yu Sun, Zhi Li
Abstract: Embodiments of the present disclosure disclose a method and apparatus for determining a topic. A specific embodiment of the method comprises: determining a to-be-recognized sentence sequence; calculating the similarity between the to-be-recognized sentence sequence and each topic template in a topic template set for a target area, where each topic template corresponds to one of at least one topic in the target area, a topic template includes a topic section sequence, and a topic section includes a topic sentence sequence; and determining a topic of the to-be-recognized sentence sequence according to an associated parameter that includes those similarities. This embodiment reduces labor costs in topic segmentation.
-
Publication No.: US11314946B2
Publication date: 2022-04-26
Application No.: US16701382
Application date: 2019-12-03
Inventors: Hao Xiong, Zhongjun He, Zhi Li, Zhou Xin, Haifeng Wang
Abstract: Embodiments of the present disclosure disclose a text translation method, a text translation apparatus, a device and a storage medium. The method includes: obtaining a source language text; and translating the source language text with a modified translation model to obtain a target language text corresponding to the source language text. The modified translation model is obtained by modifying an original translation model based on a text evaluation result for one or more training translated texts, where each training translated text is output by the original translation model, and the text evaluation result evaluates the contextual semantic relation within the training translated text.
-
Publication No.: US11423222B2
Publication date: 2022-08-23
Application No.: US17243097
Application date: 2021-04-28
Inventors: Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Zhi Li, Hua Wu
IPC classification: G06F40/232, G06N20/00, G06F40/279, G06F40/166
Abstract: A method for text error correction includes: obtaining a text to be corrected; obtaining a pinyin sequence of the text to be corrected; and inputting the text to be corrected and the pinyin sequence to a text error correction model, to obtain a corrected text.
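
To show why the pinyin sequence is a useful second input, here is a toy sketch in which a lookup of known words by their pronunciation stands in for the text error correction model. The `PINYIN` table, the `VOCAB` list, and both functions are hypothetical and far simpler than a trained model; the sketch only demonstrates the data flow of text plus pinyin in, corrected text out.

```python
from typing import Dict, List, Tuple

# Hypothetical, hand-written pinyin table covering only the demo characters.
PINYIN: Dict[str, str] = {"天": "tian", "气": "qi", "汽": "qi", "很": "hen", "好": "hao"}
# Hypothetical list of known-correct words.
VOCAB: List[str] = ["天气"]


def to_pinyin(text: str) -> List[str]:
    """Pinyin sequence of the text; unknown characters are passed through."""
    return [PINYIN.get(ch, ch) for ch in text]


def correct(text: str, pinyin: List[str], vocab: List[str]) -> str:
    """Toy stand-in for the correction model: wherever the pinyin matches a
    known word, write that word's characters over the input characters."""
    by_sound: Dict[Tuple[str, ...], str] = {tuple(to_pinyin(w)): w for w in vocab}
    chars = list(text)
    i = 0
    while i < len(chars):
        for word_pinyin, word in by_sound.items():
            n = len(word_pinyin)
            if tuple(pinyin[i:i + n]) == word_pinyin:
                chars[i:i + n] = word   # same length, so indices stay aligned
                i += n - 1
                break
        i += 1
    return "".join(chars)


noisy = "天汽很好"   # homophone typo: 汽 (qi) written instead of 气 (qi)
fixed = correct(noisy, to_pinyin(noisy), VOCAB)
```

Because the wrong and right characters share the pronunciation `qi`, the pinyin sequence still identifies the intended word even though the written characters do not.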
-
Publication No.: US11182648B2
Publication date: 2021-11-23
Application No.: US16901940
Application date: 2020-06-15
Inventors: Hao Xiong, Zhongjun He, Zhi Li, Hua Wu, Haifeng Wang
IPC classification: G06K9/62, G06F40/117, G06N20/00
Abstract: The present disclosure provides an end-to-end model training method and apparatus, relating to the field of artificial intelligence. The method includes: obtaining training data containing a plurality of training samples, each including an original sequence, a target sequence and a corresponding tag list, where the tag list includes importance tags in the target sequence and avoidance tags corresponding to the importance tags, the avoidance tags being irrelevant tags corresponding to the importance tags; and training a preset end-to-end model with the training data until the value of a preset optimization target function is smaller than a preset threshold.