- Patent Title: Method and apparatus for generating parallel text in same language
- Application No.: US15900166
- Application Date: 2018-02-20
- Publication No.: US10650102B2
- Publication Date: 2020-05-12
- Inventor: Pengkai Li, Jingzhou He, Zhihong Fu, Xianwei Xin
- Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Applicant Address: CN Beijing
- Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee Address: CN Beijing
- Agency: Seed IP Law Group LLP
- Main IPC: G06F17/27
- IPC: G06F17/27; G06F17/28; G06N3/08; G06F16/951; G06N3/04; G06F16/33

Abstract:
The present disclosure discloses a method and apparatus for generating a parallel text in the same language. The method comprises: acquiring a source segmented word sequence and a pre-trained word vector table; determining a source word vector sequence corresponding to the source segmented word sequence, according to the word vector table; importing the source word vector sequence into a first pre-trained recurrent neural network model, to generate an intermediate vector of a preset dimension for characterizing semantics of the source segmented word sequence; importing the intermediate vector into a second pre-trained recurrent neural network model, to generate a target word vector sequence corresponding to the intermediate vector; and determining a target segmented word sequence corresponding to the target word vector sequence according to the word vector table, and determining the target segmented word sequence as a parallel text in the same language corresponding to the source segmented word sequence.
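The data flow described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and is not taken from the patent: the toy word vector table, the function names, the plain tanh recurrent cells, the cosine nearest-neighbour lookup, and the fixed output length. The weights are random and untrained, so the output is not a meaningful paraphrase; the sketch only shows the sequence of steps (word sequence, word vectors, first recurrent network, intermediate vector of preset dimension, second recurrent network, target word vectors, target words).

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix; stands in for pre-trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode(vectors, w_in, w_hidden, hidden_dim):
    """First recurrent network: fold the source vectors into one intermediate vector."""
    hidden = [0.0] * hidden_dim
    for x in vectors:
        projected_input = matvec(w_in, x)
        projected_state = matvec(w_hidden, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(projected_input, projected_state)]
    return hidden  # intermediate vector of the preset dimension


def decode(intermediate, w_hidden, w_out, steps):
    """Second recurrent network: unroll the intermediate vector into target word vectors."""
    hidden = intermediate
    outputs = []
    for _ in range(steps):
        hidden = [math.tanh(v) for v in matvec(w_hidden, hidden)]
        outputs.append(matvec(w_out, hidden))
    return outputs


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_word(vector, table):
    """Map a vector back to the closest word in the word vector table (assumed lookup rule)."""
    return max(table, key=lambda word: cosine(vector, table[word]))


def generate_parallel_text(source_words, table, hidden_dim=8, steps=None, seed=0):
    """Run the full pipeline on a segmented word sequence."""
    rng = random.Random(seed)
    embed_dim = len(next(iter(table.values())))
    enc_in = make_matrix(hidden_dim, embed_dim, rng)
    enc_hidden = make_matrix(hidden_dim, hidden_dim, rng)
    dec_hidden = make_matrix(hidden_dim, hidden_dim, rng)
    dec_out = make_matrix(embed_dim, hidden_dim, rng)

    source_vectors = [table[word] for word in source_words]          # table lookup
    intermediate = encode(source_vectors, enc_in, enc_hidden, hidden_dim)
    target_vectors = decode(intermediate, dec_hidden, dec_out, steps or len(source_words))
    return [nearest_word(vec, table) for vec in target_vectors]      # reverse lookup


# Toy word vector table (made-up values, for illustration only).
WORD_VECTORS = {
    "how": [0.9, 0.1, 0.0],
    "are": [0.1, 0.8, 0.1],
    "you": [0.0, 0.2, 0.9],
    "doing": [0.3, 0.3, 0.4],
}

paraphrase = generate_parallel_text(["how", "are", "you"], WORD_VECTORS)
```

In a real system both networks and the word vector table would be trained beforehand, and the decoder would typically stop at an end-of-sequence signal rather than after a fixed number of steps; the fixed `steps` argument is a simplification made for this sketch.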
Public/Granted literature:
- US20180365231A1 METHOD AND APPARATUS FOR GENERATING PARALLEL TEXT IN SAME LANGUAGE (Public/Granted day: 2018-12-20)