Invention Grant
- Patent Title: Methods, computing devices, and storage media for generating training corpus
- Application No.: US16810070
- Application Date: 2020-03-05
- Publication No.: US11348571B2
- Publication Date: 2022-05-31
- Inventor: Shiqiang Ding, Jizhou Huang, Zhongwei Jiang, Wentao Ma
- Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Applicant Address: CN Beijing
- Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
- Current Assignee Address: CN Beijing
- Agency: Lathrop GPM LLP
- Priority: CN201910179796.4 20190311
- Main IPC: G10L15/22
- IPC: G10L15/22; G10L15/06; G06N20/00; G06N5/04; G10L25/63

Abstract:
The present disclosure provides methods, computing devices, and storage media for generating a training corpus. The method includes: mining pieces of data from user behavior logs associated with a target application, each piece of data including a first behavior log and a second behavior log, where the first behavior log includes a user speech and a corresponding speech recognition result, and the second behavior log belongs to the same user as the first behavior log and is time-dependent on the first behavior log; and determining, based on the first behavior log and the second behavior log, whether the user speech and the corresponding speech recognition result in each piece of data form a positive feedback sample or a negative feedback sample.
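
The two steps in the abstract (mining log pairs, then labeling each pair) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the `BehaviorLog` fields, the action names `speech_query`, `confirm`, and `correct`, the 30-second window used to approximate "time-dependent", and the rule that a follow-up confirmation marks a positive sample while a follow-up correction marks a negative one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class BehaviorLog:
    """One user behavior log entry (all field names are hypothetical)."""
    user_id: str
    timestamp: float                    # seconds since some epoch
    action: str                         # e.g. "speech_query", "confirm", "correct"
    speech: Optional[bytes] = None      # user speech audio, for speech queries
    recognition: Optional[str] = None   # speech recognition result


def mine_pairs(
    logs: Iterable[BehaviorLog], window: float = 30.0
) -> List[Tuple[BehaviorLog, BehaviorLog]]:
    """Pair each speech log with the same user's next log within `window` seconds."""
    by_user: Dict[str, List[BehaviorLog]] = {}
    for log in sorted(logs, key=lambda entry: entry.timestamp):
        by_user.setdefault(log.user_id, []).append(log)

    pairs = []
    for user_logs in by_user.values():
        for first, second in zip(user_logs, user_logs[1:]):
            is_speech = first.action == "speech_query"
            in_window = second.timestamp - first.timestamp <= window
            if is_speech and in_window:
                pairs.append((first, second))
    return pairs


def label(second: BehaviorLog) -> Optional[str]:
    """Assumed rule: the follow-up action decides the feedback type."""
    if second.action == "confirm":
        return "positive"
    if second.action == "correct":
        return "negative"
    return None  # the follow-up log carries no usable feedback signal


def build_corpus(
    logs: Iterable[BehaviorLog], window: float = 30.0
) -> List[Tuple[Optional[bytes], Optional[str], str]]:
    """Return (speech, recognition result, feedback label) training samples."""
    corpus = []
    for first, second in mine_pairs(logs, window):
        tag = label(second)
        if tag is not None:
            corpus.append((first.speech, first.recognition, tag))
    return corpus
```

Keeping the pairing step separate from the labeling step mirrors the structure of the abstract, so a different labeling rule could be swapped in without touching how the logs are mined.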
Public/Granted literature
- US20200294489A1 METHODS, COMPUTING DEVICES, AND STORAGE MEDIA FOR GENERATING TRAINING CORPUS Public/Granted day: 2020-09-17