Invention Grant
US09280969B2 Model training for automatic speech recognition from imperfect transcription data
In force
- Patent Title: Model training for automatic speech recognition from imperfect transcription data
- Application No.: US12482142
- Application Date: 2009-06-10
- Publication No.: US09280969B2
- Publication Date: 2016-03-08
- Inventor: Jinyu Li, Yifan Gong, Chaojun Liu, Kaisheng Yao
- Applicant: Jinyu Li, Yifan Gong, Chaojun Liu, Kaisheng Yao
- Applicant Address: US WA Redmond
- Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee Address: US WA Redmond
- Agent: Steven Spellman; Fehmi Chebil; Micky Minhas
- Main IPC: G10L15/00
- IPC: G10L15/00; G10L15/06; G10L15/065

Abstract:
Techniques and systems for training an acoustic model are described. In an embodiment, a technique for training an acoustic model includes dividing a corpus of training data that includes transcription errors into N parts, and on each part, decoding an utterance with an incremental acoustic model and an incremental language model to produce a decoded transcription. The technique may further include inserting silence between a pair of words into the decoded transcription and aligning an original transcription corresponding to the utterance with the decoded transcription according to time for each part. The technique may further include selecting a segment from the utterance having at least Q contiguous matching aligned words, and training the incremental acoustic model with the selected segment. The trained incremental acoustic model may then be used on a subsequent part of the training data. Other embodiments are described and claimed.
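The selection and incremental-training steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it aligns the two transcriptions by word sequence using the standard library's `difflib.SequenceMatcher` rather than by time, it omits silence insertion and the incremental language model, and the names `select_segments`, `split_into_parts`, `train_incrementally`, `decode`, and `update_model` are hypothetical placeholders introduced only for illustration.

```python
from difflib import SequenceMatcher


def select_segments(original, decoded, q):
    """Return runs of at least q contiguous words on which both transcriptions agree.

    Assumption: alignment is by word sequence, not by time as in the abstract.
    """
    matcher = SequenceMatcher(a=original, b=decoded, autojunk=False)
    segments = []
    for block in matcher.get_matching_blocks():
        # The final block returned is a zero-length sentinel; it is filtered
        # out here because its size is below any positive q.
        if block.size >= q:
            segments.append(original[block.a:block.a + block.size])
    return segments


def split_into_parts(corpus, n):
    """Divide the corpus into n contiguous parts of nearly equal size."""
    size, extra = divmod(len(corpus), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(corpus[start:end])
        start = end
    return parts


def train_incrementally(corpus, n, q, decode, update_model, model):
    """Train part by part, reusing the updated model on each subsequent part.

    corpus: list of (utterance, original_transcription_words) pairs.
    decode and update_model are caller-supplied (hypothetical) callables.
    """
    for part in split_into_parts(corpus, n):
        selected = []
        for utterance, original in part:
            decoded = decode(model, utterance)
            selected.extend(select_segments(original, decoded, q))
        model = update_model(model, selected)
    return model
```

For example, calling `select_segments` with Q = 3 on an original transcription "the cat sat on the mat today" and a decoded transcription "the cat sat in the mat today" keeps the two agreeing three-word runs and discards the disputed word between them.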
Public/Granted literature
- US20100318355A1 MODEL TRAINING FOR AUTOMATIC SPEECH RECOGNITION FROM IMPERFECT TRANSCRIPTION DATA Public/Granted day: 2010-12-16