Invention Grant
US09280969B2 Model training for automatic speech recognition from imperfect transcription data (In force)

Model training for automatic speech recognition from imperfect transcription data
Abstract:
Techniques and systems for training an acoustic model are described. In an embodiment, a technique for training an acoustic model includes dividing a corpus of training data that includes transcription errors into N parts, and on each part, decoding an utterance with an incremental acoustic model and an incremental language model to produce a decoded transcription. The technique may further include inserting silence between a pair of words into the decoded transcription and aligning an original transcription corresponding to the utterance with the decoded transcription according to time for each part. The technique may further include selecting a segment from the utterance having at least Q contiguous matching aligned words, and training the incremental acoustic model with the selected segment. The trained incremental acoustic model may then be used on a subsequent part of the training data. Other embodiments are described and claimed.
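The selection step in the abstract can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the patented method. It substitutes a word-level sequence alignment from Python's difflib for the time-based alignment the abstract describes, and it omits silence insertion. The names split_into_parts, matching_segments, and train_incrementally are hypothetical, as are the decode and train callables, which stand in for a real recognizer and trainer.

```python
from difflib import SequenceMatcher


def split_into_parts(utterances, n):
    """Divide the corpus into n contiguous parts of near-equal size."""
    size, extra = divmod(len(utterances), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(utterances[start:end])
        start = end
    return parts


def matching_segments(original, decoded, q):
    """Return runs of at least q contiguous words on which both transcriptions agree.

    Assumption: word-level alignment via difflib stands in for the
    time-based alignment described in the abstract.
    """
    matcher = SequenceMatcher(a=original, b=decoded, autojunk=False)
    return [
        original[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= q and block.size > 0
    ]


def train_incrementally(parts, model, decode, train, q):
    """Process the parts in order, updating the model after each one.

    decode(model, audio) -> list of words and train(model, segments) -> model
    are hypothetical callables supplied by the caller.
    """
    for part in parts:
        selected = []
        for audio, original in part:
            decoded = decode(model, audio)
            selected.extend(matching_segments(original, decoded, q))
        # The updated model is used to decode the next part.
        model = train(model, selected)
    return model
```

With Q = 3, an original transcription "the cat sat on the mat today" and a decoded transcription "the cat sat in the mat today" yield two selected segments, "the cat sat" and "the mat today". The disputed word is excluded from training.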