Invention Grant
- Patent Title: Written-domain language modeling with decomposition
- Application No.: US13906654
- Application Date: 2013-05-31
- Publication No.: US09460088B1
- Publication Date: 2016-10-04
- Inventor: Hasim Sak, Yun-hsuan Sung, Cyril Georges Luc Allauzen
- Applicant: Google Inc.
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G06F17/28
- IPC: G06F17/28; G06F17/27; G10L15/26; G10L15/28; G10L15/06; G10L15/14; G10L15/04; G10L19/00; G10L21/00; G10L25/00

Abstract:
An automatic speech recognition system and method are provided for written-domain language modeling. According to one implementation, a process includes: accessing decomposed training data that results from applying rewrite grammar rules to original training data, the decomposed training data comprising (i) regular words from the original training data that have not been rewritten using the set of rewrite grammar rules, and (ii) decomposed segments that result from rewriting non-lexical entities from the original training data using the rewrite grammar rules; generating a restriction model that (i) maps language model paths for regular words to themselves, and (ii) restricts language model paths for decomposed segments for non-lexical entities; training an n-gram language model over the training data; composing the restriction model and the language model to obtain a restricted language model; and constructing a decoding network by composing a context dependency model and a pronunciation lexicon with the restricted language model.
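
The first steps of the process in the abstract can be illustrated with a small sketch. It is a toy under stated assumptions, not the patented implementation. The marker tokens `<ent>` and `</ent>`, the regular expression standing in for the rewrite grammar rules, and the function names `decompose`, `bigram_counts`, and `is_allowed` are all hypothetical. The restriction model is approximated by a plain Python acceptor rather than a finite-state transducer, and the composition with a context dependency model and pronunciation lexicon is not shown.

```python
import re
from collections import Counter

# Hypothetical markers delimiting the decomposed segments of one entity.
OPEN, CLOSE = "<ent>", "</ent>"

# Toy stand-in for the rewrite grammar rules (an assumption): a token made of
# letters, digits, and dots that contains a digit or a dot is non-lexical.
NON_LEXICAL = re.compile(r"^(?=.*[\d.])[A-Za-z\d.]+$")


def decompose(tokens):
    """Rewrite non-lexical tokens into marked segments; keep regular words."""
    out = []
    for tok in tokens:
        if NON_LEXICAL.match(tok):
            # Split into letter runs, digit runs, and single dots.
            segments = re.findall(r"[A-Za-z]+|\d+|\.", tok)
            out.extend([OPEN, *segments, CLOSE])
        else:
            out.append(tok)  # regular word, left unchanged
    return out


def bigram_counts(sentences):
    """Collect raw bigram counts (n = 2) over decomposed sentences."""
    counts = Counter()
    for sent in sentences:
        padded = ["<s>", *sent, "</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


def is_allowed(tokens):
    """Toy restriction: markers must be balanced, non-nested, and non-empty.

    Tokens outside markers pass through unchanged, mirroring regular words
    that map to themselves.
    """
    inside, length = False, 0
    for tok in tokens:
        if tok == OPEN:
            if inside:
                return False
            inside, length = True, 0
        elif tok == CLOSE:
            if not inside or length == 0:
                return False
            inside = False
        elif inside:
            length += 1
    return not inside


data = [decompose("go to nytimes.com".split())]
counts = bigram_counts(data)
print(data[0])  # ['go', 'to', '<ent>', 'nytimes', '.', 'com', '</ent>']
```

In this sketch, `is_allowed` rejects a sequence such as `["go", "<ent>", "nytimes"]`, whose entity is never closed. That is the kind of path a restriction model would be expected to exclude from the language model.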