- Title: Language model optimization for in-domain application
- Application number: US14271962
- Filing date: 2014-05-07
- Publication number: US09972311B2
- Publication date: 2018-05-15
- Inventors: Michael Levit, Sarangarajan Parthasarathy, Andreas Stolcke
- Applicant: MICROSOFT CORPORATION
- Applicant address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current assignee: Microsoft Technology Licensing, LLC
- Current assignee address: US WA Redmond
- Agent: Shook, Hardy & Bacon L.L.P.
- Main classification: G10L15/06
- IPC classifications: G10L15/06; G10L15/18; G06F17/27
Abstract:
Systems and methods are provided for optimizing language models for in-domain applications through an iterative, joint-modeling approach that expresses training material as alternative representations of higher-level tokens, such as named entities and carrier phrases. From a first language model, an in-domain training corpus may be represented as a set of alternative parses of tokens. Statistical information determined from these parsed representations may be used to produce a second (or updated) language model, which is further optimized for the domain. The second language model may be used to determine another alternative parsed representation of the corpus for a next iteration, and the statistical information determined from this representation may be used to produce a third (or further updated) language model. Through each iteration, a language model may be determined that is further optimized for the domain.
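The iterative loop described in the abstract can be illustrated with a small sketch. This is not the patented method: it assumes a unigram model with additive smoothing, keeps only the single best parse of each sentence rather than a set of weighted alternative parses, and takes the candidate higher-level tokens from a hand-supplied phrase list. All function names and parameters (`initial_counts`, `make_logprob`, `best_parse`, `optimize`, `alpha`, `max_len`) are hypothetical.

```python
import math
from collections import Counter

def initial_counts(sentences, phrases, max_len=3):
    """Crude first model: count every word and every listed phrase occurrence."""
    counts = Counter()
    for words in sentences:
        for start in range(len(words)):
            for end in range(start + 1, min(len(words), start + max_len) + 1):
                token = " ".join(words[start:end])
                if end - start == 1 or token in phrases:
                    counts[token] += 1
    return counts

def make_logprob(counts, phrases, alpha=0.1):
    """Unigram log-probability with additive smoothing (an assumed model form)."""
    total = sum(counts.values())
    vocab = len(counts) + 1
    def logprob(token):
        if " " in token and token not in phrases:
            return -math.inf  # only listed phrases may act as multi-word tokens
        return math.log((counts.get(token, 0) + alpha) / (total + alpha * vocab))
    return logprob

def best_parse(words, logprob, max_len=3):
    """Dynamic-programming search for the highest-scoring token sequence."""
    n = len(words)
    best = [(-math.inf, 0)] * (n + 1)  # (score, start index of last token)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start][0] + logprob(" ".join(words[start:end]))
            if score > best[end][0]:
                best[end] = (score, start)
    tokens, end = [], n
    while end > 0:  # follow back-pointers from the end of the sentence
        start = best[end][1]
        tokens.append(" ".join(words[start:end]))
        end = start
    return tokens[::-1]

def optimize(corpus, phrases, iterations=3):
    """Alternate between parsing the corpus and re-estimating the model."""
    sentences = [s.split() for s in corpus]
    counts = initial_counts(sentences, phrases)
    parses = []
    for _ in range(iterations):
        logprob = make_logprob(counts, phrases)
        parses = [best_parse(s, logprob) for s in sentences]
        counts = Counter(t for p in parses for t in p)  # updated model statistics
    return counts, parses

corpus = ["fly to new york", "fly to new york today", "drive to new york"]
counts, parses = optimize(corpus, phrases={"new york", "fly to"})
```

Each pass uses the current model to re-parse the corpus and then rebuilds the model from the parsed tokens, which mirrors the first, second, and third language models mentioned above. A closer approximation would retain several weighted parses per sentence and estimate n-gram statistics from them.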
Published/granted documents
- US20150325235A1 Language Model Optimization For In-Domain Application, publication/grant date: 2015-11-12