Granted invention patent
US08175878B1 Representing n-gram language models for compact storage and fast retrieval
In force
- Patent title: Representing n-gram language models for compact storage and fast retrieval
- Patent title (Chinese): 代表用于紧凑存储和快速检索的n-gram语言模型
- Application number: US12968108
- Filing date: 2010-12-14
- Publication number: US08175878B1
- Publication date: 2012-05-08
- Inventors: Ciprian Chelba, Thorsten Brants
- Applicants: Ciprian Chelba, Thorsten Brants
- Applicant address: US CA Mountain View
- Assignee: Google Inc.
- Current assignee: Google Inc.
- Current assignee address: US CA Mountain View
- Agent: Harness, Dickey & Pierce, P.L.C.
- Main classification: G10L15/18
- IPC classification: G10L15/18; G10L15/06; G06F17/27
Abstract:
Systems, methods, and apparatuses, including computer program products, are provided for representing language models. In some implementations, a computer-implemented method is provided. The method includes generating a compact language model, which includes receiving a collection of n-grams from a corpus, each n-gram of the collection having a corresponding first probability of occurring in the corpus, and generating a trie representing the collection of n-grams. The method also includes using the language model to identify a second probability of a particular string of words occurring.
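To illustrate the idea in the abstract, the following is a minimal sketch of a trie that stores n-grams with their probabilities and is then used to score a string of words. It is not the patent's compact encoding, which this record does not describe; it uses plain Python dictionaries. The names `TrieNode`, `insert`, `lookup`, and `string_log_prob` are hypothetical. Two further assumptions: each stored value is treated as the conditional probability of the n-gram's last word given its preceding words, and when a long n-gram is missing the scorer simply falls back to a shorter one without backoff weights.

```python
import math


class TrieNode:
    """One trie node; the path from the root spells out an n-gram."""

    def __init__(self):
        self.children = {}  # word -> TrieNode
        self.prob = None    # probability stored for the n-gram ending here


def insert(root, ngram, prob):
    """Store `prob` for the word sequence `ngram`."""
    node = root
    for word in ngram:
        node = node.children.setdefault(word, TrieNode())
    node.prob = prob


def lookup(root, ngram):
    """Return the stored probability for `ngram`, or None if absent."""
    node = root
    for word in ngram:
        node = node.children.get(word)
        if node is None:
            return None
    return node.prob


def string_log_prob(root, words, order):
    """Sum log-probabilities word by word (chain rule).

    Assumption: stored probabilities are positive. For each word, the
    longest available n-gram of at most `order` words is used; shorter
    ones are tried next. No backoff weights are applied.
    """
    total = 0.0
    for i in range(len(words)):
        for start in range(max(0, i - order + 1), i + 1):
            p = lookup(root, words[start:i + 1])
            if p is not None:
                total += math.log(p)
                break
        else:
            return float("-inf")  # word not covered by any stored n-gram
    return total


# Toy model with made-up probabilities, for illustration only.
root = TrieNode()
insert(root, ("the",), 0.5)
insert(root, ("cat",), 0.1)
insert(root, ("the", "cat"), 0.4)
score = string_log_prob(root, ["the", "cat"], order=2)
```

Here the probabilities stored at insertion play the role of the abstract's "first probability", and the value derived from `score` plays the role of the "second probability" for the whole string. A production model would typically add backoff weights and a far more compact node layout than nested dictionaries.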