Granted invention patent
US07966174B1 Automatic clustering of tokens from a corpus for grammar acquisition
Legal status: in force
- Title: Automatic clustering of tokens from a corpus for grammar acquisition
- Application number: US12030935
- Filing date: 2008-02-14
- Publication number: US07966174B1
- Publication date: 2011-06-21
- Inventors: Srinivas Bangalore, Giuseppe Riccardi
- Applicants: Srinivas Bangalore, Giuseppe Riccardi
- Applicant address: US GA Atlanta
- Assignee: AT&T Intellectual Property II, L.P.
- Current assignee: AT&T Intellectual Property II, L.P.
- Current assignee address: US GA Atlanta
- Main classification: G06F17/27
- IPC classification: G06F17/27
Abstract:
A system for recognizing patterns is disclosed. Grammar learning from a corpus includes, for the other non-context words, generating frequency vectors for each non-context token in a corpus based upon counted occurrences of a predetermined relationship of the non-context tokens to identified context tokens. Clusters are grown from the frequency vectors according to a lexical correlation or a cluster tree among the non-context tokens. The cluster tree is used for pattern recognition.
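The steps named in the abstract (frequency vectors for non-context tokens, then clusters grown into a tree) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not specify the "predetermined relationship" or the correlation measure, so the sketch assumes adjacency (a context token immediately before or after the token) and a Manhattan distance between normalized vectors. The function names, the `offsets` parameter, and the toy corpus are all hypothetical.

```python
from itertools import combinations


def frequency_vectors(sentences, context_tokens, offsets=(-1, 1)):
    """Count, for each non-context token, how often each context token
    appears at each relative offset. Assumption: the "predetermined
    relationship" is a fixed positional offset (here, adjacency)."""
    dims = [(c, o) for c in context_tokens for o in offsets]
    vocab = {tok for sent in sentences for tok in sent}
    vectors = {t: [0] * len(dims) for t in vocab if t not in context_tokens}
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok not in vectors:
                continue  # context tokens get no vector of their own
            for d, (c, o) in enumerate(dims):
                j = i + o
                if 0 <= j < len(sent) and sent[j] == c:
                    vectors[tok][d] += 1
    return vectors


def distance(u, v):
    """Manhattan distance between vectors normalized to relative
    frequencies (an assumed stand-in for the lexical correlation)."""
    su, sv = sum(u) or 1, sum(v) or 1
    return sum(abs(a / su - b / sv) for a, b in zip(u, v))


def grow_cluster_tree(vectors):
    """Repeatedly merge the two closest clusters; the result is a nested
    tuple whose leaves are tokens. A merged cluster's vector is the sum
    of its members' vectors (an assumption)."""
    clusters = [(tok, list(vec)) for tok, vec in sorted(vectors.items())]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: distance(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (ta, va), (tb, vb) = clusters[i], clusters[j]
        merged = ((ta, tb), [a + b for a, b in zip(va, vb)])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Toy corpus (hypothetical); "to" and "from" are taken as context tokens.
sentences = [
    ["fly", "to", "boston"],
    ["fly", "from", "denver"],
    ["drive", "to", "denver"],
    ["drive", "from", "boston"],
]
vectors = frequency_vectors(sentences, ["to", "from"])
tree = grow_cluster_tree(vectors)
```

In this toy corpus the city names share one context profile and the verbs share another, so the tree groups them into two subtrees before joining them at the root.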