Granted invention patent
- Title: Domain dictionary creation by detection of new topic words using divergence value comparison
- Title (Chinese): 通过使用发散值比较检测新主题词来创建域名词典
- Application number: US13158125
- Filing date: 2011-06-10
- Publication number: US08386240B2
- Publication date: 2013-02-26
- Inventors: Jun Wu, Tang Xi Liu, Feng Hong, Yong-Gang Wang, Bo Yang, Lei Zhang
- Applicants: Jun Wu, Tang Xi Liu, Feng Hong, Yong-Gang Wang, Bo Yang, Lei Zhang
- Applicant address: Mountain View, CA, US
- Patent owner: Google Inc.
- Current patent owner: Google Inc.
- Current patent owner address: Mountain View, CA, US
- Agency: Fish & Richardson P.C.
- Main classification: G06F17/21
- IPC classification: G06F17/21; G06F17/20; G06F17/27
Abstract:
Methods, systems, and apparatus, including computer program products, to identify topic words in a collection of documents that includes topic documents related to a topic are disclosed. A reference topic word divergence value based on a document collection and the topic document collection is determined. A candidate topic word divergence value for a candidate topic word is determined based on the document collection and the topic document collection. The candidate topic word is determined to be a topic word if the candidate topic word divergence value is greater than the reference topic word divergence value.
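The comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how a divergence value is computed, so the sketch makes two assumptions: divergence is taken to be the log ratio of a word's relative frequency in the topic documents to its relative frequency in the whole collection, and the reference value is taken to be the mean divergence of a set of already-known topic words. All function and parameter names (`word_probabilities`, `divergence`, `find_topic_words`, `known_topic_words`) are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def word_probabilities(docs):
    """Relative frequency of each word across a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def divergence(word, topic_probs, all_probs):
    """Assumed divergence: log(P(word | topic docs) / P(word | all docs)).

    Returns -inf when the word is missing from either collection, so such
    a word can never exceed a finite reference value.
    """
    p_topic = topic_probs.get(word, 0.0)
    p_all = all_probs.get(word, 0.0)
    if p_topic == 0.0 or p_all == 0.0:
        return float("-inf")
    return math.log(p_topic / p_all)


def find_topic_words(all_docs, topic_docs, known_topic_words, candidates):
    """Return the candidates whose divergence exceeds the reference value.

    Assumption: the reference value is the mean divergence of the known
    topic words that occur in the topic documents.
    """
    all_probs = word_probabilities(all_docs)
    topic_probs = word_probabilities(topic_docs)

    reference_values = [
        value
        for value in (divergence(w, topic_probs, all_probs) for w in known_topic_words)
        if value != float("-inf")
    ]
    if not reference_values:
        raise ValueError("no known topic word occurs in the topic documents")
    reference = sum(reference_values) / len(reference_values)

    return [
        word
        for word in candidates
        if divergence(word, topic_probs, all_probs) > reference
    ]


if __name__ == "__main__":
    # Toy data: the topic documents are a subset of the full collection.
    topic_docs = [["goal", "striker", "match"], ["goal", "offside", "match"]]
    other_docs = [["match", "weather", "news"], ["news", "weather", "market"]]
    print(find_topic_words(
        all_docs=topic_docs + other_docs,
        topic_docs=topic_docs,
        known_topic_words=["match"],
        candidates=["striker", "offside", "weather", "match"],
    ))
```

In this sketch a candidate is accepted only when its divergence is strictly greater than the reference value, matching the "greater than" condition in the abstract. A word that never appears in the topic documents is always rejected.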
Published/granted documents
- US20110238413A1 DOMAIN DICTIONARY CREATION, publication/grant date: 2011-09-29