Granted invention patent
US07945579B1 Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
In force
- Patent title: Locating meaningful stopwords or stop-phrases in keyword-based retrieval systems
- Application number: US12185651
- Filing date: 2008-08-04
- Publication number: US07945579B1
- Publication date: 2011-05-17
- Inventors: Simon Tong, Uri Lerner, Amit Singhal, Paul Haahr, Steven Baker
- Applicants: Simon Tong, Uri Lerner, Amit Singhal, Paul Haahr, Steven Baker
- Applicant address: US CA Mountain View
- Assignee: Google Inc.
- Current assignee: Google Inc.
- Current assignee address: US CA Mountain View
- Agent: Harrity & Harrity, LLP
- Main classification: G06F17/30
- IPC classification: G06F17/30; G06F7/00
Abstract:
A stopword detection component detects stopwords (also stop-phrases) in search queries input to keyword-based information retrieval systems. Potential stopwords are initially identified by comparing the terms in the search query to a list of known stopwords. Context data is then retrieved based on the search query and the identified stopwords. In one implementation, the context data includes documents retrieved from a document index. In another implementation, the context data includes categories relevant to the search query. Sets of retrieved context data are compared to one another to determine if they are substantially similar. If the sets of context data are substantially similar, this fact may be used to infer that the removal of the potential stopword(s) is not material to the search. If the sets of context data are not substantially similar, the potential stopword can be considered material to the search and should not be removed from the query.
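The flow in the abstract (flag candidate stopwords, retrieve context data with and without each candidate, then compare the result sets) can be illustrated with a small Python sketch. This is not the patented implementation: the abstract does not say how "substantially similar" is measured, so the Jaccard overlap, the 0.6 threshold, the stopword list, the toy document index, and every function name below are assumptions made only for illustration.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical list of known stopwords (the abstract does not enumerate one).
KNOWN_STOPWORDS = {"a", "an", "the", "of", "in", "to"}

# Hypothetical toy document index: document ID -> text.
TOY_DOCS = {
    "d1": "the matrix movie review",
    "d2": "a matrix algebra tutorial",
    "d3": "the matrix movie cast",
    "d4": "matrix multiplication tutorial",
}


def toy_retrieve(terms: List[str]) -> Set[str]:
    """Return IDs of documents containing every query term (simple AND match)."""
    return {
        doc_id
        for doc_id, text in TOY_DOCS.items()
        if all(term in text.split() for term in terms)
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap, an assumed stand-in for 'substantially similar'."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def material_stopwords(
    query: str,
    retrieve: Callable[[List[str]], Iterable[str]],
    stopwords: Set[str] = KNOWN_STOPWORDS,
    threshold: float = 0.6,  # assumed cutoff, not taken from the patent
) -> List[str]:
    """Return the candidate stopwords whose removal changes the context data."""
    terms = query.lower().split()
    baseline = set(retrieve(terms))  # context data for the full query
    material = []
    for i, term in enumerate(terms):
        if term not in stopwords:
            continue  # only terms on the known-stopword list are candidates
        reduced = terms[:i] + terms[i + 1:]  # query with this candidate removed
        if jaccard(baseline, set(retrieve(reduced))) < threshold:
            material.append(term)  # result sets differ: keep the word
    return material


print(material_stopwords("the matrix", toy_retrieve))
print(material_stopwords("a matrix algebra tutorial", toy_retrieve))
```

In this toy index, removing "the" from "the matrix" widens the result set from two documents to four, so the word is treated as material. Removing "a" from "a matrix algebra tutorial" leaves the single matching document unchanged, so the word could be dropped. The same comparison could be applied to category sets rather than document sets, which is the abstract's second implementation.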