Granted Invention Patent
- Patent title: System and method for performing electronic information retrieval using keywords
- Patent title (Chinese): 使用关键字执行电子信息检索的系统和方法
- Application number: US10605630
- Filing date: 2003-10-15
- Publication number: US07370034B2
- Publication date: 2008-05-06
- Inventors: Alain Franciosa, Christopher R Dance
- Applicants: Alain Franciosa, Christopher R Dance
- Applicant address: US CT Norwalk
- Assignee: Xerox Corporation
- Current assignee: Xerox Corporation
- Current assignee address: US CT Norwalk
- Agency: Fay Sharpe LLP
- Main classification: G06F7/00
- IPC classification: G06F7/00; G06F17/30
Abstract:
Output documents similar to an input document are identified. A query is formulated using a list of best keywords from the input document to search for a first set of output documents. The list of best keywords is defined with a maximum number of keywords less than the total number of keywords in the list of best keywords that are identified as belonging to a domain specific dictionary of words and as having no measurable linguistic frequency. Lists of keywords are identified for each output document in the first set of documents. A second set of similar documents is determined using a measure of similarity that is computed between keywords identified in the input document and each output document in the first set of documents.
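The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The keyword ranking (in-document count divided by general-language frequency), the Jaccard similarity measure, the substring-free "any keyword matches" search, and every name and parameter below (`best_keywords`, `find_similar`, `list_size`, `max_domain`, `threshold`) are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

def best_keywords(text, linguistic_freq, domain_dict, list_size=5, max_domain=2):
    """Return up to list_size keywords. At most max_domain of them (fewer than
    list_size) may be domain-dictionary words with no measurable linguistic
    frequency, mirroring the cap described in the abstract."""
    assert max_domain < list_size
    # Naive tokenizer (assumption): lowercase, whitespace split, letters only.
    counts = Counter(w for w in text.lower().split() if w.isalpha())

    # Assumed ranking: in-document count over general-language frequency.
    # Words with no known frequency get a tiny value, so they rank high.
    def score(word):
        return counts[word] / linguistic_freq.get(word, 1e-6)

    ranked = sorted(counts, key=lambda w: (-score(w), w))
    chosen, domain_used = [], 0
    for word in ranked:
        capped = word in domain_dict and word not in linguistic_freq
        if capped:
            if domain_used >= max_domain:
                continue  # cap reached: skip further such words
            domain_used += 1
        chosen.append(word)
        if len(chosen) == list_size:
            break
    return chosen

def jaccard(a, b):
    """Jaccard overlap of two keyword lists (a stand-in similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def find_similar(input_text, corpus, linguistic_freq, domain_dict, threshold=0.3):
    """corpus maps a document name to its text; returns name -> similarity."""
    query = best_keywords(input_text, linguistic_freq, domain_dict)
    # First set: documents containing any query keyword (stand-in for a search engine).
    first = {name: text for name, text in corpus.items()
             if set(query) & set(text.lower().split())}
    # Second set: documents whose own keyword list is similar enough to the query.
    second = {}
    for name, text in first.items():
        sim = jaccard(query, best_keywords(text, linguistic_freq, domain_dict))
        if sim >= threshold:
            second[name] = sim
    return second
```

In this sketch, `linguistic_freq` would be a dictionary mapping words to their relative frequency in general language, and `domain_dict` a set of domain-specific terms; both are hypothetical inputs supplied by the caller.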