Granted invention patent
- Patent title: System and method for performing efficient document scoring and clustering
- Patent title (Chinese): 执行有效文件评分和聚类的系统和方法
- Application number: US10626984; Filing date: 2003-07-25
- Publication number: US07610313B2; Publication date: 2009-10-27
- Inventors: Kenji Kawai, Lynne Marie Evans
- Applicants: Kenji Kawai, Lynne Marie Evans
- Applicant address: Seattle, WA, US
- Patent holder: Attenex Corporation
- Current patent holder: Attenex Corporation
- Current patent holder address: Seattle, WA, US
- Agent: Patrick J. S. Inouye; Krista A. Wittman
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A system and method for providing efficient document scoring of concepts within a document set is described. A frequency of occurrence of at least one concept within a document retrieved from the document set is determined. A concept weight is analyzed reflecting a specificity of meaning for the at least one concept within the document. A structural weight is analyzed reflecting a degree of significance based on structural location within the document for the at least one concept. A corpus weight is analyzed inversely weighing a reference count of occurrences for the at least one concept within the document. A score associated with the at least one concept is evaluated as a function of the frequency, concept weight, structural weight, and corpus weight.
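The abstract says only that the score is "a function of" the frequency, concept weight, structural weight, and corpus weight; it does not give the function. The following is a minimal sketch that assumes a simple product of the four factors and a hypothetical inverse corpus weight of 1 / (1 + reference count). The names `ConceptOccurrence`, `corpus_weight`, and `concept_score`, and the numeric ranges in the comments, are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ConceptOccurrence:
    """Inputs named in the abstract for one concept in one document."""
    frequency: int            # times the concept occurs in the document
    concept_weight: float     # specificity of meaning (assumed to lie in 0..1)
    structural_weight: float  # significance of structural location (assumed 0..1)
    reference_count: int      # reference count of occurrences for the concept


def corpus_weight(reference_count: int) -> float:
    """Hypothetical inverse weighting: a higher reference count gives a lower weight."""
    if reference_count < 0:
        raise ValueError("reference_count must be non-negative")
    return 1.0 / (1.0 + reference_count)


def concept_score(occ: ConceptOccurrence) -> float:
    """Assumed combination: the product of the four factors."""
    return (
        occ.frequency
        * occ.concept_weight
        * occ.structural_weight
        * corpus_weight(occ.reference_count)
    )


if __name__ == "__main__":
    rare = ConceptOccurrence(frequency=4, concept_weight=0.5,
                             structural_weight=1.0, reference_count=3)
    common = ConceptOccurrence(frequency=4, concept_weight=0.5,
                               structural_weight=1.0, reference_count=99)
    # With identical in-document statistics, the more widely referenced
    # concept receives the lower score under this assumed weighting.
    print(concept_score(rare), concept_score(common))
```

Under these assumptions, two concepts with the same frequency, concept weight, and structural weight differ in score only through the corpus weight, which illustrates the inverse weighting the abstract describes.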