Patent application
US20050021508A1 Method and apparatus for calculating similarity among documents
Status: no longer in force
- Title: Method and apparatus for calculating similarity among documents
- Title (Chinese): 用于计算文件之间相似度的方法和装置
- Application number: US10838231
- Filing date: 2004-05-05
- Publication number: US20050021508A1
- Publication date: 2005-01-27
- Inventors: Tadataka Matsubayashi, Natsuko Sugaya, Michio Iijima, Yuichi Ogawa, Yuuki Watanabe, Shinya Yamamoto, Tsuyoshi Sudou
- Applicants: Tadataka Matsubayashi, Natsuko Sugaya, Michio Iijima, Yuichi Ogawa, Yuuki Watanabe, Shinya Yamamoto, Tsuyoshi Sudou
- Priority: JP2003-200193 (2003-07-23)
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
Information indicating that individual elements (characteristic character strings), which are indicative of the characteristics of a registered document, appear in that document is stored in advance. To calculate similarity to the registered document, a query designated by a searcher is analyzed. The query is represented by a characteristic vector whose individual elements take the relation between a plurality of words into consideration. Pieces of appearance information for the individual words contained in the query are counted. The counted appearance information is then compared with a searching index to calculate the similarity between documents.
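
The flow described in the abstract can be illustrated with a minimal sketch. It is not the claimed method: the abstract does not say how elements are chosen or which similarity measure is used, so this sketch assumes adjacent word pairs as the elements that capture the relation between words, and cosine similarity as the comparison. The names `features`, `build_index`, `cosine`, and `rank` are hypothetical.

```python
from collections import Counter
from math import sqrt


def features(text):
    """Count single words and adjacent word pairs in a text.

    Assumption: word pairs stand in for elements that take the relation
    between words into consideration; the abstract does not define them.
    """
    words = text.lower().split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return Counter(words + pairs)


def build_index(docs):
    """Store appearance counts for every registered document in advance."""
    return {doc_id: features(text) for doc_id, text in docs.items()}


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, index):
    """Compare the query's counted features with the index, best match first."""
    q = features(query)
    return sorted(((cosine(q, vec), doc_id) for doc_id, vec in index.items()),
                  reverse=True)


if __name__ == "__main__":
    docs = {
        "d1": "document similarity search",
        "d2": "image compression codec",
    }
    index = build_index(docs)
    for score, doc_id in rank("document similarity", index):
        print(doc_id, round(score, 3))
```

Building the index once and reusing it for every query mirrors the abstract's point that appearance information is stored in advance, so only the query has to be analyzed at search time.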