Granted invention patent
- Patent title: Document categorizing method, document categorizing apparatus, and storage medium on which a document categorization program is stored
- Patent title (Chinese): 文档分类方法,文档分类装置和存储有文档分类程序的存储介质
- Application number: US09762126
- Filing date: 2000-06-02
- Publication number: US07213205B1
- Publication date: 2007-05-01
- Inventors: Shinji Miwa, Michihiro Nagaishi
- Applicants: Shinji Miwa, Michihiro Nagaishi
- Applicant address: JP Tokyo
- Assignee: Seiko Epson Corporation
- Current assignee: Seiko Epson Corporation
- Current assignee address: JP Tokyo
- Agent: Rosalio Haro
- Priority: JP11-158498 19990604; JP11-212501 19990727
- International application: PCT/JP00/03625 WO 20000602
- International publication: WO00/75810 WO 20001214
- Main classification: G06F17/00
- IPC classification: G06F17/00
Abstract:
A document categorizing apparatus includes a sentence analyzer 12 for analyzing a plurality of documents to detect titles thereof; a feature element extractor 13 for extracting feature elements from the titles detected by the sentence analyzer 12 from the respective documents; feature table generating means 14 for generating a feature table representing the relationships between the feature elements extracted from the titles and the documents including the feature elements; a document categorizing unit 15 for categorizing the documents into a plurality of clusters according to semantic similarity on the basis of the content of the feature table; a categorization result storage unit 16 for storing the clusters created by the document categorizing unit 15; a cluster merging unit 2 for performing a cluster merging process upon the clusters stored in the categorization result storage unit 16; and an output control unit 31 for outputting the result of the cluster merging process to a display unit 32.
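The flow in the abstract (titles, feature elements, feature table, clusters, merged clusters) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how feature elements are extracted, how semantic similarity is measured, or what criterion drives the merge. The sketch assumes, purely for illustration, that feature elements are title words longer than three characters, that each feature element shared by at least two documents forms a cluster, and that clusters merge when the Jaccard overlap of their document sets reaches a threshold. All function names and parameters are hypothetical.

```python
from collections import defaultdict


def extract_feature_elements(title):
    """Hypothetical extractor: lowercase title words longer than 3 characters."""
    words = (w.strip(".,:;").lower() for w in title.split())
    return {w for w in words if len(w) > 3}


def build_feature_table(titles):
    """Map each feature element to the ids of documents whose title contains it."""
    table = defaultdict(set)
    for doc_id, title in titles.items():
        for element in extract_feature_elements(title):
            table[element].add(doc_id)
    return dict(table)


def categorize(feature_table, min_docs=2):
    """Assumed rule: one cluster per feature element shared by >= min_docs documents."""
    return {el: docs for el, docs in feature_table.items() if len(docs) >= min_docs}


def merge_clusters(clusters, threshold=0.5):
    """Assumed rule: greedily merge clusters whose document sets have
    Jaccard overlap >= threshold. Returns a list of (labels, doc_ids) pairs."""
    merged = []
    for label, docs in sorted(clusters.items()):
        for labels, members in merged:
            jaccard = len(docs & members) / len(docs | members)
            if jaccard >= threshold:
                labels.add(label)
                members |= docs  # in-place update of the merged cluster
                break
        else:
            # No sufficiently similar cluster found: start a new one (copy the set).
            merged.append(({label}, set(docs)))
    return merged


if __name__ == "__main__":
    # Made-up titles, used only to exercise the sketch.
    titles = {
        1: "Printer driver installation guide",
        2: "Printer driver update notes",
        3: "Scanner calibration procedure",
        4: "Scanner calibration checklist",
        5: "Holiday schedule",
    }
    table = build_feature_table(titles)
    for labels, doc_ids in merge_clusters(categorize(table)):
        print(sorted(labels), sorted(doc_ids))
```

With these made-up titles, the "printer" and "driver" clusters cover the same documents and collapse into one, as do "scanner" and "calibration", while the document with no shared feature element stays unclustered. A real system would replace the word-length heuristic and the Jaccard threshold with whatever extraction and similarity measures the full specification defines.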