Granted invention patent
US06611825B1 Method and system for text mining using multidimensional subspaces
In force
- Patent title: Method and system for text mining using multidimensional subspaces
- Application number: US09328888
- Filing date: 1999-06-09
- Publication number: US06611825B1
- Publication date: 2003-08-26
- Inventors: D. Dean Billheimer, Andrew James Booker, Michelle Keim Condliff, Mark Thomas Greaves, Fredrick Baden Holt, Anne Shu-Wan Kao, Daniel John Pierce, Stephen Robert Poteet, Yuan-Jye Wu
- Applicants: D. Dean Billheimer, Andrew James Booker, Michelle Keim Condliff, Mark Thomas Greaves, Fredrick Baden Holt, Anne Shu-Wan Kao, Daniel John Pierce, Stephen Robert Poteet, Yuan-Jye Wu
- Main classification: G06N500
- IPC classification: G06N500
Abstract:
A text mining program is provided that allows a user to perform text mining operations such as information retrieval, term and document visualization, term and document clustering, term and document classification, summarization of individual documents and groups of documents, and document cross-referencing. This is accomplished by representing the text of a document collection using subspace transformations. The representation is built by constructing a term frequency matrix from the term frequencies of each document, transforming the term frequencies for statistical purposes, and projecting the documents or the terms into a lower-dimensional subspace. As the document collection is updated, the subspace is dynamically updated to reflect the new document collection.
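
The three steps named in the abstract (term frequency matrix, statistical transformation, projection into a lower-dimensional subspace) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not say which transformation or decomposition is used, so the log damping in `transform` and the truncated SVD in `project` are assumptions chosen for illustration, and all function names and the sample documents are hypothetical.

```python
import numpy as np

def term_frequency_matrix(docs):
    """Build a terms-by-documents count matrix from whitespace-tokenized text."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1
    return A, vocab

def transform(A):
    """Assumed statistical transformation: log(1 + count) damping of raw counts."""
    return np.log1p(A)

def project(A, k):
    """Assumed projection: keep the k leading left singular vectors as the
    subspace basis and express each document in that basis."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]              # orthonormal basis of the k-dimensional subspace
    return Uk, Uk.T @ A        # k-by-n_docs coordinates of the documents

# Hypothetical four-document collection
docs = [
    "engine oil leak repair",
    "engine repair manual",
    "flight schedule delay",
    "flight delay report",
]

A, vocab = term_frequency_matrix(docs)
Uk, doc_coords = project(transform(A), k=2)
```

Each column of `doc_coords` is one document in the reduced space, so documents can be compared there, for example by cosine similarity. The simplest way to reflect an updated collection in this sketch is to rebuild the matrix and call `project` again; the abstract's dynamic update is not modeled here.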