Granted invention patent
US08713021B2 Unsupervised document clustering using latent semantic density analysis
Legal status: in force
- Patent title: Unsupervised document clustering using latent semantic density analysis
- Patent title (Chinese): 使用潜在语义密度分析的无监督文档聚类
- Application number: US12831909
- Filing date: 2010-07-07
- Publication number: US08713021B2
- Publication date: 2014-04-29
- Inventor: Jerome R. Bellegarda
- Applicant: Jerome R. Bellegarda
- Applicant address: Cupertino, CA, US
- Patentee: Apple Inc.
- Current patentee: Apple Inc.
- Current patentee address: Cupertino, CA, US
- Agent: Morrison & Foerster LLP
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
According to one embodiment, a latent semantic mapping (LSM) space is generated from a collection of a plurality of documents, where the LSM space includes a plurality of document vectors, each representing one of the documents in the collection. For each of the document vectors considered as a centroid document vector, a group of document vectors is identified in the LSM space that are within a predetermined hypersphere diameter from the centroid document vector. As a result, multiple groups of document vectors are formed. The predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space. Thereafter, a group from the plurality of groups is designated as a cluster of document vectors, where the designated group contains a maximum number of document vectors among the plurality of groups.
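The selection step in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes a truncated SVD of a term-by-document count matrix as the LSM space, unit-normalized document vectors, Euclidean distance as the closeness measure, and a plain distance threshold standing in for the hypersphere diameter. The names `lsm_space` and `densest_group` are hypothetical.

```python
import numpy as np


def lsm_space(counts, rank):
    """Map documents into a rank-`rank` latent space via truncated SVD.

    `counts` is a (terms x documents) matrix. Returns one row per document.
    Using plain SVD here is an assumption made for illustration.
    """
    _, s, vt = np.linalg.svd(np.asarray(counts, dtype=float), full_matrices=False)
    return vt[:rank].T * s[:rank]


def densest_group(doc_vectors, diameter):
    """Treat every document vector as a candidate centroid and return the
    (centroid index, member indices) of the group with the most members."""
    # Unit-normalize so that Euclidean distance reflects direction only
    # (an assumed closeness measure, not one stated in the abstract).
    norms = np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    unit = doc_vectors / np.where(norms == 0, 1.0, norms)

    # Pairwise distances between all document vectors.
    dists = np.linalg.norm(unit[:, None, :] - unit[None, :, :], axis=2)

    # Row i marks the documents lying within `diameter` of candidate centroid i.
    within = dists <= diameter
    sizes = within.sum(axis=1)

    # Designate the most populated group as the cluster.
    centroid = int(np.argmax(sizes))
    return centroid, np.flatnonzero(within[centroid])


if __name__ == "__main__":
    # Toy data: documents 0-2 share terms 0-1, documents 3-4 share terms 2-3.
    counts = [
        [2, 1, 2, 0, 0],
        [1, 2, 1, 0, 0],
        [0, 0, 0, 1, 2],
        [0, 0, 0, 2, 1],
    ]
    centroid, members = densest_group(lsm_space(counts, rank=2), diameter=0.5)
    print(centroid, members.tolist())
```

The all-pairs distance matrix keeps the sketch short but costs memory quadratic in the number of documents, so a larger collection would likely need a neighbor index instead.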