- Patent title: RECURSIVE AGGLOMERATIVE CLUSTERING OF TIME-STRUCTURED COMMUNICATIONS
- Application number: US15972952
- Filing date: 2018-05-07
- Publication number: US20180329989A1
- Publication date: 2018-11-15
- Inventors: Viacheslav Seledkin, David Yan, Marina Chilingaryan
- Applicant: Findo, Inc.
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
An example method of document clustering comprises: representing each document of a plurality of documents by a vector comprising a first plurality of real values, wherein each real value of the first plurality of real values reflects a first frequency-based metric of a term comprised by the document; partitioning the plurality of documents into a first set of document clusters based on distances between vectors representing the documents; representing each document cluster of the first set of document clusters by a vector comprising a second plurality of real values, wherein each real value of the second plurality of real values reflects a second frequency-based metric of a term comprised by the document cluster; and partitioning the first set of document clusters into a second set of document clusters based on distances between vectors representing the document clusters of the first set of document clusters.
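
The two-stage procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name the frequency-based metrics, the distance, or the partitioning rule, so the sketch assumes a smoothed TF-IDF weighting for both stages (recomputed over the merged term counts of each cluster in the second stage), cosine distance, and single-linkage grouping under a distance threshold. The function names, the sample documents, and the thresholds `t1` and `t2` are all hypothetical.

```python
import math
from collections import Counter


def tfidf(bags):
    """Turn each bag of terms (a Counter) into a sparse TF-IDF vector (a dict).

    Assumption: smoothed IDF, ln((1 + n) / (1 + df)) + 1. The abstract only
    requires "a frequency-based metric".
    """
    n = len(bags)
    df = Counter(term for bag in bags for term in bag)
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in bag.items()
        })
    return vectors


def cosine_distance(u, v):
    """Cosine distance between two sparse vectors (assumed distance measure)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def partition(vectors, threshold):
    """Single-linkage grouping: indices joined by a chain of pairwise
    distances below `threshold` end up in the same group."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_distance(vectors[i], vectors[j]) < threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def two_level_clusters(docs, t1, t2):
    """Stage 1 clusters documents; stage 2 clusters the stage-1 clusters."""
    bags = [Counter(doc.lower().split()) for doc in docs]
    first = partition(tfidf(bags), t1)
    # Represent each first-level cluster by the merged term counts of its members.
    cluster_bags = [sum((bags[i] for i in cluster), Counter()) for cluster in first]
    second = partition(tfidf(cluster_bags), t2)
    return first, [[first[k] for k in group] for group in second]


# Hypothetical sample data, not taken from the patent.
DOCS = [
    "budget meeting monday",
    "budget meeting tuesday",
    "budget report draft",
    "budget report final",
    "hiking trip photos",
    "hiking trip map",
]
first_level, second_level = two_level_clusters(DOCS, t1=0.6, t2=0.8)
print(first_level)
print(second_level)
```

With these sample documents and thresholds, the first stage pairs each document with its near-duplicate, and the second stage merges the two budget-related clusters while leaving the hiking cluster on its own. The looser second threshold reflects that merged clusters share fewer terms than near-duplicate documents do; in practice both thresholds would need tuning.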