Systems and methods for identifying user types using multi-modal clustering and information scent
    Granted invention patent; legal status: in force

    Publication No.: US07260643B2

    Publication date: 2007-08-21

    Application No.: US09820988

    Application date: 2001-03-30

    IPC classification: G06F15/173

    CPC classification: G06F17/30867

    Abstract: Techniques for determining user types based on multi-modal clustering are provided. The topology, content and usage of a document collection or web site are determined. User paths are identified using longest repeating subsequence techniques, and a multi-modal information need vector is determined for each significant user path. For each document in a significant path, content, uniform resource locator (URL), inlink and outlink multi-modal vectors are determined and combined based on path position and access frequency. Multi-modal clustering is performed based on a multi-modal similarity function and a specified measure of similarity, using a type of multi-modal clustering such as K-means or wavefront clustering. The identified clusters may be further analyzed based on changes to the weighting of the corresponding content, URL, inlink and outlink multi-modal feature vectors.

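
The abstract's combination and similarity steps can be illustrated with a minimal Python sketch. It is not the patented method: the names `MODES`, `cosine`, `path_vector` and `multimodal_similarity`, the sparse-dict vector representation, the caller-supplied per-document weights (standing in for path position and access frequency), and the normalized weighted sum of per-modality cosine similarities are all assumptions made for illustration.

```python
import math

# The four modalities named in the abstract (identifier names are hypothetical).
MODES = ("content", "url", "inlinks", "outlinks")


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def path_vector(docs, doc_weights):
    """Combine per-document multi-modal vectors along one user path.

    doc_weights is an assumed stand-in for the abstract's weighting by
    path position and access frequency; how those weights are derived
    is not specified here.
    """
    combined = {m: {} for m in MODES}
    for doc, w in zip(docs, doc_weights):
        for m in MODES:
            for key, value in doc[m].items():
                combined[m][key] = combined[m].get(key, 0.0) + w * value
    return combined


def multimodal_similarity(p, q, mode_weights):
    """Assumed similarity: weighted mean of per-modality cosine similarities."""
    total = sum(mode_weights[m] for m in MODES)
    return sum(mode_weights[m] * cosine(p[m], q[m]) for m in MODES) / total


# Two toy documents that share only an inlink.
doc_a = {"content": {"printer": 1.0}, "url": {"/products": 1.0},
         "inlinks": {"/home": 1.0}, "outlinks": {"/buy": 1.0}}
doc_b = {"content": {"support": 1.0}, "url": {"/support": 1.0},
         "inlinks": {"/home": 1.0}, "outlinks": {"/faq": 1.0}}
equal_weights = {m: 1.0 for m in MODES}
```

Changing `mode_weights` (for example, raising the `content` weight relative to `inlinks`) changes which paths look similar, which is one way to read the abstract's remark about re-analyzing clusters under different feature weightings. A clustering routine such as K-means would use this similarity in place of a single-vector distance.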