Granted invention patent
US07707229B2 Unsupervised detection of web pages corresponding to a similarity class (status: in force)

  • Patent title: Unsupervised detection of web pages corresponding to a similarity class
  • Application number: US11955129
    Filing date: 2007-12-12
  • Publication number: US07707229B2
    Publication date: 2010-04-27
  • Inventor: Mahesh Tiyyagura
  • Applicant: Mahesh Tiyyagura
  • Applicant address: Sunnyvale, CA, US
  • Assignee: Yahoo! Inc.
  • Current assignee: Yahoo! Inc.
  • Current assignee address: Sunnyvale, CA, US
  • Agent: Weaver Austin Villeneuve Sampson LLP
  • Main classification: G06F17/30
  • IPC classification: G06F17/30
Abstract:
A method of detecting web pages belonging to at least one similarity class from a plurality of web pages includes determining clusters of the plurality of web pages based on characteristics of the content of the web pages. For each of the determined clusters, at least one metric is determined indicative of similarity among resource locators associated with the web pages of that cluster. A determination of web pages belonging to the at least one similarity class is based on the determined clusters and the determined similarity metrics.
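
The three steps in the abstract (cluster by content, score resource-locator similarity per cluster, then decide) can be illustrated with a minimal Python sketch. The abstract does not name a clustering algorithm, a content characteristic, or a specific metric, so everything concrete here is an assumption made for illustration: token-set Jaccard similarity as the content characteristic, greedy single-pass clustering, mean pairwise Jaccard similarity over host and digit-masked path segments as the URL metric, and the threshold values. All function names are hypothetical and are not taken from the patent.

```python
import re
from itertools import combinations
from urllib.parse import urlparse


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def content_features(text):
    """Assumed content characteristic: the set of lowercase tokens."""
    return set(text.lower().split())


def cluster_by_content(pages, threshold=0.5):
    """Greedy clustering: a page joins the first cluster whose seed page is similar enough."""
    clusters = []  # each cluster is a list of (url, features) pairs
    for url, text in pages:
        feats = content_features(text)
        for cluster in clusters:
            if jaccard(feats, cluster[0][1]) >= threshold:
                cluster.append((url, feats))
                break
        else:
            clusters.append([(url, feats)])
    return [[url for url, _ in cluster] for cluster in clusters]


def url_features(url):
    """Assumed URL characteristic: host plus path segments with digit runs masked as '#'."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    return {parsed.netloc, *(re.sub(r"\d+", "#", s) for s in segments)}


def url_similarity(urls):
    """Assumed metric: mean pairwise Jaccard similarity of URL features in one cluster."""
    if len(urls) < 2:
        return 0.0
    pairs = list(combinations(urls, 2))
    return sum(jaccard(url_features(a), url_features(b)) for a, b in pairs) / len(pairs)


def detect_similarity_class(pages, content_threshold=0.5, url_threshold=0.6, min_size=2):
    """Return URLs from clusters that are large enough and have similar resource locators."""
    detected = []
    for cluster in cluster_by_content(pages, content_threshold):
        if len(cluster) >= min_size and url_similarity(cluster) >= url_threshold:
            detected.extend(cluster)
    return detected


# Made-up example data: three templated pages and one unrelated page.
pages = [
    ("http://shop.example.com/item/101", "buy cheap widgets now best price"),
    ("http://shop.example.com/item/102", "buy cheap widgets now great price"),
    ("http://shop.example.com/item/103", "buy cheap widgets now best deal"),
    ("http://news.example.org/2007/story", "local council approves new park budget"),
]
print(detect_similarity_class(pages))
```

In this sketch the decision combines both signals, as the abstract describes: a cluster is reported only when its pages share similar content and their resource locators also look alike. A production system would likely need a more robust clustering method and tuned thresholds.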