Granted invention patent
US08886660B2 Method and apparatus for tracking a change in a collection of web documents
In force
- Patent title: Method and apparatus for tracking a change in a collection of web documents
- Patent title (Chinese): 跟踪Web文档集合中的变化的方法和装置
- Application number: US12027316
- Filing date: 2008-02-07
- Publication number: US08886660B2
- Publication date: 2014-11-11
- Inventors: Bernhard Dombrowski, Karl Klug, Michal Skubacz, Peter Suda, Jürgen Totzke, Cai-Nicolas Ziegler
- Applicants: Bernhard Dombrowski, Karl Klug, Michal Skubacz, Peter Suda, Jürgen Totzke, Cai-Nicolas Ziegler
- Applicant address: DE Munich
- Assignee: Siemens Enterprise Communications GmbH & Co. KG
- Current assignee: Siemens Enterprise Communications GmbH & Co. KG
- Current assignee address: DE Munich
- Agent: Buchanan Ingersoll & Rooney PC
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
A method and an apparatus for tracking changes in a collection of web documents, for example, provided by a web site. The web documents are retrieved at a first assigned point in time and a second assigned point in time. Then a similarity measure for a combination of a retrieved web document at a first assigned point in time and a retrieved web document at a second assigned point in time is calculated for determining pairs of corresponding web documents. By comparing said calculated similarity measure of a pair of corresponding web documents with predetermined thresholds for the similarity measure, a change in the content of the corresponding web document between the first assigned point in time and the second assigned point in time is detected. Instead of referring to identifiers like URLs for web pages, the content similarities of web pages are considered. The proposed strategy facilitates the work of marketing analysts.
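The steps in the abstract (retrieve at two points in time, score combinations of documents, pair them, compare against thresholds) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the word-set Jaccard similarity, the greedy best-match pairing, the two threshold values, and the names `jaccard`, `track_changes`, `unchanged_at`, and `match_at`.

```python
from typing import Dict, List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity in [0, 1].

    Assumption: a simple stand-in for the similarity measure,
    which the abstract does not specify.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def track_changes(
    first: Dict[str, str],
    second: Dict[str, str],
    unchanged_at: float = 0.99,  # hypothetical threshold
    match_at: float = 0.5,       # hypothetical threshold
) -> List[Tuple[str, Optional[str], str]]:
    """Pair each document from the first retrieval with its most similar
    document from the second retrieval, then classify the pair by threshold.

    Document identifiers (for example URLs) are only used as labels in the
    output; the pairing itself relies on content similarity alone.
    """
    results: List[Tuple[str, Optional[str], str]] = []
    for old_id, old_text in first.items():
        best_id, best_sim = None, 0.0
        for new_id, new_text in second.items():
            sim = jaccard(old_text, new_text)
            if sim > best_sim:
                best_id, best_sim = new_id, sim
        if best_id is None or best_sim < match_at:
            results.append((old_id, None, "no counterpart"))
        elif best_sim >= unchanged_at:
            results.append((old_id, best_id, "unchanged"))
        else:
            results.append((old_id, best_id, "changed"))
    return results


# Two retrievals of the same site; the URLs differ between retrievals.
first = {
    "/a": "our new phone ships in march",
    "/b": "contact us by email",
    "/c": "press release about the merger",
}
second = {
    "/x": "our new phone ships in april",
    "/y": "contact us by email",
    "/z": "careers open positions in engineering",
}
report = track_changes(first, second)
```

In this sketch, `/a` is paired with `/x` and reported as changed even though the URL is different, which is the point of pairing by content rather than by identifier. The greedy pairing may assign the same second-retrieval document to more than one first-retrieval document; a one-to-one assignment would need an additional matching step.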