ENRICHED DOCUMENT REPRESENTATIONS USING AGGREGATED ANCHOR TEXT
    Invention application
    Status: Under examination, published

    Publication No.: US20100318533A1

    Publication date: 2010-12-16

    Application No.: US12482377

    Filing date: 2009-06-10

    IPC classification: G06F17/00 G06F17/30

    CPC classification: G06F16/958

    Abstract: A system and method for aggregating anchor text over the web graph and using the aggregated anchor text to enrich document representations. For a target page, its internal inlinks, which point to the target page and lie within the site containing the target page, are identified first. Then external anchors, which point to the internal inlinks from pages outside the site, are identified. The anchor text of the external anchors is collected, weighted, stored, and used to enrich document representations. The method not only reduces the number of pages with no anchor text, but also adds lines of anchor text to URLs.

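The two-step aggregation described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the application: a "site" is approximated by the URL hostname, the link graph is a flat list of hypothetical `(source_url, target_url, anchor_text)` tuples, and the weight of an anchor text is simply the number of times it occurs. The abstract says only that the text is "weighted" and does not specify the scheme.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Assumption: a site is identified by its hostname."""
    return urlsplit(url).hostname


def aggregate_anchor_text(target, links):
    """Collect anchor text from external anchors that point at the
    internal inlink pages of `target`.

    `links` is an iterable of (source_url, target_url, anchor_text)
    tuples (a hypothetical representation of the web graph).
    Returns a Counter mapping normalized anchor text to a weight;
    the weight is a plain occurrence count, which is an assumption.
    """
    links = list(links)
    target_site = site_of(target)

    # Step 1: internal inlinks, i.e. pages on the same site as the
    # target that link to the target.
    internal_sources = {
        src
        for src, dst, _ in links
        if dst == target and src != target and site_of(src) == target_site
    }

    # Step 2: external anchors, i.e. links from pages outside the site
    # that point at one of those internal inlink pages.
    weights = Counter()
    for src, dst, text in links:
        if dst in internal_sources and site_of(src) != target_site:
            weights[text.strip().lower()] += 1
    return weights
```

In this sketch, a page that has no external anchors of its own can still receive text: if `http://a.com/` links to `http://a.com/p`, then anchor text that outside pages use when linking to `http://a.com/` is aggregated for `http://a.com/p`. Links that point directly at the target are deliberately left out, since they are the page's ordinary anchor text rather than the aggregated kind.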