Publication No.: US20100082573A1
Publication date: 2010-04-01
Application No.: US12235798
Filing date: 2008-09-23
IPC classification: G06F17/30
CPC classification: G06F16/58, G06F16/48, G06F16/81, G06F16/958
Abstract: Methods in computer-readable media for searching a large volume of documents are provided. In embodiments, a plurality of related documents is consolidated by a web host into a synthetic search document. The synthetic search document includes a set of descriptive information for each web page consolidated into it. Each set of descriptive information is associated with a subpart identifier containing information that allows a search engine to provide a link for navigating to the individual document. Web pages consolidated into a synthetic search document may be edited to include an indication that the web page is not to be individually searched or indexed by a search engine. Similarly, the synthetic search document may be designated as such by information included in it.
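The consolidation described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `Page` type, the function names, the `synthetic-search-document` meta tag, and the `data-subpart-url` attribute are hypothetical. The standard robots `noindex` meta tag stands in for the "not to be individually indexed" indication, which the abstract does not specify.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Page:
    url: str          # link a search engine could offer for this page
    title: str
    description: str  # descriptive information for this page


def build_synthetic_document(pages: list[Page]) -> str:
    """Consolidate related pages into one HTML document (hypothetical layout)."""
    parts = [
        "<html><head>",
        # Hypothetical marker designating this as a synthetic search document.
        '<meta name="synthetic-search-document" content="true">',
        "</head><body>",
    ]
    for index, page in enumerate(pages):
        # Assumed form of a subpart identifier: it carries the individual page's URL.
        parts.append(
            f'<div id="subpart-{index}" '
            f'data-subpart-url="{escape(page.url, quote=True)}">'
            f"<h2>{escape(page.title)}</h2>"
            f"<p>{escape(page.description)}</p></div>"
        )
    parts.append("</body></html>")
    return "\n".join(parts)


def mark_not_individually_indexed(page_html: str) -> str:
    """Insert a robots noindex meta tag after the first <head> tag."""
    return page_html.replace(
        "<head>", '<head><meta name="robots" content="noindex">', 1
    )
```

In this sketch a web host would call `build_synthetic_document` once per group of related pages and apply `mark_not_individually_indexed` to each consolidated page, so that a crawler indexes only the synthetic document and uses each subpart's URL to link to the individual page.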