Searching documents for ranges of numeric values
    1.
    Granted invention patent
    Searching documents for ranges of numeric values (in force)

    Publication No.: US08655888B2

    Publication date: 2014-02-18

    Application No.: US13335634

    Filing date: 2011-12-22

    IPC classification: G06F7/00

    Abstract: Provided are a method, system, and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values.
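
The indexing step in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `build_range_postings`, the fixed `bucket_width`, and the sample data are assumptions made for illustration. Each posting list covers one range of consecutive integer values and stores every document identifier together with the value that placed it there.

```python
from collections import defaultdict


def build_range_postings(doc_values, bucket_width):
    """Group (doc_id, value) pairs into posting lists keyed by value range.

    doc_values: dict of doc_id -> iterable of integer values in that document.
    bucket_width: assumed fixed width of each range of consecutive values.
    """
    postings = defaultdict(list)
    for doc_id, values in doc_values.items():
        for value in values:
            low = (value // bucket_width) * bucket_width
            # Key is the inclusive range (low, high) that this list covers.
            postings[(low, low + bucket_width - 1)].append((doc_id, value))
    for entries in postings.values():
        entries.sort()  # keep each posting list ordered by document identifier
    return dict(postings)


docs = {1: [3, 12], 2: [7], 3: [15]}
index = build_range_postings(docs, bucket_width=10)
print(index)
```

Keeping the value next to each document identifier lets a query that only partly overlaps a range filter that list's entries instead of returning the whole list.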

    Method, system, and program for handling redirects in a search engine
    2.
    Granted invention patent
    Method, system, and program for handling redirects in a search engine (in force)

    Publication No.: US08296304B2

    Publication date: 2012-10-23

    Application No.: US10764771

    Filing date: 2004-01-26

    IPC classification: G06F17/30

    CPC classification: G06F17/30882 G06F17/30864

    Abstract: Disclosed is a method, system, and program for handling redirects in documents. At least one equivalence class is formed that includes documents that are connected through a redirect. Cycles for each equivalence class are detected, wherein documents in a cycle are marked so that they are not indexed. Incomplete chains for each equivalence class are detected, wherein documents in an incomplete chain are marked so that they are not indexed. A representative for each equivalence class is selected.
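
The redirect handling described in this abstract can be sketched in Python as follows. This is a simplified illustration, not the patent's procedure. The function name `resolve_redirects`, the rule that a chain ending at an unfetched page counts as incomplete, the choice of the final target as the representative, and the choice to skip any document whose chain runs into a cycle are all assumptions.

```python
def resolve_redirects(redirects, fetched):
    """Follow each redirect chain and decide what may be indexed.

    redirects: dict of source URL -> target URL (one redirect per source).
    fetched: set of URLs whose content is actually available.
    Returns (representative_of, skipped), where skipped maps URL -> reason.
    """
    representative_of, skipped = {}, {}
    for start in sorted(set(redirects) | set(fetched)):
        seen, current = [], start
        # Walk the chain until it ends or revisits a document.
        while current in redirects and current not in seen:
            seen.append(current)
            current = redirects[current]
        if current in seen:
            skipped[start] = "cycle"             # chain loops back on itself
        elif current not in fetched:
            skipped[start] = "incomplete chain"  # final target was never fetched
        else:
            representative_of[start] = current   # assumed rule: final target
    return representative_of, skipped


redirects = {"a": "b", "b": "c", "x": "y", "y": "x", "p": "q"}
fetched = {"c", "z"}
representative_of, skipped = resolve_redirects(redirects, fetched)
print(representative_of)
print(skipped)
```

Here `a`, `b`, and `c` form one equivalence class represented by `c`, while `x` and `y` are skipped as a cycle and `p` as an incomplete chain.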

    SYSTEM AND ARTICLE OF MANUFACTURE FOR SEARCHING DOCUMENTS FOR RANGES OF NUMERIC VALUES
    4.
    Invention patent application
    SYSTEM AND ARTICLE OF MANUFACTURE FOR SEARCHING DOCUMENTS FOR RANGES OF NUMERIC VALUES (lapsed)

    Publication No.: US20080294634A1

    Publication date: 2008-11-27

    Application No.: US12187344

    Filing date: 2008-08-06

    IPC classification: G06F7/06 G06F17/30

    Abstract: Provided are a system and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists is generated, wherein each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values. A query on a query range of values within the set of values is received and a determination is made of a minimum number of posting lists associated with consecutive values that together include the query range of values. The determined posting lists are merged to form a merged posting list including document identifiers of documents including values within the query range. The document identifiers in the merged posting list are returned.
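
The query step in this abstract (pick the fewest posting lists that cover the query range, merge them, return the document identifiers) can be sketched in Python under one simplifying assumption: posting lists exist for every aligned range of width 1, 2, 4, ... up to `2**levels`. That layout and the names `build_dyadic_postings`, `cover`, and `range_query` are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def build_dyadic_postings(doc_values, levels):
    """Index each value under one aligned range per level (widths 1, 2, 4, ...)."""
    postings = defaultdict(set)
    for doc_id, values in doc_values.items():
        for value in values:
            for level in range(levels + 1):
                width = 1 << level
                start = (value // width) * width
                postings[(start, width)].add(doc_id)
    return postings


def cover(lo, hi, levels):
    """Split the inclusive range [lo, hi] into aligned ranges, largest first."""
    ranges = []
    while lo <= hi:
        width = 1 << levels
        # Shrink until the block is aligned at lo and fits inside the query.
        while lo % width != 0 or lo + width - 1 > hi:
            width //= 2
        ranges.append((lo, width))
        lo += width
    return ranges


def range_query(postings, lo, hi, levels):
    """Merge the posting lists chosen by cover() into one sorted result."""
    merged = set()
    for key in cover(lo, hi, levels):
        merged |= postings.get(key, set())
    return sorted(merged)


docs = {1: [3, 12], 2: [7], 3: [15], 4: [2]}
postings = build_dyadic_postings(docs, levels=3)
print(cover(3, 12, 3))
print(range_query(postings, 3, 12, 3))
```

Because the chosen ranges tile the query exactly, the merged list needs no per-value filtering in this sketch. The price is that every value is stored once per level.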

    Method for searching documents for ranges of numeric values
    6.
    Granted invention patent
    Method for searching documents for ranges of numeric values (in force)

    Publication No.: US07461064B2

    Publication date: 2008-12-02

    Application No.: US10949473

    Filing date: 2004-09-24

    IPC classification: G06F7/00 G06F17/30

    Abstract: Provided are a method, system, and program for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents having values within the range of consecutive values associated with the posting list. Each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored.

    A Generic Architecture for Indexing Document Groups in an Inverted Text Index
    7.
    Invention patent application
    A Generic Architecture for Indexing Document Groups in an Inverted Text Index (in force)

    Publication No.: US20060155739A1

    Publication date: 2006-07-13

    Application No.: US10905604

    Filing date: 2005-01-12

    IPC classification: G06F17/00

    CPC classification: G06F17/30622

    Abstract: A method for indexing a plurality of documents that include a plurality of duplicate documents first identifies one or more duplicate groups of documents from among the plurality of documents. Then, one index of content for the duplicate group is created instead of indexing the content from every document within the duplicate group. However, in contrast to the content index, an index of metadata for each of the documents in the duplicate group is created. Thus the content of each duplicate group is indexed only once, while a search engine using such indexing techniques retains the capability to answer queries as if the duplicated content were indexed for each document of the group.
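
The split between a shared content index and per-document metadata can be illustrated with a short Python sketch. It assumes that duplicates are detected by an exact SHA-256 match on the content, which the abstract does not specify. The names `build_indexes` and `search` and the `host` metadata field are likewise hypothetical.

```python
import hashlib
from collections import defaultdict


def build_indexes(documents):
    """documents: dict of doc_id -> {"content": str, "meta": dict}."""
    groups = defaultdict(list)  # content hash -> doc ids in the duplicate group
    for doc_id, doc in documents.items():
        digest = hashlib.sha256(doc["content"].encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)

    content_index = defaultdict(set)  # term -> group ids
    for group_id, members in groups.items():
        # Tokenize the shared content once per group, not once per document.
        for term in documents[members[0]]["content"].lower().split():
            content_index[term].add(group_id)

    # Metadata is still kept for every individual document.
    metadata_index = {doc_id: doc["meta"] for doc_id, doc in documents.items()}
    return groups, content_index, metadata_index


def search(term, groups, content_index, metadata_index, **meta_filter):
    """Return doc ids whose group matches term and whose metadata matches."""
    hits = []
    for group_id in content_index.get(term.lower(), ()):
        for doc_id in groups[group_id]:
            meta = metadata_index[doc_id]
            if all(meta.get(k) == v for k, v in meta_filter.items()):
                hits.append(doc_id)
    return sorted(hits)


documents = {
    1: {"content": "range query index", "meta": {"host": "a"}},
    2: {"content": "range query index", "meta": {"host": "b"}},
    3: {"content": "redirect handling", "meta": {"host": "a"}},
}
groups, content_index, metadata_index = build_indexes(documents)
print(search("index", groups, content_index, metadata_index))
print(search("index", groups, content_index, metadata_index, host="b"))
```

Documents 1 and 2 share a single content entry, yet a query can still return either one individually by filtering on its metadata.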

    Searching documents for ranges of numeric values
    9.
    Granted invention patent
    Searching documents for ranges of numeric values (in force)

    Publication No.: US08271498B2

    Publication date: 2012-09-18

    Application No.: US12190495

    Filing date: 2008-08-12

    IPC classification: G06F7/00

    Abstract: Provided are a method, system, and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values.

    METHOD, SYSTEM AND ARTICLE OF MANUFACTURE FOR SEARCHING DOCUMENTS FOR RANGES OF NUMERIC VALUES
    10.
    Invention patent application
    METHOD, SYSTEM AND ARTICLE OF MANUFACTURE FOR SEARCHING DOCUMENTS FOR RANGES OF NUMERIC VALUES (in force)

    Publication No.: US20080301130A1

    Publication date: 2008-12-04

    Application No.: US12190495

    Filing date: 2008-08-12

    IPC classification: G06F17/30

    Abstract: Provided are a method, system, and article of manufacture for searching documents for ranges of numeric values. Document identifiers for documents are accessed, wherein the documents include at least one value that is a member of a set of values. A number of posting lists are generated. Each posting list is associated with a range of consecutive values within the set of values and includes document identifiers for documents including at least one value within the range of consecutive values associated with the posting list, and wherein each document identifier is associated with one value in the set of values included in the document identified by the document identifier. The generated posting lists are stored, wherein the posting lists are used to process a query on a range of values within the set of values.
