Fast search with very large result set
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US07979421B2

    Publication date: 2011-07-12

    Application No.: US11960598

    Filing date: 2007-12-19

    IPC classification: G06F7/00 G06F17/30

    Abstract: Methods and apparatus, including computer systems and program products, for executing a query on a subset of data, for example, to facilitate a fast search with a very large result set. In one general aspect, a method of executing a query includes receiving a query for execution on data in the data repository; generating an estimate of a number of results of the query; defining a subset of data in the data repository; determining whether to execute the query on the subset of the data; executing the query on the subset of the data to generate a partial set of results if the query is to be executed on the subset of the data, otherwise executing the query on the data repository to generate a complete set of results; and providing query results.
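The control flow described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the estimate is produced, how the subset is defined, or how the decision is made, so everything concrete here is an assumption made for illustration: the estimate comes from an evenly spaced sample, the subset is the first `subset_size` rows, and the subset is used whenever the estimate exceeds `threshold`. The names `execute_query`, `threshold`, `subset_size`, and `sample_size` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def execute_query(
    rows: Sequence[dict],
    predicate: Callable[[dict], bool],
    threshold: int = 1000,    # hypothetical cut-off for "very large" result sets
    subset_size: int = 100,   # hypothetical size of the data subset
    sample_size: int = 50,    # hypothetical sample size used for the estimate
) -> Tuple[List[dict], bool]:
    """Return (results, is_partial) for a query given as a row predicate."""
    n = len(rows)
    if n == 0:
        return [], False

    # Estimate the number of results (assumption: scale up the hit count
    # observed on an evenly spaced sample of the repository).
    step = max(1, n // sample_size)
    sample = rows[::step]
    hits = sum(1 for row in sample if predicate(row))
    estimate = hits * n // len(sample)

    # Decide whether to run on the subset or on the whole repository.
    if estimate > threshold:
        subset = rows[:subset_size]  # assumption: subset = first rows
        return [row for row in subset if predicate(row)], True
    return [row for row in rows if predicate(row)], False
```

Returning the `is_partial` flag alongside the rows lets a caller tell the user that only part of a very large result set is being shown.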

    Efficient calculation of sets of distinct results in an information retrieval service
    2.
    Granted invention patent
    Legal status: in force

    Publication No.: US08027969B2

    Publication date: 2011-09-27

    Application No.: US11435149

    Filing date: 2006-05-15

    IPC classification: G06F7/00

    Abstract: Systems and methods are provided for efficient calculation of sets of distinct results in an information retrieval service. A query is received having at least one requested attribute and one or more conditions. For each row identifier in a database table that matches the one or more conditions, a tuple of value identifiers having an entry for each requested attribute is calculated. A unique number is generated and assigned to the tuple for each distinct combination of the value identifiers. Duplicate entries in the tuple listing are identified and removed, so that a result set provides only distinct results.
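The steps in this abstract can be sketched in Python as follows. The sketch assumes dictionary-encoded columns, where each attribute stores one integer value identifier per row and a per-attribute dictionary maps identifiers back to values. The abstract does not state how the unique number is generated; the mixed-radix encoding used here is one possible choice, not the patented method. The function name `distinct_results` and its parameters are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def distinct_results(
    columns: Dict[str, List[int]],        # attribute -> value ID per row
    dictionaries: Dict[str, List[str]],   # attribute -> values indexed by value ID
    requested: Sequence[str],             # requested attributes
    matching_rows: Sequence[int],         # row IDs that satisfy the conditions
) -> List[Tuple[str, ...]]:
    """Return the distinct value combinations of the requested attributes."""
    # Number of distinct values per requested attribute, used as the radix.
    radices = [len(dictionaries[attr]) for attr in requested]
    seen = set()
    result = []
    for row in matching_rows:
        # Tuple of value IDs with one entry per requested attribute.
        value_ids = tuple(columns[attr][row] for attr in requested)
        # Assumption: encode the tuple as a mixed-radix integer, so each
        # distinct combination of value IDs maps to exactly one number.
        number = 0
        for value_id, radix in zip(value_ids, radices):
            number = number * radix + value_id
        # Skip duplicates by comparing numbers instead of whole tuples.
        if number not in seen:
            seen.add(number)
            result.append(
                tuple(dictionaries[attr][vid] for attr, vid in zip(requested, value_ids))
            )
    return result
```

Comparing single integers rather than tuples of values keeps the duplicate check cheap, and values are decoded only for combinations that reach the result set.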

    Method for calculating distributed joins in main memory with minimal communicaton overhead
    3.
    Granted invention patent
    Legal status: in force

    Publication No.: US08046377B2

    Publication date: 2011-10-25

    Application No.: US11018697

    Filing date: 2004-12-20

    IPC classification: G06F7/00

    CPC classification: G06F17/30498 G06F17/30545

    Abstract: A method of executing a distributed join query for a set of documents includes communication between a first server and a second server. In the first server, a first tuple list is generated from a first list of documents matching a precondition part of the query. A first set of value identifiers of attributes associated with the first list of documents is extracted from the first tuple list. A first set of dictionary keys is generated from the set of value identifiers. Then, the first set of dictionary keys is sent with a join condition attribute to a second server. In the second server, the first set of value identifiers is converted to a second set of value identifiers of attributes associated with the second server based on the set of dictionary keys. Then, a lookup of documents is performed based on the second set of value identifiers.
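The exchange between the two servers can be sketched in Python as below. This is an illustration under stated assumptions, not the patented implementation: both servers run in one process, each server keeps its own sorted dictionary per attribute (so the same key may have different value identifiers on each server), and a posting list per value identifier supports the document lookup. The class `Server` and its methods `build_join_request` and `lookup` are hypothetical names.

```python
from typing import Callable, Dict, List, Set, Tuple


class Server:
    """Holds documents with one dictionary and posting lists per attribute."""

    def __init__(self, docs: List[Dict[str, str]]) -> None:
        self.docs = docs
        self.keys: Dict[str, List[str]] = {}                 # attr -> key per value ID
        self.vid_of: Dict[str, Dict[str, int]] = {}          # attr -> key -> value ID
        self.postings: Dict[str, Dict[int, List[int]]] = {}  # attr -> value ID -> doc IDs
        for attr in {a for doc in docs for a in doc}:
            keys = sorted({doc[attr] for doc in docs if attr in doc})
            self.keys[attr] = keys
            self.vid_of[attr] = {key: vid for vid, key in enumerate(keys)}
            postings: Dict[int, List[int]] = {}
            for doc_id, doc in enumerate(docs):
                if attr in doc:
                    postings.setdefault(self.vid_of[attr][doc[attr]], []).append(doc_id)
            self.postings[attr] = postings

    def build_join_request(
        self, precondition: Callable[[Dict[str, str]], bool], join_attr: str
    ) -> Tuple[str, Set[str]]:
        """First server: return the join attribute and the dictionary keys to send."""
        # Tuple list of (document ID, local value ID) for matching documents.
        tuple_list = [
            (doc_id, self.vid_of[join_attr][doc[join_attr]])
            for doc_id, doc in enumerate(self.docs)
            if join_attr in doc and precondition(doc)
        ]
        value_ids = {vid for _, vid in tuple_list}
        # Local value IDs mean nothing to the other server, so send keys.
        dictionary_keys = {self.keys[join_attr][vid] for vid in value_ids}
        return join_attr, dictionary_keys

    def lookup(self, join_attr: str, dictionary_keys: Set[str]) -> List[int]:
        """Second server: map keys to local value IDs, then look up documents."""
        local = self.vid_of.get(join_attr, {})
        local_vids = sorted(local[key] for key in dictionary_keys if key in local)
        return sorted(
            doc_id for vid in local_vids for doc_id in self.postings[join_attr][vid]
        )
```

In this sketch only the distinct dictionary keys of the join attribute cross the server boundary, rather than the matching documents themselves, which is the kind of saving in communication that the title refers to.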

    Fast search with very large result set
    4.
    Granted invention patent
    Legal status: in force

    Publication No.: US07337164B2

    Publication date: 2008-02-26

    Application No.: US10816011

    Filing date: 2004-03-31

    IPC classification: G06F7/00 G06F17/30

    Abstract: Methods and apparatus, including computer systems and program products, for executing a query on a subset of data, for example, to facilitate a fast search with a very large result set. In one general aspect, a method of executing a query includes receiving a query for execution on data in the data repository; generating an estimate of a number of results of the query; defining a subset of data in the data repository; determining whether to execute the query on the subset of the data; executing the query on the subset of the data to generate a partial set of results if the query is to be executed on the subset of the data, otherwise executing the query on the data repository to generate a complete set of results; and providing query results.
