Efficient large-scale filtering and/or sorting for querying of column based data encoded structures
    1.
    Granted invention patent
    Efficient large-scale filtering and/or sorting for querying of column based data encoded structures (status: in force)

    Publication number: US08478775B2

    Publication date: 2013-07-02

    Application number: US12363637

    Filing date: 2009-01-30

    IPC classification: G06F17/00 G06F7/00

    Abstract: The subject disclosure relates to querying of column based data encoded structures enabling efficient query processing over large scale data storage, and more specifically with respect to complex queries implicating filter and/or sort operations for data over a defined window. In this regard, in various embodiments, a method is provided that avoids scenarios involving expensive sorting of a high percentage of, or all, rows, either by not sorting any rows at all, or by sorting only a very small number of rows consistent with or smaller than a number of rows associated with the size of the requested window over the data. In one embodiment, this is achieved by splitting an external query request into two different internal sub-requests, a first one that computes statistics about distribution of rows for any specified WHERE clauses and ORDER BY columns, and a second one that selects only the rows that match the window based on the statistics.
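
    The two-sub-request idea in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes a table held as plain in-memory Python lists (one list per column) and uses per-value row counts as the "statistics". The names `windowed_query`, `predicate`, `skip`, and `top` are hypothetical. Note that the sketch sorts the distinct ORDER BY values and the few candidate rows, never the full filtered row set.

```python
from collections import Counter

def windowed_query(columns, predicate, order_by, skip, top):
    """Return row indices for rows [skip, skip + top) of the filtered, ordered result.

    Hypothetical sketch: `columns` maps column name -> list of values.
    """
    keys = columns[order_by]
    n_rows = len(keys)

    # Sub-request 1: statistics only. Count filtered rows per ORDER BY value.
    counts = Counter(keys[i] for i in range(n_rows) if predicate(columns, i))

    # Walk the distinct values in order and keep those overlapping the window.
    wanted = set()
    rows_before = 0  # filtered rows that sort before the first wanted value
    seen = 0
    for value in sorted(counts):
        start, end = seen, seen + counts[value]
        if end > skip and start < skip + top:
            if not wanted:
                rows_before = start
            wanted.add(value)
        seen = end
        if seen >= skip + top:
            break

    # Sub-request 2: select only rows whose value overlaps the window.
    candidates = [i for i in range(n_rows)
                  if keys[i] in wanted and predicate(columns, i)]
    candidates.sort(key=lambda i: (keys[i], i))  # small sort, ties by row index
    offset = skip - rows_before
    return candidates[offset:offset + top]


# Usage: rows 2-3 of "WHERE region = 'east' ORDER BY sales".
table = {
    "region": ["east", "west", "east", "east", "west", "east", "east"],
    "sales": [30, 10, 20, 30, 50, 10, 40],
}
is_east = lambda cols, i: cols["region"][i] == "east"
print(windowed_query(table, is_east, "sales", skip=1, top=2))  # prints [2, 0]
```

    In this sketch only three candidate rows are sorted, rather than all five rows that pass the filter; with a narrow window over a large table the gap would be far larger.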


    EFFICIENT LARGE-SCALE FILTERING AND/OR SORTING FOR QUERYING OF COLUMN BASED DATA ENCODED STRUCTURES
    2.
    Invention application
    EFFICIENT LARGE-SCALE FILTERING AND/OR SORTING FOR QUERYING OF COLUMN BASED DATA ENCODED STRUCTURES (status: in force)

    Publication number: US20100088315A1

    Publication date: 2010-04-08

    Application number: US12363637

    Filing date: 2009-01-30

    IPC classification: G06F17/30

    Abstract: The subject disclosure relates to querying of column based data encoded structures enabling efficient query processing over large scale data storage, and more specifically with respect to complex queries implicating filter and/or sort operations for data over a defined window. In this regard, in various embodiments, a method is provided that avoids scenarios involving expensive sorting of a high percentage of, or all, rows, either by not sorting any rows at all, or by sorting only a very small number of rows consistent with or smaller than a number of rows associated with the size of the requested window over the data. In one embodiment, this is achieved by splitting an external query request into two different internal sub-requests, a first one that computes statistics about distribution of rows for any specified WHERE clauses and ORDER BY columns, and a second one that selects only the rows that match the window based on the statistics.
