System and method for gradually bringing rolled in data online with incremental deferred integrity processing
    1.
    Granted invention patent
    Status: Expired

    Publication No.: US07302441B2

    Publication date: 2007-11-27

    Application No.: US10896154

    Filing date: 2004-07-20

    IPC classification: G06F17/00

    Abstract: Disclosed is a data processing system, a data processing system-implemented method and an article of manufacture for providing general user availability while integrity processing of rolled-in data is deferred and performed incrementally. The data processing system includes a data warehouse administration module for administering a data warehouse to include a table dividable into portions for containing rows of rolled-in data, a first and a second delimiter delimiting the start and the end respectively of each portion, a metadata element having an entry corresponding to the start and end delimiters delimiting each portion, a third delimiter for delimiting, between the first delimiter and the third delimiter, a sub-portion of the portion, and an operations management module having operation mechanisms for performing operations on the data warehouse responsive to the delimiters.

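The delimiter scheme in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that the sub-portion between the first and third delimiters holds the rows that have already passed integrity checks and are therefore visible to general users; the abstract does not state this explicitly. The names `Portion`, `RolledInTable`, `check_increment`, and `visible_rows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    start: int    # first delimiter: index of the first row in the portion
    end: int      # second delimiter: index one past the last row
    checked: int  # third delimiter (assumed meaning): rows in [start, checked) are verified


class RolledInTable:
    def __init__(self):
        self.rows = []
        self.portions = []  # stands in for the metadata element: one entry per portion

    def roll_in(self, new_rows):
        """Append a batch of rows as a new portion; nothing is verified yet."""
        start = len(self.rows)
        self.rows.extend(new_rows)
        portion = Portion(start=start, end=len(self.rows), checked=start)
        self.portions.append(portion)
        return portion

    def check_increment(self, portion, batch_size, is_valid):
        """Verify up to batch_size more rows, then advance the third delimiter.

        If a row fails, ValueError is raised and the delimiter is left unchanged.
        """
        stop = min(portion.checked + batch_size, portion.end)
        for i in range(portion.checked, stop):
            if not is_valid(self.rows[i]):
                raise ValueError(f"row {i} failed integrity check")
        portion.checked = stop

    def visible_rows(self):
        """Rows a general user may read: only the verified sub-portions."""
        out = []
        for p in self.portions:
            out.extend(self.rows[p.start:p.checked])
        return out
```

Calling `check_increment` repeatedly with a small `batch_size` brings a portion online gradually; readers never see rows beyond the third delimiter.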

    System and method for multiple distinct aggregate queries
    3.
    Granted invention patent
    Status: In force

    Publication No.: US08005868B2

    Publication date: 2011-08-23

    Application No.: US12044348

    Filing date: 2008-03-07

    CPC classification: G06F17/30489

    Abstract: There is disclosed a system and method for executing multiple distinct aggregate queries. In an embodiment, the method comprises: providing at least one Counting Bloom Filter for each distinct column of an input data stream; reviewing count values in the at least one Counting Bloom Filter for the existence of duplicates in each distinct column; and if necessary, using a distinct hash operator to remove duplicates from each distinct column of the input data stream, thereby removing the need for replicating the input data stream and minimizing distinct hash operator processing. Also, the use of Counting Bloom Filters for monitoring data streams allows early duplicate removal from the input stream of data, resulting in savings in computation time and memory resources.

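The idea in the abstract above can be sketched roughly as follows: keep one Counting Bloom Filter per distinct column, and run hash-based duplicate removal only for columns whose counters suggest duplicates may exist. This is a simplified illustration, not the patented method. In particular, it buffers the rows in memory, which a real streaming operator would avoid. The class `CountingBloomFilter`, the function `distinct_counts`, and the parameters `size` and `num_hashes` are hypothetical.

```python
import hashlib


class CountingBloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.counts = [0] * size

    def _slots(self, value):
        # Derive num_hashes slot indexes by salting SHA-256 with a seed byte.
        data = repr(value).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for slot in self._slots(value):
            self.counts[slot] += 1

    def min_count(self, value):
        """Upper bound on how many times value was added."""
        return min(self.counts[slot] for slot in self._slots(value))

    def may_have_duplicates(self):
        # A value added twice pushes all of its counters to at least 2, so
        # False is a guarantee of no duplicates; True may be a false positive.
        return any(c > 1 for c in self.counts)


def distinct_counts(rows, columns):
    """COUNT(DISTINCT col) for several columns over one pass of filter updates."""
    rows = list(rows)  # simplification: buffered here, not truly streamed
    filters = {c: CountingBloomFilter() for c in columns}
    for row in rows:
        for c in columns:
            filters[c].add(row[c])
    result = {}
    for c in columns:
        if filters[c].may_have_duplicates():
            # Duplicates are possible: fall back to hash-based deduplication.
            result[c] = len({row[c] for row in rows})
        else:
            # No counter exceeds 1, so every value in the column is unique.
            result[c] = len(rows)
    return result
```

Because the filter can report false positives but never false negatives, skipping deduplication is safe whenever `may_have_duplicates()` returns `False`.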

    SYSTEM AND METHOD FOR MULTIPLE DISTINCT AGGREGATE QUERIES
    4.
    Invention patent application
    Status: In force

    Publication No.: US20090228433A1

    Publication date: 2009-09-10

    Application No.: US12044348

    Filing date: 2008-03-07

    IPC classification: G06F17/30

    CPC classification: G06F17/30489

    Abstract: There is disclosed a system and method for executing multiple distinct aggregate queries. In an embodiment, the method comprises: providing at least one Counting Bloom Filter for each distinct column of an input data stream; reviewing count values in the at least one Counting Bloom Filter for the existence of duplicates in each distinct column; and if necessary, using a distinct hash operator to remove duplicates from each distinct column of the input data stream, thereby removing the need for replicating the input data stream and minimizing distinct hash operator processing. Also, the use of Counting Bloom Filters for monitoring data streams allows early duplicate removal from the input stream of data, resulting in savings in computation time and memory resources.
