Abstract:
Disclosed is a data processing system, a data processing system-implemented method and an article of manufacture for providing general user availability while integrity processing of rolled-in data is deferred and performed incrementally. The data processing system includes a data warehouse administration module for administering a data warehouse to include a table dividable into portions for containing rows of rolled-in data, a first and a second delimiter delimiting the start and the end respectively of each portion, a metadata element having an entry corresponding to the start and end delimiters delimiting each portion, a third delimiter for delimiting, between the first delimiter and the third delimiter, a sub-portion of the portion, and an operations management module having operation mechanisms for performing operations on the data warehouse responsive to the delimiters.
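The delimiter scheme can be pictured with a small sketch. It is only an illustration under stated assumptions: the abstract does not say what the sub-portion between the first and third delimiters represents, so the sketch assumes it holds the rows that have already passed integrity processing. The names `Portion`, `process_increment`, and `visible_rows` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Portion:
    start: int    # first delimiter: index of the first rolled-in row
    end: int      # second delimiter: one past the last rolled-in row
    checked: int  # third delimiter: rows in [start, checked) are assumed verified


def process_increment(rows: Sequence[int], portion: Portion,
                      is_valid: Callable[[int], bool], batch: int) -> None:
    """Run deferred integrity checks on at most `batch` more rows."""
    stop = min(portion.checked + batch, portion.end)
    for i in range(portion.checked, stop):
        if not is_valid(rows[i]):
            raise ValueError(f"integrity violation at row {i}")
        portion.checked = i + 1  # advance the third delimiter row by row


def visible_rows(rows: Sequence[int], portions: List[Portion]) -> List[int]:
    """Rows a general user may read: only the verified sub-portions."""
    out: List[int] = []
    for p in portions:
        out.extend(rows[p.start:p.checked])
    return out
```

Under these assumptions, each call to `process_increment` widens the readable sub-portion, so queries can run against verified rows while the rest of the portion is still pending.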
Abstract:
The present system improves the performance of a query in a database system when a plan for the query comprises sorting an input that is at least partially sorted such that a slow materialization sort can be applied. The invention applies the slow materialization sort by determining a sequence of subsets in accordance with the partially sorted input. As each of the subsets is determined, the subset is output for further processing. Advantageously, the invention reduces the waiting period for obtaining results from a sorting operation under certain circumstances.
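A minimal sketch of the idea, assuming the input is already ordered on a leading key `a` and the plan needs the order `(a, b)`. Each run of equal `a` values is treated as one subset, which is sorted and emitted as soon as it is complete. The function name `progressive_sort` and the two-column row shape are assumptions made for illustration, not details taken from the abstract.

```python
from itertools import groupby
from operator import itemgetter
from typing import Iterable, Iterator, Tuple

Row = Tuple[int, int]  # hypothetical row shape: (a, b)


def progressive_sort(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield rows ordered by (a, b), given input already ordered by a."""
    for _, subset in groupby(rows, key=itemgetter(0)):
        # A subset ends when the leading key changes; sort it on b and
        # hand it to the consumer before reading the rest of the input.
        yield from sorted(subset, key=itemgetter(1))
```

Because the function is a generator, a consumer can start working on the first subset before the remaining input has been read, which is the reduced waiting period the abstract describes.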
Abstract:
There is disclosed a system and method for executing multiple distinct aggregate queries. In an embodiment, the method comprises: providing at least one Counting Bloom Filter for each distinct column of an input data stream; reviewing count values in the at least one Counting Bloom Filter for the existence of duplicates in each distinct column; and, if necessary, using a distinct hash operator to remove duplicates from each distinct column of the input data stream, thereby removing the need to replicate the input data stream and minimizing distinct hash operator processing. Also, the use of Counting Bloom Filters for monitoring data streams allows early removal of duplicates from the input data stream, resulting in savings in computation time and memory resources.
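One way to picture the approach is the sketch below. It is an illustration under assumptions, not the disclosed implementation: it makes two passes over an in-memory list of rows, and the names `CountingBloomFilter` and `count_distinct_per_column` are hypothetical. It relies on the fact that a Counting Bloom Filter never underestimates a count, so an estimate of 1 proves a value is unique, and only values with a higher estimate need the distinct hash operator (modeled here as a Python `set`).

```python
import hashlib
from typing import Hashable, Iterator, List, Sequence, Tuple


class CountingBloomFilter:
    def __init__(self, size: int = 1024, num_hashes: int = 3) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _slots(self, value: Hashable) -> Iterator[int]:
        # Derive num_hashes slot indexes by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value: Hashable) -> None:
        for slot in self._slots(value):
            self.counters[slot] += 1

    def estimate(self, value: Hashable) -> int:
        # Upper bound on how many times value was added.
        return min(self.counters[slot] for slot in self._slots(value))


def count_distinct_per_column(rows: Sequence[Tuple[Hashable, ...]],
                              num_columns: int) -> List[int]:
    """COUNT(DISTINCT col) for every column, one filter per column."""
    filters = [CountingBloomFilter() for _ in range(num_columns)]
    for row in rows:  # pass 1: populate the per-column filters
        for col in range(num_columns):
            filters[col].add(row[col])

    results = []
    for col in range(num_columns):  # pass 2: route only possible duplicates
        unique = 0
        maybe_duplicated = set()  # stands in for the distinct hash operator
        for row in rows:
            if filters[col].estimate(row[col]) == 1:
                unique += 1  # provably seen once; bypass the hash operator
            else:
                maybe_duplicated.add(row[col])
        results.append(unique + len(maybe_duplicated))
    return results
```

The result stays exact even when hash collisions inflate an estimate, because an inflated estimate only sends a unique value through the set, where it is still counted once.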
Abstract:
The use of a centralized version table allows for efficient object switching. Rather than synchronizing all database agents to recognize a newly created file as containing the most recent version of a given object, database agents requiring access to the given object need only consult the centralized version table to learn file identity information. That is, the database agents consult the centralized version table to determine which of the files associated with a given object contains the most recent version of that object. Mechanisms associated with the use of the centralized version table also provide for efficient recovery from a failure that occurs during an object switching transaction.
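The lookup pattern can be sketched as follows. This is a simplified illustration: the abstract does not describe the table layout or the recovery mechanisms, so the names `VersionTable`, `publish`, `current_file`, and `switch_object` are hypothetical, and recovery is reduced to the assumption that the table entry changes only after the new file has been built successfully.

```python
import threading
from typing import Callable, Dict


class VersionTable:
    """Single shared map from object name to the file holding its latest version."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._current: Dict[str, str] = {}

    def publish(self, object_name: str, file_name: str) -> None:
        with self._lock:
            self._current[object_name] = file_name

    def current_file(self, object_name: str) -> str:
        # Agents call this instead of being told about every new file.
        with self._lock:
            return self._current[object_name]


def switch_object(table: VersionTable, object_name: str, new_file: str,
                  build: Callable[[str], None]) -> None:
    """Build the new file, then make it current with one table update."""
    build(new_file)  # if this raises, the table still names the old file
    table.publish(object_name, new_file)
```

In this sketch a failure during `build` leaves the old entry in place, so agents keep reading the previous version and no per-agent cleanup is needed.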