Abstract:
A method, apparatus, and article of manufacture for a computer-implemented recover/build index system. The recover/build index system builds a database index for a database file by scanning partitions of the database file in parallel to retrieve key values and their associated record identifier (rid) values. The recover/build index system then sorts the scanned key/rid values for each partition in parallel. Next, the recover/build index system performs one or more merges on the sorted key/rid values from all of the partitions to generate a single key/rid value stream. Finally, the recover/build index system builds the index using the single key/rid value stream.
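The scan, sort, merge, and build sequence described above can be sketched roughly as follows. This is an illustration only, not the patented implementation: the in-memory partition layout, the function names `scan_and_sort` and `build_index`, and the use of a thread pool and `heapq.merge` are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def scan_and_sort(partition):
    """Scan one partition for (key, rid) pairs and sort them by key."""
    pairs = [(record["key"], rid) for rid, record in partition]
    pairs.sort()
    return pairs


def build_index(partitions):
    """Scan and sort each partition in parallel, then merge the sorted runs."""
    with ThreadPoolExecutor() as pool:
        sorted_runs = list(pool.map(scan_and_sort, partitions))
    # heapq.merge combines already-sorted runs into one sorted stream.
    merged_stream = heapq.merge(*sorted_runs)
    # The "index" in this sketch is simply the ordered list of (key, rid) entries.
    return list(merged_stream)


# Hypothetical data: each partition is a list of (rid, record) tuples.
partitions = [
    [(1, {"key": "m"}), (2, {"key": "c"})],
    [(3, {"key": "x"}), (4, {"key": "a"})],
]
index = build_index(partitions)
```

A real system would stream the merged pairs into an index structure such as a B-tree instead of materializing a list.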
Abstract:
A multiprocessing system forms a data structure, such as by loading, reorganizing, or recovering, while concurrently collecting various statistics about the data structure. The data structure may comprise tables and/or indices, for example. A first processing unit forms the data structure by assimilating data from one or more data sources into data rows, storing the rows in a buffer, and copying the rows from the buffer to the data structure. Concurrently with the forming step, the same or a second processing unit retrieves the rows from the buffer and applies a predetermined analysis to the rows to formulate statistics regarding the data structure.
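As a minimal sketch of the idea, the following example fills a buffer with rows and lets one thread copy the buffer into a table while another thread computes statistics from the same buffer. The row format, the batch size, and the chosen statistics (row count, minimum, maximum) are hypothetical; the abstract does not specify them.

```python
import threading


def load_with_statistics(source_rows, batch_size=2):
    """Load rows into a table while gathering statistics from the same buffer."""
    table = []
    stats = {"rows": 0, "min": None, "max": None}

    def copy_to_table(buffer):
        # Forming step: copy buffered rows into the data structure.
        table.extend(buffer)

    def analyze(buffer):
        # Analysis step: read the same buffered rows and update statistics.
        for row in buffer:
            value = row["value"]
            stats["rows"] += 1
            stats["min"] = value if stats["min"] is None else min(stats["min"], value)
            stats["max"] = value if stats["max"] is None else max(stats["max"], value)

    for start in range(0, len(source_rows), batch_size):
        buffer = source_rows[start:start + batch_size]
        workers = [
            threading.Thread(target=copy_to_table, args=(buffer,)),
            threading.Thread(target=analyze, args=(buffer,)),
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
    return table, stats


rows = [{"value": v} for v in (7, 3, 9, 1, 5)]
table, stats = load_with_statistics(rows)
```

Because both workers read the buffer that is already in memory, the statistics come without a second pass over the stored data.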
Abstract:
A technique for identifying changes in a data store connected to a computer. Initially, one or more interval changes are measured. Each interval change indicates an amount of change in the data store at an interval. Next, a data store change is estimated that indicates an amount of change in the data store across all of the intervals using each interval change.
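The abstract does not say how the interval changes are combined. One simple possibility, shown here purely as an assumption, is to treat each interval change as a count of changed pages, sum the counts, and express the total as a fraction of the data store size, capped at 1.0. The function name and parameters are hypothetical.

```python
def estimate_total_change(interval_changes, total_pages):
    """Estimate the changed fraction of a data store from per-interval counts.

    Assumption: each interval change is a count of changed pages, and the
    counts are simply summed. The source does not specify the estimator.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    changed = sum(interval_changes)
    # Cap at 1.0 because the same page may be counted in several intervals.
    return min(1.0, changed / total_pages)


estimate = estimate_total_change([10, 25, 5], total_pages=200)
```

An estimate like this could, for example, help decide whether enough of the data store has changed to justify a reorganization or a backup.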
Abstract:
A method, apparatus, and article of manufacture of a computer-implemented parallel database loading system. The optimum number of tasks to be processed by the system is determined by identifying the memory constraints of the system, by identifying available processing capabilities, and by determining a number of load and sort processes to be started in parallel based on the identified memory constraints and processing capabilities. Optimizing the number of load and sort processes increases overall system processing speed.
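A rough sketch of such a calculation takes the smaller of the memory-bound and processor-bound limits. The formula, the function name, and the parameters are assumptions made for illustration; the abstract names only the inputs.

```python
import os


def choose_parallel_tasks(available_memory, memory_per_task, cpu_count=None):
    """Pick how many load/sort tasks to start in parallel.

    Assumption: every task needs the same amount of memory, and one task
    per processor is the upper bound set by processing capability.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    by_memory = available_memory // memory_per_task
    # Never return fewer than one task, even when memory is very tight.
    return max(1, min(by_memory, cpu_count))


tasks = choose_parallel_tasks(available_memory=1000, memory_per_task=300, cpu_count=8)
```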
Abstract:
A technique for loading data into a data store connected to a computer. Under control of a main process, multiple agent load processes are started for loading data in parallel. The main process awaits receipt of a checkpoint signal from each agent load process. Then, upon receiving the checkpoint signal from each load process, the main process performs a checkpoint.
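The coordination pattern can be sketched as follows. Threads stand in for the agent load processes, a queue carries the checkpoint signals, and the checkpoint simply records how many rows have been loaded. All of these choices, along with the names `agent_load` and `main_load`, are assumptions made for the example.

```python
import queue
import threading


def agent_load(agent_id, rows, store, lock, signals):
    """Load this agent's rows, then send a checkpoint signal to the main process."""
    for row in rows:
        with lock:
            store.append(row)
    signals.put(agent_id)


def main_load(partitions):
    """Start one agent per partition and checkpoint once every agent has signaled."""
    store, checkpoints = [], []
    lock = threading.Lock()
    signals = queue.Queue()
    agents = [
        threading.Thread(target=agent_load, args=(i, rows, store, lock, signals))
        for i, rows in enumerate(partitions)
    ]
    for agent in agents:
        agent.start()
    # Block until a checkpoint signal has arrived from each agent.
    signaled = {signals.get() for _ in agents}
    # All agents have signaled, so the checkpoint is taken now.
    checkpoints.append(len(store))
    for agent in agents:
        agent.join()
    return store, checkpoints, signaled


store, checkpoints, signaled = main_load([[1, 2, 3], [4, 5]])
```

Waiting for every agent before checkpointing means the checkpoint reflects a state that all loaders have reached.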
Abstract:
A method, apparatus, and article of manufacture for a computer-implemented index-building system. Indexes are built for a database that is stored in a data storage device coupled to a computer. An amount of available memory is determined. An amount of memory for use in transmitting data between extract, sort, and index build tasks is determined. Then, a number of sort tasks to be used to build indexes is determined based on the determined amount of available memory, the determined amount of memory for use in transmitting data between tasks, and task memory requirements.
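One way to combine the three quantities, offered only as an assumption, is to subtract the inter-task transfer memory from the available memory and divide the remainder by the memory each sort task requires. The abstract does not give the actual formula, and the names below are hypothetical.

```python
def sort_task_count(available_memory, transfer_memory, memory_per_sort_task):
    """Estimate how many sort tasks fit in memory.

    Assumption: transfer_memory is a single fixed reservation for passing
    data between the extract, sort, and index build tasks.
    """
    usable = available_memory - transfer_memory
    if usable < memory_per_sort_task:
        # Fall back to a single sort task when memory is scarce.
        return 1
    return usable // memory_per_sort_task


count = sort_task_count(available_memory=1024, transfer_memory=124, memory_per_sort_task=300)
```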
Abstract:
A method, apparatus, and article of manufacture for a computer-implemented repartitioning system. Data is repartitioned in a database stored on a data storage device connected to a computer. First, it is detected that a partitioning scheme for the data has been altered. Next, partitions that would be affected by the altered partitioning scheme are identified. Then, the identified partitions are reorganized based on the altered partitioning scheme.
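The three steps (detect the altered scheme, identify affected partitions, reorganize only those) can be sketched as below. The sketch assumes range partitioning by inclusive upper limit, an unchanged number of partitions, and an unchanged final limit. None of these details come from the abstract, and the function names are hypothetical.

```python
import bisect


def partition_ranges(limits):
    """Return the (exclusive lower, inclusive upper) bounds of each partition."""
    lowers = [None] + limits[:-1]
    return list(zip(lowers, limits))


def repartition(partitions, old_limits, new_limits):
    """Reorganize only the partitions whose key range changed."""
    if old_limits == new_limits:
        return partitions, []  # Scheme not altered: nothing to do.
    old_ranges = partition_ranges(old_limits)
    new_ranges = partition_ranges(new_limits)
    # A partition is affected when its lower or upper bound differs.
    affected = [i for i, (old, new) in enumerate(zip(old_ranges, new_ranges)) if old != new]
    moved = [key for i in affected for key in partitions[i]]
    result = [list(p) for p in partitions]
    for i in affected:
        result[i] = []
    for key in sorted(moved):
        # bisect_left finds the first partition whose upper limit is >= key.
        result[bisect.bisect_left(new_limits, key)].append(key)
    return result, affected


partitions = [[1, 5], [12, 18], [22, 28], [35]]
result, affected = repartition(partitions, [10, 20, 30, 40], [10, 25, 30, 40])
```

Changing one limit touches only the two partitions on either side of it, so the remaining partitions are left in place.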
Abstract:
A method, apparatus, and article of manufacture for a computer-implemented rebalancing system. Partitioned data is rebalanced in a database stored on a data storage device connected to a computer. Range values are redefined for each partition. Next, the data is reordered into the redefined ranges for the partitions.
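A minimal sketch of rebalancing follows, assuming distinct keys and a goal of roughly equal partition sizes. The abstract does not state how the new range values are chosen, so the equal-count split and the function name `rebalance` are illustrative assumptions.

```python
def rebalance(partitions):
    """Redefine range limits so partitions hold near-equal counts, then reorder."""
    keys = sorted(key for partition in partitions for key in partition)
    count = len(partitions)
    size, extra = divmod(len(keys), count)
    new_partitions, start = [], 0
    for i in range(count):
        # The first `extra` partitions take one additional key each.
        end = start + size + (1 if i < extra else 0)
        new_partitions.append(keys[start:end])
        start = end
    # The new range value of each partition is its largest key.
    new_limits = [partition[-1] for partition in new_partitions if partition]
    return new_limits, new_partitions


limits, balanced = rebalance([[1, 2, 3, 4, 5, 6, 7], [20], [30]])
```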
Abstract:
A method, apparatus, and program storage device readable by a computer, tangibly embodying a program of instructions executable by the computer, are provided for reorganizing database data. The computer database reorganization method reorganizes one set of database data blocks at a time, allowing concurrent data manipulation. The method identifies a set of data blocks for reorganization in a sliding peephole mode, re-orders the set of data blocks, and replaces the original set of data blocks with the re-ordered set. The method includes an overlapping peephole method, which chooses, for each set of data blocks to be reorganized, a next succeeding set of data blocks plus an overlap segment, wherein the overlap segment includes a set of empty pages other than intentionally specified free pages, and the overlap segment is a subset of the preceding set of data blocks.
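The window selection of an overlapping peephole can be illustrated with a small sketch. It computes only the block ranges: each window after the first reaches back into the preceding window by a fixed overlap. The fixed window and overlap sizes are assumptions, and the sketch does not model the condition that the overlap segment consists of empty pages.

```python
def peephole_windows(total_blocks, window, overlap):
    """Return (start, end) block ranges, each overlapping its predecessor.

    Assumption: fixed window and overlap sizes. The source chooses the
    overlap from empty pages in the preceding set, which is not modeled here.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    windows = []
    start = 0
    while start < total_blocks:
        end = min(start + window, total_blocks)
        # Extend the window backward into the tail of the preceding set.
        windows.append((max(0, start - overlap), end))
        start = end
    return windows


windows = peephole_windows(total_blocks=10, window=4, overlap=1)
```

Only the blocks inside the current window are being reorganized at any moment, which is how the remaining data stays available for concurrent manipulation.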
Abstract:
A method is disclosed that places data-intensive subprocesses in close physical and logical proximity to the facility responsible for storing the data, so that high efficiencies at reduced cost are achieved. In one specific example, new computer programs, termed adjuncts, are added and placed in a logical partition on a storage facility so that they can be invoked using appropriate commands issued on the I/O channel. Further, programs or changes are added to existing programs on the host machine, wherein such programs or changes discover the function extensions and invoke them to perform data processing.