Abstract:
A computer system for indexing data applies a heuristic determination function to predict an efficient index updating approach. The system can update an index on a first data set either by incrementally updating the index or by rebuilding the index once a second set of data has been added to the first set. The system applies the heuristic determination function to the characteristics of the first set of data, its index, and the second set of data, to predict whether an incremental update or a rebuild of the index will be more efficient. The system applies this approach to restore and rollforward recovery operations and to data load operations to improve their efficiency.
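As an illustration only, the decision between the two update strategies can be sketched as a cost comparison. The abstract does not state the heuristic itself, so the function name, the two per-row cost parameters, and the linear cost model below are all assumptions.

```python
def choose_index_update(existing_rows: int, new_rows: int,
                        incremental_cost_per_row: float = 4.0,
                        rebuild_cost_per_row: float = 1.0) -> str:
    """Return 'incremental' or 'rebuild' from a toy cost comparison.

    Hypothetical model, not taken from the source: incremental maintenance
    is charged per added row, a rebuild is charged per row in the combined
    data set.
    """
    incremental_cost = incremental_cost_per_row * new_rows
    rebuild_cost = rebuild_cost_per_row * (existing_rows + new_rows)
    return "incremental" if incremental_cost <= rebuild_cost else "rebuild"


if __name__ == "__main__":
    # Small addition to a large table versus a bulk load into a small one.
    print(choose_index_update(existing_rows=1_000_000, new_rows=1_000))
    print(choose_index_update(existing_rows=1_000, new_rows=1_000_000))
```

Under this model a small addition favors incremental maintenance and a bulk load favors a rebuild; a real heuristic would presumably also weigh index characteristics such as key size or tree depth.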
Abstract:
A system and associated method load an input data stream into a multi-dimensional clustering (MDC) table or other structure containing data clustered along one or more dimensions, by assembling blocks of data in a partial block cache in which each partial block is associated with a distinct logical cell. A minimum threshold number of partial blocks may be maintained. Partial blocks may be spilled from the partial block cache to make room for new logical cells. Last partial pages of spilled partial blocks may be stored in a partial page cache to limit I/O if the cell associated with a spilled block is encountered later in the input data stream. Buffers may be reassigned from the partial block cache to the partial page cache if the latter is filled. Parallelism may be employed for efficiency during sorting of input data subsets and during storage of blocks to secondary storage.
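A minimal sketch of the partial block cache idea follows. The class name, the least-recently-used choice of spill victim, and the in-memory `written` list that stands in for secondary storage are assumptions; the partial page cache and the parallel sort and write stages are left out.

```python
from collections import OrderedDict


class PartialBlockCache:
    """Toy cache with one partially filled block per logical cell."""

    def __init__(self, max_blocks: int, block_size: int):
        self.max_blocks = max_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()   # cell key -> rows buffered so far
        self.written = []             # stand-in for writes to secondary storage

    def add(self, cell, row):
        if cell not in self.blocks:
            if len(self.blocks) >= self.max_blocks:
                # Cache is full: spill the least recently used partial block
                # (assumed policy) to make room for the new logical cell.
                victim, rows = self.blocks.popitem(last=False)
                self.written.append(("spilled", victim, rows))
            self.blocks[cell] = []
        self.blocks.move_to_end(cell)
        self.blocks[cell].append(row)
        if len(self.blocks[cell]) == self.block_size:
            # A completed block leaves the cache and is written out.
            self.written.append(("full", cell, self.blocks.pop(cell)))


if __name__ == "__main__":
    cache = PartialBlockCache(max_blocks=2, block_size=2)
    for cell, row in [("a", 1), ("b", 2), ("c", 3), ("b", 4)]:
        cache.add(cell, row)
    print(cache.written)
    print(dict(cache.blocks))
```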
Abstract:
A system joins predicate evaluated column bitmaps having varying lengths. The system includes a column unifier for querying column values with a predicate and generating an indicator bit for each of the column values that is then joined with the respective column value. The system also includes a bitmap generator for creating a column-major linear bitmap from the column values and indicator bits. The column unifier also determines an offset between adjacent indicator bits. The system also includes a converter for multiplying the column-major linear bitmap with a multiplier to shift the indicator bits into consecutive positions in the linear bitmap.
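The multiplication step can be sketched as follows, assuming every column value sits in a fixed-width field whose top bit is the indicator bit, so the offset between adjacent indicator bits equals the field width. The function name and this layout are assumptions, and the sketch only covers the case where the number of fields does not exceed the field width.

```python
def gather_indicator_bits(packed: int, width: int, count: int) -> int:
    """Collect the top bit of each `width`-bit field into `count` adjacent bits."""
    assert 0 < count <= width, "sketch only covers count <= width"
    # Keep only the indicator bit (top bit) of every field.
    indicator_mask = sum(1 << (i * width + width - 1) for i in range(count))
    indicators = packed & indicator_mask
    # Each set bit of the multiplier adds one shifted copy of `indicators`.
    # The shifts are multiples of (width - 1), so the copy that moves
    # indicator i lands it at bit count * (width - 1) + i.
    multiplier = sum(1 << (j * (width - 1)) for j in range(count))
    product = indicators * multiplier
    # The indicators now occupy `count` consecutive bits; extract them.
    return (product >> (count * (width - 1))) & ((1 << count) - 1)


if __name__ == "__main__":
    # Four 8-bit fields; fields 0, 2 and 3 have their indicator bit set.
    packed = 0x85 | (0x12 << 8) | (0xFF << 16) | (0x80 << 24)
    print(bin(gather_indicator_bits(packed, width=8, count=4)))
```

Because Python integers are unbounded, nothing overflows here. A fixed-width implementation would have to choose the multiplier so that the gathered bits fit in the machine word.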
Abstract:
A system and method autonomically reallocate memory among buffer pools to permit quick access to data. A simulated buffer pool extension (SBPX) is created for each buffer pool in a set of buffer pools. Data victimized from a buffer pool is represented in the associated SBPX. Requests for data that is not resident in a buffer pool but is represented in the associated SBPX are tallied. Periodically, an expected efficiency benefit of increasing the capacity of each buffer pool is determined from the tallies. Memory is reallocated from the buffer pool with the lowest expected efficiency benefit having remaining reallocatable memory to the buffer pool with the highest expected efficiency benefit having remaining reallocatable memory, until either one or both of the buffer pools exhausts its reallocatable memory. This process is repeated until all reallocatable memory has been reallocated, until only one buffer pool with reallocatable memory remains, or until all buffer pools with remaining reallocatable memory have substantially the same expected efficiency benefit.
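The reallocation loop described above can be sketched as a greedy transfer between the least and most promising pools. The `Pool` fields, the page-based units, and the `tolerance` used to decide that benefits are "substantially the same" are assumptions. The benefit values are taken as given rather than derived from SBPX tallies.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    size: int        # current capacity in pages
    benefit: float   # expected efficiency benefit of growing this pool
    movable: int     # pages that may still be moved in or out (assumed unit)


def rebalance(pools, tolerance=1e-9):
    """Move memory from the lowest-benefit pool to the highest-benefit pool."""
    while True:
        active = [p for p in pools if p.movable > 0]
        if len(active) < 2:
            break  # all reallocatable memory used, or only one pool left
        donor = min(active, key=lambda p: p.benefit)
        recipient = max(active, key=lambda p: p.benefit)
        if recipient.benefit - donor.benefit <= tolerance:
            break  # remaining pools have substantially the same benefit
        # Transfer until one or both pools exhaust their reallocatable memory.
        pages = min(donor.movable, recipient.movable)
        donor.size -= pages
        recipient.size += pages
        donor.movable -= pages
        recipient.movable -= pages
    return pools


if __name__ == "__main__":
    pools = [Pool("A", 100, 0.1, 20), Pool("B", 100, 0.5, 10), Pool("C", 100, 0.9, 30)]
    for p in rebalance(pools):
        print(p.name, p.size)
```

Each pass drives at least one pool's reallocatable memory to zero, so the loop ends after at most as many passes as there are pools.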
Abstract:
A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.
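A small sketch of the TSN and tuple map idea for a single column is shown below. The two-partition split by "frequent" versus other values is an assumed stand-in for the representation-based partitioning, and the function names are hypothetical.

```python
def store_column(values, is_frequent):
    """Split one column into two partitions and record a tuple map.

    The TSN of a value is its position in the table's sequence order, so the
    same TSN identifies the same row in every column.
    """
    partitions = {0: [], 1: []}
    tuple_map = []  # tuple_map[tsn] is the partition holding that value
    for value in values:
        part = 0 if is_frequent(value) else 1  # assumed partitioning rule
        partitions[part].append(value)
        tuple_map.append(part)
    return partitions, tuple_map


def lookup(partitions, tuple_map, tsn):
    """Fetch the value for a TSN by consulting the tuple map."""
    part = tuple_map[tsn]
    # Offset inside the partition = earlier TSNs that went to the same partition.
    offset = tuple_map[:tsn].count(part)
    return partitions[part][offset]


if __name__ == "__main__":
    country = ["US", "US", "Fiji", "US", "Peru"]
    partitions, tuple_map = store_column(country, lambda v: v == "US")
    print(tuple_map)
    print(lookup(partitions, tuple_map, 4))
```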
Abstract:
A method, system and apparatus for data leak prevention. An information system, such as a database system, which has been configured for data leak protection in accordance with the present invention can include an IDS coupled to the information system and a data leak protection system configured to apply a data leak protection policy for result sets produced by the information system in response to a database query. The data leak protection policy can include a listing of data shapes and corresponding remedial measures. The data leak protection policy further can include consideration for metrics produced by the IDS.
Abstract:
A workload specification, detailing specific queries and the frequency of execution of each query, and a set of partitions are obtained as inputs for the database. A number of candidate tables, each having a plurality of attributes, are identified for the database. A chosen attribute is allocated for each of the tables, to obtain a set of tables and a set of appropriate partitions for each of the tables.
Abstract:
In general, the disclosure is directed to techniques for choosing which pages to evict from the buffer pool to make room for caching additional pages in the context of a database table scan. A buffer pool is maintained in memory. A fraction of the pages of a table to persist in the buffer pool is determined. A random number is generated as a decimal value from 0 to 1 for each page of the table cached in the buffer pool. If the random number generated for a page is less than the fraction, the page is persisted in the buffer pool. If the random number generated for a page is greater than the fraction, the page is included as a candidate for eviction from the buffer pool.
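The per-page random test can be sketched in a few lines. The function name and the injectable `rng` argument are assumptions, and a draw exactly equal to the fraction, a case the abstract leaves open, is treated here as an eviction candidate.

```python
import random


def split_pages(page_ids, keep_fraction, rng=None):
    """Split cached pages into those to persist and eviction candidates."""
    if rng is None:
        rng = random.Random()
    kept, candidates = [], []
    for page in page_ids:
        draw = rng.random()  # float in [0.0, 1.0)
        if draw < keep_fraction:
            kept.append(page)        # page persists in the buffer pool
        else:
            candidates.append(page)  # page becomes an eviction candidate
    return kept, candidates


if __name__ == "__main__":
    kept, candidates = split_pages(range(10), keep_fraction=0.3, rng=random.Random(0))
    print(len(kept), len(candidates))
```

Over many pages, about `keep_fraction` of them stay cached, and the buffer pool needs no per-page access history to make the choice.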