摘要:
A method, apparatus, and system for policy driven data placement and information lifecycle management in a database management system are provided. A user or database application can specify declarative policies that define the movement and transformation of stored database objects. The policies are associated with a database object and may also be inherited. A policy defines, for a database object, an archiving action to be taken, a scope, and a condition before the archiving action is triggered. Archiving actions may include compression, data movement, table clustering, and other actions to place the database object into an appropriate storage tier for a lifecycle phase of the database object. Conditions based on access statistics can be specified at the row level and may use segment or block level heatmaps. Policy evaluation occurs periodically in the background, with actions queued as tasks for a task scheduler.
摘要:
A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.
摘要:
A method, apparatus, and system for tracking row and object database activity into block level heatmaps is provided. Database activity including reads, writes, and creates can be tracked by a database management system at the finest possible level of granularity, or the row and object level. To efficiently record the tracked database activity, a two-part structure is described for writing the activity into heatmaps. A hierarchical in-memory component may use a dynamically allocated sparse pool of bitmap blocks. Periodically, the in-memory component is persisted to a stored representation component, sharable with multiple database instances, which may include consolidated last access times and/or a history of heatmap snapshots to reflect access over time. The heatmaps may then be externalized to database users and applications to provide and support a variety of features.
摘要:
Techniques are described herein for executing queries on distinct portions of a database object that has been separate into chunks and distributed across the volatile memories of a plurality of nodes in a clustered database system. The techniques involve redistributing the in-memory database object portions on changes to the clustered database system. Each node may maintain a mapping indicating which nodes in the clustered database system store which chunks, and timestamps indicating when each mapping entry was created or updated. A query coordinator may use the timestamps to select a database server instance with local in memory access to data required by a portion of a query to process that portion of the query.
摘要:
A method for accelerating queries using dynamically generated columnar data in a flash cache is provided. In an embodiment, a method comprises a storage device receiving a first request for data that is stored in the storage device in a base major format in one or more primary storage devices. The storage device comprises a cache. The base major format is any one of: a row-major format, a column-major format and a hybrid-columnar format. Based on first one or more criteria, it is determined whether to rewrite the data into rewritten data in a rewritten major format. In response to determining to rewrite the data into rewritten data in a rewritten major format, the storage device rewrites at least a portion of the data into particular rewritten data in the rewritten major format. The rewritten data is stored in the cache.
摘要:
Techniques are described herein for executing queries on distinct portions of a database object that has been separate into chunks and distributed across the volatile memories of a plurality of nodes in a clustered database system. The techniques involve receiving a query that requires work to be performed on data that resides in a plurality of on disk extents. A parallel query coordinator that is aware of the in-memory distribution divides the work into granules that align with the in-memory separation. The parallel query coordinator then sends each granule to the database server instance with local in memory access to the data required by the granule and aggregates the results to respond to the query.
摘要:
Techniques are described for materializing pre-computed results of expressions. In an embodiment, a set of one or more column units are stored in volatile or non-volatile memory. Each column unit corresponds to a column that belongs to an on-disk table within a database managed by a database server instance and includes data items from the corresponding column. A set of one or more virtual column units, and data that associates the set of one or more column units with the set of one or more virtual column units, are also stored in memory. The set of one or more virtual column units includes a particular virtual column unit storing results that are derived by evaluating an expression on at least one column of the on-disk table.
摘要:
A method, apparatus, and system for automated information lifecycle management using low access patterns in a database management system are provided. A user or the database can store policy data that defines an archiving action when meeting an activity-level condition on one or more database objects. The archiving actions may include compression, data movement, and other actions to place the database object in an appropriate storage tier for a lifecycle phase of the database object. The activity-level condition may specify the database object meeting a low access pattern, optionally for a minimum time period. Various criteria including access statistics for the database object and cost characteristics of current and target compression levels or storage tiers may be considered to determine the meeting of the activity-level condition. The policies may be evaluated on an adjustable periodic basis and may utilize a task scheduler for minimal performance impact.
摘要:
Techniques are provided for sharding objects across different compute nodes. In one embodiment, a database server instance generates, for an object, a plurality of in-memory chunks including a first in-memory chunk and a second in-memory chunk, where each in-memory chunk includes a different portion of the object. The database server instance assigns each in-memory chunk to one of a plurality of computer nodes including the first in-memory chunk to a first compute node and a second in-memory chunk to a second local memory of a second compute node. The database server instance stores an in-memory map that indicates a memory location for each in-memory chunk. The in-memory map indicates that the first in-memory chunk is located in the first local memory of the first compute node and that the second in-memory chunk is located in the second local memory of the second compute node.
摘要:
A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.