Abstract:
Disclosed herein are system and method embodiments for generating a paged inverted index. An embodiment operates by storing a first data structure and a second data structure in a plurality of pages, where the plurality of pages are stored in one or more memories. The first data structure includes a plurality of value identifiers, where each value identifier corresponds to an offset. The second data structure includes a plurality of row positions, where each row position is at a location that corresponds to the offset in the first data structure and identifies a position of a row in a table that stores data associated with the value identifier.
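The two paged structures described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `PagedInvertedIndex`, the helper `paginate`, the tiny `PAGE_SIZE`, the dense value identifiers starting at 0, and the trailing sentinel offset are all assumptions made for illustration.

```python
PAGE_SIZE = 4  # hypothetical number of entries per page, kept tiny for illustration


def paginate(items, page_size):
    """Split a flat list into fixed-size pages (the last page may be shorter)."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]


class PagedInvertedIndex:
    """Sketch: value ID -> offset (first structure), offset -> row positions (second)."""

    def __init__(self, column, page_size=PAGE_SIZE):
        # column[row] holds the value ID stored at that row.
        # Assumption: value IDs are dense integers 0..n-1.
        self.page_size = page_size
        num_ids = max(column) + 1 if column else 0
        postings = [[] for _ in range(num_ids)]
        for row, value_id in enumerate(column):
            postings[value_id].append(row)

        offsets, rows = [], []
        for posting in postings:
            offsets.append(len(rows))  # where this value ID's rows begin
            rows.extend(posting)
        offsets.append(len(rows))      # sentinel marking the end of the last posting

        self.offset_pages = paginate(offsets, page_size)  # first data structure
        self.row_pages = paginate(rows, page_size)        # second data structure

    def _read(self, pages, i):
        # Translate a logical index into (page number, slot within page).
        return pages[i // self.page_size][i % self.page_size]

    def rows_for(self, value_id):
        start = self._read(self.offset_pages, value_id)
        end = self._read(self.offset_pages, value_id + 1)
        return [self._read(self.row_pages, i) for i in range(start, end)]
```

For example, `PagedInvertedIndex([2, 0, 1, 0, 2, 2]).rows_for(2)` reads one offset page and both row pages to collect the rows that hold value ID 2.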
Abstract:
In an executing database instance including a plurality of database nodes, creation of a backup of the executing database instance includes creation of a current savepoint in one of the plurality of database nodes by: storing first modified pages of a cache of the database node in a datastore of the database node; transmitting a confirmation after storing the first modified pages; repeatedly identifying second modified pages of the cache and storing the identified second modified pages in the datastore; receiving an instruction to enter a critical phase and stopping the repeated identifying and storing in response to the instruction; blocking updates to the database node and transmitting a second confirmation; and receiving a second instruction and, in response to receiving the second instruction, identifying third modified pages of the cache and storing the third modified pages of the cache in the datastore. Pages associated with the current savepoint are identified and stored in the datastore, and the pages associated with the current savepoint are stored in persistent media.
Abstract:
Disclosed herein are innovations in memory management and data recovery for systems that operate using storage class memory (SCM), such as non-volatile RAM (NVRAM). The disclosed innovations have particular application to production database systems, where reducing database downtime in the event of a system crash is highly desirable. Embodiments of the disclosed technology can address a variety of problems that exist during a system crash. For example, embodiments of the disclosed technology can be used to address the loss of the physical memory mapping and/or the loss of the CPU cache that typically occurs in the event of a system crash. Furthermore, embodiments of the disclosed technology can be used to prevent data inconsistency and/or memory leak problems that may arise in the event of a system crash.
Abstract:
Deleting a data record from the second level storage or main store is disclosed. A look-up is performed for the data record in the first level storage, where the data record is defined by a row identifier. If the row identifier is found in the first level storage, a look-up is performed in the second level storage and the main store for an update of the data record, the update being defined by an updated row identifier. If the updated row identifier is found in the second level storage, an undo log is generated from the first level storage to invalidate the row identifier. A flag is generated representing an invalid updated row identifier, and a redo log is generated to restore the data record in the first level storage.
Abstract:
A method for processing a query may include receiving a query associated with one or more predicate columns and one or more aggregate columns. To respond to the query, one or more partial data pages including the one or more predicate columns but not the one or more aggregate columns may be loaded from disk to memory. For each partial data page, a first value occupying the one or more predicate columns may be evaluated to identify one or more rows satisfying a predicate associated with the query. A portion of a data page containing the aggregate columns may be loaded from disk into memory. A result of the query corresponding to a second value occupying the aggregate columns may be generated based on the portion of the data page loaded in the memory. Related systems and articles of manufacture are also provided.
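The load order described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed method: the `Disk` class, its `reads` log, the function `sum_where`, and the choice of a SUM aggregate are all hypothetical, and each "partial page" is modelled as a single column of a page.

```python
class Disk:
    """Hypothetical stand-in for disk storage: page_id -> {column_name: values}."""

    def __init__(self, pages):
        self.pages = pages
        self.reads = []  # log of (page_id, column) loads, to show what was fetched

    def load(self, page_id, column):
        self.reads.append((page_id, column))
        return self.pages[page_id][column]


def sum_where(disk, predicate_col, predicate, aggregate_col):
    """Sum aggregate_col over rows whose predicate_col value satisfies predicate."""
    total = 0
    for page_id in sorted(disk.pages):
        # Step 1: load only the predicate column of the page.
        pred_values = disk.load(page_id, predicate_col)
        matches = [i for i, v in enumerate(pred_values) if predicate(v)]
        if not matches:
            continue  # the aggregate column of this page is never read
        # Step 2: load the aggregate column only for pages with matching rows.
        agg_values = disk.load(page_id, aggregate_col)
        total += sum(agg_values[i] for i in matches)
    return total
```

In this sketch a page with no qualifying rows costs one partial read instead of a full page load, which is the saving the abstract points at.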
Abstract:
A first thread executing a task at a first node in a multi-socket computing system may access a first data structure to obtain a first calibration dataset for the first node. The first thread may generate a timestamp based on the first calibration dataset and a first quantity of time measured by a first clock at the first node. A real-time duration of the task may be determined based on the timestamp. The first thread may recalibrate the first clock by at least generating, based on the first quantity of time measured by the first clock and a second quantity of time measured by a wall clock of an operating system of the multi-socket computing system, a second calibration dataset. The first thread may update the first data structure to include the second calibration dataset while a second thread accesses a second data structure to obtain calibration data.
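One way to picture the calibration arithmetic is the Python sketch below. The abstract does not specify what a calibration dataset contains, so the `Calibration` fields (a base tick count, a base wall-clock time, and an estimated tick rate) and the linear conversion are assumptions, as are the function names. Clock readings are passed in as plain numbers so the example stays deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    base_ticks: int       # node clock reading when the calibration was taken
    base_wall: float      # OS wall-clock time (seconds) at the same moment
    ticks_per_sec: float  # estimated rate of the node clock


def timestamp(cal, ticks):
    """Convert a node clock reading into wall-clock seconds using a calibration."""
    return cal.base_wall + (ticks - cal.base_ticks) / cal.ticks_per_sec


def recalibrate(old, ticks, wall):
    """Build a new calibration from a node clock reading and a wall-clock reading."""
    elapsed_wall = wall - old.base_wall
    if elapsed_wall > 0:
        rate = (ticks - old.base_ticks) / elapsed_wall
    else:
        rate = old.ticks_per_sec  # not enough elapsed time to re-estimate the rate
    return Calibration(base_ticks=ticks, base_wall=wall, ticks_per_sec=rate)


# Assumption: one calibration slot per node, so a thread updating node 0's
# slot does not touch the slot that a thread on node 1 reads.
calibrations = {
    0: Calibration(base_ticks=0, base_wall=100.0, ticks_per_sec=1000.0),
    1: Calibration(base_ticks=0, base_wall=100.0, ticks_per_sec=1000.0),
}
```

A task duration is then the difference of two timestamps taken with the same node's calibration, for instance `timestamp(cal, end_ticks) - timestamp(cal, start_ticks)`.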
Abstract:
A method may include responding to a transaction by sending, to a first data partition participating in the transaction, a first request to set a first transaction control block at the first data partition to a preparing state. In response to the transaction affecting multiple data partitions, a second request to set a second transaction control block at a second data partition to the preparing state may be sent to the second data partition. A third request to add the first data partition and the second data partition as participants of the transaction may be sent to a transaction coordinator. The transaction coordinator may determine, based on a first response of the first data partition and a second response of the second data partition, an outcome of the transaction. The transaction may be rolled back if the first response and/or the second response indicate an inability to commit the transaction.
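The message flow above resembles a prepare-then-decide commit protocol, and a minimal Python sketch of that flow follows. Everything named here is hypothetical: the `Partition` and `Coordinator` classes, the state strings, and the `can_commit` flag that simulates a partition's response. The single-partition branch is an assumption, since the abstract only describes the multi-partition case.

```python
class Partition:
    """Hypothetical data partition holding transaction control blocks (TCBs)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulated ability to commit
        self.tcb = {}                 # txn_id -> state

    def prepare(self, txn_id):
        self.tcb[txn_id] = "PREPARING"
        return self.can_commit        # the partition's response

    def finish(self, txn_id, outcome):
        self.tcb[txn_id] = outcome


class Coordinator:
    """Hypothetical transaction coordinator tracking participants per transaction."""

    def __init__(self):
        self.participants = {}

    def add_participants(self, txn_id, partitions):
        self.participants.setdefault(txn_id, []).extend(partitions)

    def decide(self, txn_id, responses):
        # Roll back if any participant reported that it cannot commit.
        outcome = "COMMITTED" if all(responses) else "ROLLED_BACK"
        for partition in self.participants[txn_id]:
            partition.finish(txn_id, outcome)
        return outcome


def run_transaction(txn_id, partitions, coordinator):
    responses = [p.prepare(txn_id) for p in partitions]
    if len(partitions) > 1:
        coordinator.add_participants(txn_id, partitions)
        return coordinator.decide(txn_id, responses)
    # Assumption: a single-partition transaction is decided locally.
    outcome = "COMMITTED" if responses[0] else "ROLLED_BACK"
    partitions[0].finish(txn_id, outcome)
    return outcome
```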
Abstract:
A system and method of query processing in a multi-level storage system having a unified table architecture are disclosed. A query is received by a common query execution engine connected with the unified table architecture, the query specifying a data record. The common query execution engine performs a look-up for the data record based on the query at a first level storage structure. If the data record is not present at the first level storage structure, the common query execution engine performs separate look-ups in each of a second level storage structure and a main store.
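The look-up order can be sketched in a few lines of Python. The dictionaries standing in for the three storage levels and the function name `lookup` are hypothetical, and the rule for combining the two fallback look-ups (prefer the second level hit over the main store hit) is an assumption the abstract does not state.

```python
def lookup(key, first_level, second_level, main_store):
    """Look up a record in the first level, then separately in the other two levels."""
    if key in first_level:
        return first_level[key]
    # Not in the first level: perform separate look-ups in the remaining levels.
    second_hit = second_level.get(key)
    main_hit = main_store.get(key)
    # Assumption: the second level holds newer data than the main store.
    return second_hit if second_hit is not None else main_hit
```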
Abstract:
A method for compressing columnar data may include generating, for a data column included in a data chunk, a dictionary enumerating, in a sorted order, a set of unique values included in the data column. A compression technique for generating a compressed representation of the data column having a fewest quantity of bytes may be identified based at least on the dictionary. The compression technique may include a dictionary compression applying the dictionary and/or another compression technique. A compressed data chunk may be generated by applying the compression technique to compress the data column included in the data chunk. The compressed data chunk may be stored at a database in a variable-size persistent page whose size is allocated based on the size of the compressed representation of the data column. Related systems and articles of manufacture are also provided.
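The selection step can be illustrated with the Python sketch below, which builds a sorted dictionary and compares estimated sizes for two candidate encodings. The candidates (bit-packed dictionary IDs, and run-length encoding over those IDs), the size formulas, and all function names are assumptions chosen for illustration. The sketch assumes a non-empty column and ignores the size of the dictionary itself, which is the same for both candidates.

```python
import math


def dictionary_encode(column):
    """Return the sorted dictionary of unique values and the column as value IDs."""
    dictionary = sorted(set(column))
    index = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in column]


def run_length_encode(ids):
    """Collapse consecutive repeats into [value_id, run_length] pairs."""
    runs = []
    for value_id in ids:
        if runs and runs[-1][0] == value_id:
            runs[-1][1] += 1
        else:
            runs.append([value_id, 1])
    return runs


def bit_width(max_value):
    """Bits needed to store integers in the range 0..max_value (at least 1)."""
    return max(1, max_value.bit_length())


def estimate_sizes(dictionary, ids):
    """Estimated byte size of each candidate encoding of the value IDs."""
    id_bits = bit_width(len(dictionary) - 1)
    runs = run_length_encode(ids)
    count_bits = bit_width(max(count for _, count in runs))
    return {
        "dictionary": math.ceil(len(ids) * id_bits / 8),
        "dictionary+rle": math.ceil(len(runs) * (id_bits + count_bits) / 8),
    }


def choose_technique(column):
    """Pick the candidate with the fewest estimated bytes."""
    dictionary, ids = dictionary_encode(column)
    sizes = estimate_sizes(dictionary, ids)
    best = min(sizes, key=sizes.get)
    return best, sizes[best]  # the byte count could size a variable-size page
```

A column with long runs favours the run-length candidate, while a column that alternates between values favours plain bit-packed dictionary IDs.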
Abstract:
A method for caching partial data pages to support optimized transactional processing and analytical processing with minimal memory footprint may include loading, from disk to memory, a portion of a data page. The memory may include a first cache for storing partial data pages and a second cache for storing full data pages. The portion of the data page may be loaded into the first cache. A data structure may be updated to indicate that the portion of the data page has been loaded into the first cache. When the data structure indicates that the data page has been loaded into the first cache in its entirety, the data page may be transferred from the first cache to the second cache. One or more queries may be performed using at least the portion of the data page loaded into the memory. Related systems and articles of manufacture are also provided.
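The two-cache arrangement can be sketched in Python as follows. The class `PageCache`, the fixed number of parts per page, and the use of a set of loaded part numbers as the tracking data structure are assumptions for illustration; the abstract does not say how the tracking structure is organised.

```python
class PageCache:
    """Sketch of a partial-page cache that promotes complete pages to a full cache."""

    def __init__(self, disk, parts_per_page):
        self.disk = disk                  # hypothetical disk: page_id -> list of parts
        self.parts_per_page = parts_per_page
        self.partial = {}                 # first cache: page_id -> {part_no: data}
        self.full = {}                    # second cache: page_id -> list of parts
        self.loaded = {}                  # tracking structure: page_id -> set of part numbers

    def load_part(self, page_id, part_no):
        if page_id in self.full:
            return self.full[page_id][part_no]

        parts = self.partial.setdefault(page_id, {})
        if part_no not in parts:
            # Load only the requested portion from disk and record that it is present.
            parts[part_no] = self.disk[page_id][part_no]
            self.loaded.setdefault(page_id, set()).add(part_no)
        data = parts[part_no]

        # Once every part has been loaded, move the page to the full-page cache.
        if len(self.loaded[page_id]) == self.parts_per_page:
            self.full[page_id] = [parts[i] for i in range(self.parts_per_page)]
            del self.partial[page_id]
            del self.loaded[page_id]
        return data
```

Queries that touch only one portion of a page leave it in the partial cache, so memory holds just the portions that were actually read until the page becomes complete.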