摘要:
Described is an approach for recovering the failure of a transaction. According to the approach, a first change and a third change is made to a first resource and a second change is made to a second resource. The second change was made after the first but before the third. However, to recover the failure of the transaction, a recovery record for the third change is applied before the recovery record for the second change. Also described is an approach involving establishing links that link together a set of undo records that contain changes made to the particular resource. Also described is an approach for applying two or more undo records in parallel.
摘要:
An array update operation which specifies number of (row-identifier, value) pairs for updating rows in a table of a database is implemented as follows. A block-identifier of a block (on disk) that holds a row identified by a row-identifier in a specified pair is looked up using a database index, and the block-identifier thus found is stored in a structure. Use of a row-identifier to look up the corresponding block-identifier, and the storage of the block-identifier in the structure are repeatedly performed, for each of several specified pairs. Next, a vector read is performed, to read and store in a cache, each block identified by a block-identifier in the structure, and all the blocks that have been read are stored in the cache during a single function call. Thereafter, rows identified in specified pairs are modified, in blocks currently in the cache, using the values in the specified pairs.
摘要:
Techniques are provided for providing a data item to a transaction in a multi-versioning system in which the data item may exist on multiple versions of a data block, and were versioning is performed at the granularity of the data block. According to one aspect of the invention, the technique involves locating, within volatile memory, a first version of a data block that includes a first version of the data item. It is then determined whether the first version of the data item is useable by the transaction without respect to whether the first version of the data block is generally useable by the transaction. If the first version of the data item is usable by the transaction, then the data item is established as a candidate that can be provided to the transaction. Thus, the data item within a block may be considered a candidate to be provided to a transaction even when the version of the data block on which the data item resides would otherwise disqualify the data block from being seen by that transaction. If the first version of the data item is not usable by the transaction, then a version of the data item that is usable by the transaction is obtained from a second version of the data block that is different from the first version.
摘要:
A distributed object is hosted in a distributed system by invoking a global hash function on a generated name for the distributed object. The name for the distributed object is generated with knowledge of the global hash function so that the result of invoking the global hash function upon the name results in the selection of a predetermined node in the distributed system. The predetermined node may be selected based on the affinity of the node to the distributed object for reducing network messaging and communications overhead.
摘要:
Disclosed are methods, systems, and computer program products for processing a file which include using a computer system that is programmed for performing a process of receiving the file in response to a request for storing the file, determining whether a database already contains the file, and storing the file in the database if the database does not already contain the file. The process may alternatively include receiving the file in response to a request for storing the file, determining whether a database already contains the file, and storing the file without storing the received file if the database already contains the file. The process may also alternatively include receiving the file in response to a request for storing the file in a database, separating the file into a plurality of portions, and storing the plurality of portions so each of the plurality of portions can be individually accessed.
摘要:
For automatic data placement of database data, a plurality of access-tracking data is maintained. The plurality of access-tracking data respectively corresponds to a plurality of data rows that are managed by a database server. While the database server is executing normally, it is automatically determined whether a data row, which is stored in first one or more data blocks, has been recently accessed based on the access-tracking data that corresponds to that data row. After determining that the data row has been recently accessed, the data row is automatically moved from the first one or more data blocks to one or more hot data blocks that are designated for storing those data rows, from the plurality of data rows, that have been recently accessed.
摘要:
A method and apparatus is provided for optimizing queries received by a database system that relies on an intelligent data storage server to manage storage for the database system. Storing compression units in hybrid columnar format, the storage manager evaluates simple predicates and only returns data blocks containing rows that satisfy those predicates. The returned data blocks are not necessarily stored persistently on disk. That is, the storage manager is not limited to returning disc block images. The hybrid columnar format enables optimizations that provide better performance when processing typical database workloads including both fetching rows by identifier and performing table scans.
摘要:
Techniques are described herein for automatically selecting the compression techniques to be used on tabular data. A compression analyzer gives users high-level control over the selection process without requiring the user to know details about the specific compression techniques that are available to the compression analyzer. Users are able to specify, for a given set of data, a “balance point” along the spectrum between “maximum performance” and “maximum compression”. The point thus selected is used by the compression analyzer in a variety of ways. For example, in one embodiment, the compression analyzer uses the user-specified balance point to determine which of the available compression techniques qualify as “candidate techniques” for the given set of data. The compression analyzer selects the compression technique to use on a set of data by actually testing the candidate compression techniques against samples from the set of data. After testing the candidate compression techniques against the samples, the resulting compression ratios are compared. The compression technique to use on the set of data is then selected based, in part, on the compression ratios achieved during the compression tests performed on the sample data.
摘要:
Described herein are compression and processing optimizations by using data transformation techniques. In example embodiments, a byte-wise differential transformation is applied to columnar data represented as a list of length-value pairs to determine a list of delta pairs that is subsequently compressed and stored on persistent storage. A length separation transformation is applied to separate a list of length-value pairs into a length array and a corresponding data value array, where these two arrays are subsequently compressed and stored separately on persistent storage. A native number transformation is applied to a set of number values to remove the lengths stored in the number values, where the transformed set is stored on persistent storage instead of the original set of number values. A native datetime-type transformation is applied to a set of datetime values to generate an encoding that is used to encode the set of datetime values into an encoded set that is stored on persistent storage instead of the original set.
摘要:
A highly flexible and extensible structure is provided for physically storing tabular data. The structure, is referred to as a compression unit, and may be used to physically store tabular data that logically resides in any type of table-like structure. According to one embodiment, compression units are recursive. Thus, a compression unit may have a “parent” compression unit to which it belongs, and may have one or more “child” compression units that belong to it. In one embodiment, compression units include metadata that indicates how the tabular data is stored within them. The metadata for a compression unit may indicate, for example, whether the data within the compression unit is stored in row-major or column major-format (or some combination thereof), the order of the columns within the compression unit (which may differ from the logical order of the columns dictated by the definition of their logical container), a compression technique for the compression unit, the child compression units (if any), etc.