Abstract:
The systems, methods, devices, and non-transitory media of the various embodiments enable query execution plan graphs to be compared to determine whether all or portions of two or more queries define data sets that are structurally equivalent. Two data sets may be structurally equivalent when each data set may be composed with a bijective relation that yields the other. In the various embodiments, when all or a portion of a first query that has been previously run defines a data set that is structurally equivalent to a data set defined by all or a portion of a second query that is to be run, a structure-preserving transform may be applied to the corresponding portion of the second query to transform that portion of the second query into the corresponding portion of the first query, thereby allowing the results from previously running the first query to be reused.
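As a minimal illustration of the idea, the sketch below handles only one narrow case: two projection plans over the same input that differ solely in their output column aliases. The tuple-based plan encoding and the names `find_renaming` and `reuse_rows` are assumptions made for this example and do not come from the abstract, which covers general plan graphs.

```python
def find_renaming(plan_a, plan_b):
    """Return a mapping alias_in_a -> alias_in_b when both plans are
    projections of the same child over the same source columns and the
    aliases form a bijection; otherwise return None.

    Hypothetical plan encoding: ("project", child, ((source, alias), ...)).
    """
    if len(plan_a) != 3 or len(plan_b) != 3:
        return None
    op_a, child_a, cols_a = plan_a
    op_b, child_b, cols_b = plan_b
    if op_a != "project" or op_b != "project" or child_a != child_b:
        return None
    if [src for src, _ in cols_a] != [src for src, _ in cols_b]:
        return None
    mapping = {alias_a: alias_b
               for (_, alias_a), (_, alias_b) in zip(cols_a, cols_b)}
    # The renaming must be one-to-one in both directions.
    if len(mapping) != len(cols_a) or len(set(mapping.values())) != len(cols_b):
        return None
    return mapping


def reuse_rows(cached_rows, mapping):
    """Relabel rows cached for the first query with the second query's aliases."""
    return [{mapping[key]: value for key, value in row.items()}
            for row in cached_rows]


first = ("project", ("scan", "orders"), (("id", "order_id"), ("total", "amount")))
second = ("project", ("scan", "orders"), (("id", "oid"), ("total", "sum")))
renaming = find_renaming(first, second)
reused = reuse_rows([{"order_id": 1, "amount": 9.5}], renaming)
```

Here the cached result of `first` answers `second` without running it again; a real comparison would have to walk whole plan graphs rather than a single projection node.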
Abstract:
A data management device (10) is provided, comprising a control module (100) and a storage module (110), wherein the storage module (110) is configured to store a plurality of data sets in a plurality of data set groups such that every data set group comprises at least one data set and every data set is stored in one data set group only, and wherein the control module (100) is configured to assign an exclusive execution context to each data set group and to estimate a number of data set requests for every data set. The control module (100) is further configured to determine data sets for every data set group based on the estimated number of data set requests and to reassign the data sets to the data set groups such that the estimated number of data set requests in a data set group is less than or equal to a predetermined number of data set requests for at least one exclusive execution context which is assigned to one of the plurality of data set groups. Thus, a skewed workload across the data set groups and the assigned exclusive execution contexts can be avoided.
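The reassignment step could be sketched as follows. The abstract does not say how the control module chooses groups, so the greedy "heaviest data set into the lightest group" heuristic, the function name `rebalance`, and the sample request counts are all assumptions for illustration.

```python
from typing import Dict, List


def rebalance(estimated_requests: Dict[str, int], num_groups: int) -> List[List[str]]:
    """Assign every data set to exactly one group, leaving no group empty.

    Greedy heuristic (an assumption, not the patented method): visit data
    sets from most to least requested and place each in the group with the
    lowest estimated load so far, preferring emptier groups on ties.
    """
    if not 0 < num_groups <= len(estimated_requests):
        raise ValueError("need between 1 and len(estimated_requests) groups")
    groups: List[List[str]] = [[] for _ in range(num_groups)]
    loads = [0] * num_groups
    ordered = sorted(estimated_requests.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ordered:
        target = min(range(num_groups), key=lambda g: (loads[g], len(groups[g])))
        groups[target].append(name)
        loads[target] += count
    return groups


estimates = {"a": 90, "b": 60, "c": 50, "d": 40, "e": 10}
groups = rebalance(estimates, 2)
group_loads = [sum(estimates[name] for name in group) for group in groups]
```

A caller would then compare each entry of `group_loads` against the predetermined request limit of the execution context assigned to that group and rerun the assignment when estimates change.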
Abstract:
A method for copying values of a table of a database between a primary memory and a secondary memory is provided, wherein the table is organized in a plurality of stripes and a plurality of vertical partitions, wherein a stripe comprises at least two rows of the table and a vertical partition comprises one or more columns of the table, wherein the table is stored as a plurality of segments, wherein a segment comprises values at a cross-section of a stripe and a vertical partition, and wherein a segment stores adjacent column values in adjacent locations of the primary or the secondary memory, the method comprising a step of selecting one or more segments and copying the one or more selected segments between the primary memory and the secondary memory.
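The layout described above can be sketched as follows, reading "adjacent column values in adjacent locations" as column-major order within each segment. That reading, the dictionaries standing in for primary and secondary memory, and the names `build_segments` and `copy_segments` are assumptions for this example.

```python
def build_segments(table, stripe_height, partitions):
    """Split a row-oriented table into segments.

    table: list of rows (each row a list of values).
    stripe_height: rows per stripe (at least two, per the abstract).
    partitions: list of column-index lists, one per vertical partition.
    Returns {(stripe_index, partition_index): flat list of values}, with the
    values of each column stored contiguously inside a segment.
    """
    segments = {}
    for stripe, start in enumerate(range(0, len(table), stripe_height)):
        rows = table[start:start + stripe_height]
        for part, columns in enumerate(partitions):
            segments[(stripe, part)] = [row[c] for c in columns for row in rows]
    return segments


def copy_segments(source, destination, keys):
    """Copy only the selected segments from one memory to the other."""
    for key in keys:
        destination[key] = list(source[key])


table = [[1, "a", 1.0], [2, "b", 2.0], [3, "c", 3.0], [4, "d", 4.0]]
secondary = build_segments(table, stripe_height=2, partitions=[[0], [1, 2]])
primary = {}
copy_segments(secondary, primary, [(0, 1), (1, 0)])
```

Only the two selected segments reach `primary`; the remaining cross-sections of the table stay in `secondary` until they are requested.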
Abstract:
A parent record is created, and the parent record includes a cache for children. Child records are created, and each child record belongs to a parent. Responsive to the creation or update of a child record, the parent record's cache is invalidated. To rebuild the parent record's cache, the child records are serialized and written into the parent record's cache. During a read operation, the parent record is read, including the parent record's cache of children, in a single database access. This results in a substantial savings of time as compared to retrieving the parent and the children from the database separately. Where the number of reads of the parent record greatly exceeds the number of changes to child records, serializing child associations in parent records enhances the efficiency of database access.
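A minimal in-memory sketch of this pattern is shown below. The `Store` class, its dictionaries standing in for database tables, the JSON serialization, and the choice to rebuild the cache lazily on the next read are all assumptions; the abstract does not specify when the rebuild happens or which serialization format is used.

```python
import json


class Store:
    """Toy stand-in for a database holding parent and child records."""

    def __init__(self):
        self.parents = {}    # parent_id -> {"name": str, "children_cache": str or None}
        self.children = {}   # child_id -> {"parent_id": ..., "data": ...}

    def create_parent(self, parent_id, name):
        self.parents[parent_id] = {"name": name, "children_cache": None}

    def save_child(self, child_id, parent_id, data):
        """Create or update a child and invalidate the parent's cache."""
        self.children[child_id] = {"parent_id": parent_id, "data": data}
        self.parents[parent_id]["children_cache"] = None

    def rebuild_cache(self, parent_id):
        """Serialize the parent's children into the parent record."""
        kids = [{"id": cid, "data": child["data"]}
                for cid, child in sorted(self.children.items())
                if child["parent_id"] == parent_id]
        self.parents[parent_id]["children_cache"] = json.dumps(kids)

    def read_parent(self, parent_id):
        """Return the parent's name and children from the parent record alone."""
        if self.parents[parent_id]["children_cache"] is None:
            self.rebuild_cache(parent_id)   # assumed: lazy rebuild on read
        record = self.parents[parent_id]
        return record["name"], json.loads(record["children_cache"])


store = Store()
store.create_parent(1, "order")
store.save_child(10, 1, "x")
store.save_child(11, 1, "y")
name, kids = store.read_parent(1)
```

Once the cache is warm, repeated calls to `read_parent` touch only the parent record, which is where the savings come from when reads greatly outnumber child updates.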
Abstract:
A consistent user view system. The system incorporates any changes made by a user in any views shown to that user even when the changes have not propagated to the partitions supplying the view. The system separates the authority for edits from the replicated storage allowing efficient transactions and linear scalability. Documents are read from view-based partitions of a store. Document writes are written to a document-specific partition in a journal and applied to the store. The system stores a copy of pending changes in a user-specific partition. When a user requests a view, the system checks that user's cache for any pending changes applicable to the view. If any applicable changes are found, the changes are applied before showing the view to the user. Pending changes that have been successfully applied to the store are trimmed from the user-specific partition to free up resources.
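The read path described above might look like the following sketch. It collapses the partitioned journal and store into single in-memory structures, and the class name `ConsistentViewStore`, its method names, and the sequence-number scheme are assumptions for illustration only.

```python
class ConsistentViewStore:
    """Toy model: one journal, one lagging store, per-user pending changes."""

    def __init__(self):
        self.store = {}       # doc_id -> fields; may lag behind the journal
        self.journal = []     # (seq, doc_id, fields) in write order
        self.pending = {}     # user -> list of (seq, doc_id, fields)
        self.applied_seq = 0  # highest journal entry applied to the store
        self.next_seq = 1

    def write(self, user, doc_id, fields):
        """Record a write in the journal and in the user's pending changes."""
        seq = self.next_seq
        self.next_seq += 1
        self.journal.append((seq, doc_id, fields))
        self.pending.setdefault(user, []).append((seq, doc_id, fields))
        return seq

    def propagate(self, up_to_seq):
        """Apply journal entries to the store, then trim applied pending changes."""
        for seq, doc_id, fields in self.journal:
            if self.applied_seq < seq <= up_to_seq:
                self.store.setdefault(doc_id, {}).update(fields)
        self.applied_seq = max(self.applied_seq, up_to_seq)
        for user, changes in self.pending.items():
            self.pending[user] = [c for c in changes if c[0] > self.applied_seq]

    def view(self, user, doc_ids):
        """Read from the store, then overlay the user's own pending changes."""
        wanted = set(doc_ids)
        result = {d: dict(self.store[d]) for d in wanted if d in self.store}
        for _, doc_id, fields in self.pending.get(user, []):
            if doc_id in wanted:
                result.setdefault(doc_id, {}).update(fields)
        return result


views = ConsistentViewStore()
seq = views.write("alice", "doc1", {"title": "Draft"})
alice_before = views.view("alice", ["doc1"])
bob_before = views.view("bob", ["doc1"])
views.propagate(seq)
bob_after = views.view("bob", ["doc1"])
```

Before propagation only the writer sees the change; afterwards every user does, and the writer's pending list is trimmed.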
Abstract:
A system, method, and non-transitory computer readable medium for retrieving data in response to an input query are provided. The system includes a first data server configured to generate a query based on the input query and a second data server configured to generate another query based on the input query. The system also includes a master server in communication with the data servers. The method involves generating queries based on the input query and sending the queries to data sources. The method also involves determining whether data records represent the same external entity and combining those data records for storage. The non-transitory computer readable medium is encoded with codes to direct a processor to carry out the method.
Abstract:
Example data management systems and methods are described. In one implementation, a method identifies multiple files to process based on a received query and identifies multiple execution nodes available to process the multiple files. The method initially creates multiple scansets, each including a portion of the multiple files, and assigns each scanset to one of the execution nodes based on a file assignment model. The multiple scansets are processed by the multiple execution nodes. If the method determines that a particular execution node has finished processing all files in its assigned scanset, an unprocessed file is reassigned from another execution node to the particular execution node.
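The assignment and reassignment steps can be simulated as in the sketch below. Round-robin stands in for the file assignment model, which the abstract does not specify, and the tick-based simulation, the per-node speeds, and the function names are likewise assumptions.

```python
from collections import deque


def make_scansets(files, nodes):
    """Initial assignment: round-robin placeholder for the file assignment model."""
    scansets = {node: deque() for node in nodes}
    for index, name in enumerate(files):
        scansets[nodes[index % len(nodes)]].append(name)
    return scansets


def steal(scansets, idle_node):
    """Take one unprocessed file from the busiest other node, or return None."""
    donor = max((n for n in scansets if n != idle_node),
                key=lambda n: len(scansets[n]), default=None)
    if donor is None or not scansets[donor]:
        return None
    return scansets[donor].pop()   # take from the tail of the donor's scanset


def run(files, speeds):
    """Simulate processing; speeds maps node -> files handled per tick."""
    nodes = list(speeds)
    scansets = make_scansets(files, nodes)
    processed = {node: [] for node in nodes}
    while any(scansets.values()):
        for node in nodes:
            for _ in range(speeds[node]):
                if not scansets[node]:
                    stolen = steal(scansets, node)
                    if stolen is None:
                        break
                    scansets[node].append(stolen)
                processed[node].append(scansets[node].popleft())
    return processed


files = [f"f{i}" for i in range(8)]
processed = run(files, {"fast": 3, "slow": 1})
```

With these assumed speeds the fast node finishes its own four files and then takes two more from the slow node, so the slow node does not become the bottleneck.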