Abstract:
A method that comprises receiving a logical execution plan for a database query corresponding to a plurality of tables of the database, wherein the logical execution plan comprises one or more operators, receiving an operator cost for each of the operators in the logical execution plan, computing a first accumulated processing cost for a first of the tables based on the logical execution plan, operator selectivity, and operator costs corresponding to the first table, computing a second accumulated processing cost for a second of the tables based on the logical execution plan, operator selectivity, and operator costs corresponding to the second table, comparing the first accumulated processing cost and the second accumulated processing cost to determine a table with the highest accumulated processing cost, and responsive to comparing the accumulated processing costs, computing a physical execution plan that requires partitioning the table with the highest accumulated processing cost.
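The cost comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Operator` record, the per-row cost model, the way selectivity shrinks the row count between operators, and the sample plan are all hypothetical and are not taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    cost_per_row: float  # assumed unit cost of the operator per input row
    selectivity: float   # assumed fraction of input rows that survive


def accumulated_cost(base_rows, operators):
    """Sum operator costs along one table's path through the logical plan."""
    rows = float(base_rows)
    total = 0.0
    for op in operators:
        total += rows * op.cost_per_row  # cost of processing the current rows
        rows *= op.selectivity           # fewer rows reach the next operator
    return total


def table_to_partition(plan, base_rows):
    """Return the table with the highest accumulated cost, plus all costs."""
    costs = {t: accumulated_cost(base_rows[t], ops) for t, ops in plan.items()}
    return max(costs, key=costs.get), costs


# Hypothetical two-table plan: each table is scanned, then joined.
plan = {
    "orders": [
        Operator("scan", 1.0, 1.0),
        Operator("filter", 0.5, 0.1),
        Operator("join", 2.0, 1.0),
    ],
    "customers": [
        Operator("scan", 1.0, 1.0),
        Operator("join", 2.0, 1.0),
    ],
}
base_rows = {"orders": 1000, "customers": 100}

chosen, costs = table_to_partition(plan, base_rows)
print(chosen, costs)
```

In this toy plan the physical plan would partition `orders`, since it carries the larger accumulated cost.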
Abstract:
A method for dynamically building a column store database from a row store database. The method includes establishing the row store database for storing data, wherein each row includes a plurality of attributes, and wherein data in the row store database is current to a temporal point in time. The method includes establishing the column store database including data structured to satisfy received analytic queries. The method includes, beginning from an initial state of the column store database and for each subsequently received analytic query, importing a targeted amount of data from a corresponding temporal state of the row store database into the column store database to satisfy the corresponding subsequently received analytic query.
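The on-demand import can be pictured with a small sketch. It assumes, purely for illustration, that the row store is a dictionary keyed by primary key, that the column store holds one vector per attribute, and that "targeted" means importing only the attribute values a query needs and that are not yet present. The function name `ensure_loaded` and the sample data are hypothetical.

```python
def ensure_loaded(column_store, row_store, columns, keys):
    """Import only the missing (column, key) cells needed by one query."""
    imported = 0
    for col in columns:
        vector = column_store.setdefault(col, {})  # one vector per attribute
        for key in keys:
            if key not in vector:
                vector[key] = row_store[key][col]
                imported += 1
    return imported


# Hypothetical row store: primary key -> row of attributes.
row_store = {
    1: {"price": 10, "qty": 2},
    2: {"price": 20, "qty": 1},
    3: {"price": 30, "qty": 5},
}
column_store = {}  # initial state: nothing imported yet

# First analytic query touches price for keys 1 and 2.
first = ensure_loaded(column_store, row_store, ["price"], [1, 2])
# Second query touches price and qty for keys 2 and 3; price[2] is reused.
second = ensure_loaded(column_store, row_store, ["price", "qty"], [2, 3])
print(first, second, column_store)
```

The sketch ignores the temporal aspect of the abstract: it reads whatever the row store currently holds rather than a specific temporal state.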
Abstract:
A method of dynamically computing an optimal materialization schedule for each column in a column oriented RDBMS. Dynamic column-specific materialization scheduling in a distributed column oriented RDBMS is optimized by choosing a materialization strategy based on execution cost including central processing unit (CPU), disk, and network costs for each individual exchange operator. The dynamic programming approach is computationally feasible because the optimal schedule for a sub-plan is path independent.
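A heavily simplified sketch of the dynamic programming idea follows. It assumes a single column passing through a linear chain of exchange operators, with one combined cost figure per choice (standing in for CPU, disk, and network costs) and a requirement that the column be materialized by the end of the chain. The cost lists and the function name are hypothetical; the abstract does not specify a cost model.

```python
def best_materialization_point(defer, carry, materialize):
    """Pick where to materialize one column along a chain of exchanges.

    defer[i]       assumed cost of passing exchange i unmaterialized
    carry[i]       assumed cost of passing exchange i materialized
    materialize[k] assumed cost of materializing just before exchange k
                   (k == len(defer) means after the last exchange)
    """
    n = len(defer)
    if len(carry) != n or len(materialize) != n + 1:
        raise ValueError("cost lists have inconsistent lengths")

    # Cheapest cost of reaching the current point in each state. Only these
    # two numbers are kept, because the best continuation does not depend on
    # how a state was reached (path independence).
    unmat = 0.0
    mat = float(materialize[0])
    mat_at = 0
    for i in range(n):
        unmat_next = unmat + defer[i]
        mat_next = mat + carry[i]
        candidate = unmat_next + materialize[i + 1]  # materialize after i
        if candidate < mat_next:
            mat_next, mat_at = candidate, i + 1
        unmat, mat = unmat_next, mat_next
    return mat, mat_at


# Hypothetical costs for three exchange operators.
cost, point = best_materialization_point(
    defer=[1, 1, 1],
    carry=[5, 5, 2],
    materialize=[3, 3, 3, 10],
)
print(cost, point)
```

With these numbers the cheapest schedule defers the column through the first two exchanges and materializes it just before the third.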
Abstract:
A request to perform a transaction on a database in an online transaction processing system is accessed by a node. The sets of data in the database that the transaction is to act on are determined. The transaction is then separated into actions according to the data dependencies of the actions; an action is established for each set of data that is acted on by the transaction. The actions are communicated to the nodes that store the data that the respective actions depend on. The actions are then performed on the nodes to which they were routed.
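The separation and routing steps can be sketched as two small functions. The key format (`"table:id"`), the partition-to-node map, and the function names are assumptions made for this illustration only.

```python
from collections import defaultdict


def partition_of(key):
    """Assumed convention: the data set is the part of the key before ':'."""
    return key.split(":")[0]


def split_into_actions(operations):
    """Group a transaction's operations into one action per data set."""
    actions = defaultdict(list)
    for key, op in operations:
        actions[partition_of(key)].append((key, op))
    return dict(actions)


def route_actions(actions, node_of_partition):
    """Assign each action to the node that stores its data set."""
    routed = defaultdict(list)
    for partition, ops in actions.items():
        routed[node_of_partition[partition]].append((partition, ops))
    return dict(routed)


# Hypothetical placement of data sets on nodes.
node_of_partition = {"accounts": "node-a", "orders": "node-b"}

transaction = [
    ("accounts:17", "debit 40"),
    ("orders:903", "insert"),
    ("accounts:42", "credit 40"),
]

actions = split_into_actions(transaction)
routed = route_actions(actions, node_of_partition)
print(routed)
```

Each node then executes only the actions it received, so no action needs data held elsewhere.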
Abstract:
Various disclosed embodiments include methods and systems for managing lock or latch chains in concurrent execution of database queries. A method includes receiving a plurality of transactions, each transaction associated with one or more queuing requests. The method includes, for each transaction, determining one or more partition sets. Each partition set corresponds to one or more database partitions needed for the transaction. The one or more database partitions are included within a partitioned database. The method includes, for each transaction, determining one or more queues needed for the transaction and storing a bitmap representation of the one or more queues needed for the transaction. The one or more queues needed for the transaction correspond to the one or more database partitions needed for the transaction.
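One way to picture the bitmap representation is to give each queue a bit position and set the bits for the queues a transaction needs. The partition-to-queue mapping below is hypothetical, and the overlap test using bitwise AND is an assumption about how such bitmaps might be used; the abstract does not describe it.

```python
def queue_bitmap(partitions, queue_of):
    """Encode the queues needed by a transaction as bits of an integer."""
    bitmap = 0
    for partition in partitions:
        bitmap |= 1 << queue_of[partition]  # set the bit for this queue
    return bitmap


def may_conflict(bitmap_a, bitmap_b):
    """Assumed use: two transactions overlap if they share any queue bit."""
    return (bitmap_a & bitmap_b) != 0


# Hypothetical one-to-one mapping of partitions to queue indexes.
queue_of = {"p0": 0, "p1": 1, "p2": 2, "p3": 3}

t1 = queue_bitmap({"p0", "p2"}, queue_of)
t2 = queue_bitmap({"p1", "p3"}, queue_of)
t3 = queue_bitmap({"p2", "p3"}, queue_of)
print(bin(t1), bin(t2), bin(t3))
```

An integer bitmap keeps the per-transaction record compact and makes set operations over queues a single machine instruction for small queue counts.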
Abstract:
A method for updating a column store database includes establishing a row store database, wherein each row comprises a plurality of attributes. The method includes establishing a column store database including attribute vectors corresponding to at least one attribute in the row store database, wherein each attribute vector includes data used to satisfy at least one previously received analytic query. The method includes collecting a plurality of SQL change statements beginning from a synchronization point indicating when the row store database and the column store database are synchronized, and continuing until an analytic query is received. The method includes sending the plurality of SQL change statements to the column store database upon receipt of the analytic query for updating the column store database for purposes of satisfying the query, wherein the analytic query is directed to a queried range of primary key attributes in the plurality of attributes.
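The collect-then-send behaviour can be sketched as a small buffer. The `ChangeLog` class, its method names, and the callback that stands in for the column store are hypothetical. The sketch forwards every buffered statement and does not filter by the queried primary key range.

```python
class ChangeLog:
    """Buffer SQL change statements between synchronization points."""

    def __init__(self, apply_to_column_store):
        self._pending = []                   # statements since the last sync
        self._apply = apply_to_column_store  # assumed column store hook

    def record(self, sql):
        """Called for each change statement executed on the row store."""
        self._pending.append(sql)

    def on_analytic_query(self):
        """Send the buffered batch; this becomes the new sync point."""
        batch, self._pending = self._pending, []
        self._apply(batch)
        return len(batch)


received = []  # stands in for the column store's update path
log = ChangeLog(received.extend)

log.record("UPDATE items SET price = 12 WHERE id = 1")
log.record("INSERT INTO items (id, price) VALUES (4, 7)")

sent_first = log.on_analytic_query()   # analytic query arrives: flush
sent_second = log.on_analytic_query()  # nothing changed since the sync
print(sent_first, sent_second, received)
```

Deferring the transfer until an analytic query arrives means the column store pays the update cost only when the fresh data is actually needed.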
Abstract:
A method of pipelining re-shuffled data of a distributed column oriented relational database management system (RDBMS). A request is received from a consumer process that requires RDBMS column data to be shuffled in a specific order according to an order that each of a plurality of columns will be used by the consumer process. For each of the plurality of columns, the method re-shuffles the RDBMS column data according to the specific order to form re-shuffled RDBMS column data, and sends the re-shuffled RDBMS column data to the consumer process.
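A generator gives a compact picture of the pipelining: columns are re-shuffled and handed over one at a time, in the order the consumer will use them, so the consumer can start on the first column before the rest are ready. Modelling the re-shuffle as a row permutation, along with the function name and sample data, is an assumption made for this sketch.

```python
def pipeline_columns(columns, use_order, row_order):
    """Yield each column, re-shuffled, in the order the consumer needs it."""
    for name in use_order:
        values = columns[name]
        # Re-shuffle lazily: work for later columns is not done until the
        # consumer asks for them.
        yield name, [values[i] for i in row_order]


# Hypothetical column data and consumer requirements.
columns = {
    "id": [1, 2, 3],
    "city": ["Oslo", "Rome", "Lima"],
    "total": [9.5, 3.0, 7.25],
}
use_order = ["total", "id"]  # consumer reads total first, never reads city
row_order = [2, 0, 1]        # assumed target row order after the shuffle

stream = pipeline_columns(columns, use_order, row_order)
first_column = next(stream)  # available before "id" has been re-shuffled
remaining = list(stream)
print(first_column, remaining)
```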