Abstract:
A method comprises receiving a logical execution plan for a database query that involves a plurality of tables of a database, the logical execution plan comprising one or more operators, and receiving an operator cost for each operator in the logical execution plan. A first accumulated processing cost is computed for a first of the tables based on the logical execution plan, operator selectivity, and the operator costs corresponding to the first table, and a second accumulated processing cost is computed in the same way for a second of the tables. The first and second accumulated processing costs are compared to determine the table with the highest accumulated processing cost, and, responsive to the comparison, a physical execution plan is computed that requires partitioning the table with the highest accumulated processing cost.
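The comparison described above can be illustrated with a minimal sketch. The abstract does not give a cost formula, so the model here is an assumption: each table passes through a chain of operators, each operator charges a per-row cost on the rows it receives, and its selectivity scales the rows handed to the next operator. The names `Operator`, `accumulated_cost`, and `table_to_partition`, as well as the row counts and cost values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Operator:
    name: str
    cost_per_row: float  # assumed unit: abstract cost units per input row
    selectivity: float   # assumed: fraction of input rows passed downstream


def accumulated_cost(row_count: int, operators: List[Operator]) -> float:
    """Sum each operator's cost over the rows that actually reach it."""
    rows = float(row_count)
    total = 0.0
    for op in operators:
        total += rows * op.cost_per_row
        rows *= op.selectivity  # fewer rows reach the next operator
    return total


def table_to_partition(
    plan: Dict[str, Tuple[int, List[Operator]]]
) -> Tuple[str, Dict[str, float]]:
    """Return the table with the highest accumulated cost, plus all costs."""
    costs = {
        table: accumulated_cost(rows, ops) for table, (rows, ops) in plan.items()
    }
    return max(costs, key=costs.get), costs


# Hypothetical logical plan: table name -> (row count, operators applied to it).
plan = {
    "orders": (1_000_000, [Operator("filter", 1.0, 0.25), Operator("join", 5.0, 1.0)]),
    "customers": (50_000, [Operator("scan", 1.0, 1.0), Operator("join", 5.0, 1.0)]),
}
chosen, costs = table_to_partition(plan)
print(chosen, costs)
```

Under these assumed numbers the `orders` table accumulates the larger cost, so a physical plan built from this sketch would partition `orders` and leave `customers` unpartitioned.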
Abstract:
A method of pipelining re-shuffled data in a distributed column-oriented relational database management system (RDBMS). A request is received from a consumer process that requires RDBMS column data to be re-shuffled in a specific order, according to the order in which each of a plurality of columns will be used by the consumer process. For each of the plurality of columns, the method re-shuffles the RDBMS column data according to the specific order to form re-shuffled RDBMS column data and sends the re-shuffled RDBMS column data to the consumer process.
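The per-column pipelining described above might be sketched as a generator that handles one column at a time, in the order the consumer will use them, so the consumer can start on the first column before later ones are re-shuffled. This is only an illustration under assumptions: the abstract does not define the re-shuffle, so it is modeled here as a row permutation (`row_order`) on in-memory lists, and the function and parameter names are hypothetical.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def pipeline_reshuffled_columns(
    columns: Dict[str, List],
    column_order: Sequence[str],
    row_order: Sequence[int],
) -> Iterator[Tuple[str, List]]:
    """Yield (column name, re-shuffled data) in the consumer's usage order.

    Each column is re-shuffled only when the consumer asks for the next item,
    so work on later columns overlaps with consumption of earlier ones.
    """
    for name in column_order:
        source = columns[name]
        # Assumed stand-in for the re-shuffle: reorder rows by a permutation.
        yield name, [source[i] for i in row_order]


# Hypothetical column store and consumer request.
columns = {"id": [10, 20, 30], "price": [1.5, 2.5, 3.5], "qty": [7, 8, 9]}
received = list(
    pipeline_reshuffled_columns(columns, column_order=["price", "id"], row_order=[2, 0, 1])
)
print(received)
```

Only the requested columns are produced, and `price` is sent before `id` because the consumer declared it would use `price` first.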