摘要:
Tables in a distributed database can require redistribution, for example to provide improved collocation of tables or table partitions that require joining at a node of multiple nodes across which the distributed database is distributed. Based at least in part on a set of table redistribution parameters, a table redistribution plan can be generated to include redistribution of a table from a first node to a second node. The set of table redistribution parameters can include a grouping parameter indicating at least one other table with which the table should be collocated. The table redistribution plan can be executed to cause the moving of the table from the first node to the second node.
摘要:
A computer-implemented system and method for performing distinct operations on multiple tables of shared memory of parallel computing environments are disclosed. A distinct operation is executed on each table of a plurality of tables, each distinct operation eliminating duplicate data from each table, the executing creating a hierarchy of table pairs and distinct results, the distinct results comprising a reduced row set for each table. Duplicates on each reduced row set are detected to complete the distinct operation on the plurality of tables.
摘要:
Partitioning of a source table of a database to a target table is initiated. Thereafter, a replay table is generated that is populated with triggers for database operations performed on the source table for subsequent replay for the target partitions. Data is later moved (e.g., asynchronously moved, etc.) from the source table to the target table. The database operations are replayed on the target table T subsequent to the moving of the data using the replay table. In addition, the source table is dropped when all of the data has been moved to the target table and there are no operations requiring replay. Related apparatus, systems, techniques and articles are also described.
摘要:
An insertion of a record into a table that includes a primary key column and a second column that includes a global uniqueness constraint across all of a plurality of data partitions across which the table is split is initiated without checking that a value of the record in the second column is globally unique by contacting other partitions the one partition to which the record is to be added to. The insertion can be processed, at least in part by implementing a write lock on the one partition but without implementing a read lock on the other partitions. The write lock on the one partition can be released after the insertion is completed, after which the validity of the insertion can be verified, for example by examining the other parts and a delta partition corresponding to the table. The insertion can be undone if the insertion was not valid.
摘要:
A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
摘要:
A table creation request pertaining to a table in a database maintained on a multi-node data partitioning landscape that comprises a plurality of processing nodes can specify a number of partitions to be generated. At run time, a currently available number of processing nodes in the multi-node data partitioning landscape can be queried, and this currently available number of processing nodes can be compared with the specified number of partitions to be generated for the created table. The table can be generated with the specified number of partitions such that the generated partitions are located across the plurality of partitions according to a load balancing approach if the number of processing nodes equals the number of partitions to be generated or according to other information in the table request if the number of processing nodes does not equal the specified number of partitions.
摘要:
A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
摘要:
Data replication in a database includes identifying a source database system. The source database includes a main index file and a delta log file. To create a replica, one or more symbolic links to the source database system are generated. The symbolic links identify a path to a physical location of the source database. A replica of the source database is generated based on the symbolic links. The replica includes a copy of the main index file and delta log file. Information associated with the replica and the symbolic links is stored in a recovery log. Replica are provided transparently to most database engine components by re-using partitioning infrastructure. Components “see” replica as tables with a single partition; that partition is a local replica.
摘要:
A table creation request pertaining to a table in a database maintained on a multi-node data partitioning landscape that comprises a plurality of processing nodes can specify a number of partitions to be generated. At run time, a currently available number of processing nodes in the multi-node data partitioning landscape can be queried, and this currently available number of processing nodes can be compared with the specified number of partitions to be generated for the created table. The table can be generated with the specified number of partitions such that the generated partitions are located across the plurality of partitions according to a load balancing approach if the number of processing nodes equals the number of partitions to be generated or according to other information in the table request if the number of processing nodes does not equal the specified number of partitions.
摘要:
A computer-implemented system and method for performing distinct operations on multiple tables of shared memory of parallel computing environments are disclosed. A distinct operation is executed on each table of a plurality of tables, each distinct operation eliminating duplicate data from each table, the executing creating a hierarchy of table pairs and distinct results, the distinct results comprising a reduced row set for each table. Duplicates on each reduced row set are detected to complete the distinct operation on the plurality of tables.