Adaptive data repartitioning and adaptive data replication
Abstract:
A method and apparatus for adaptive data repartitioning and adaptive data replication are provided. A data set stored in a distributed data processing system is partitioned by a first partitioning key. A live workload comprising a plurality of data processing commands is processed, and statistical properties of the live workload are maintained while it is processed. Based on these statistical properties with respect to the data set, a determination is made to replicate and/or repartition the data set by a second partitioning key. The resulting replicated and/or repartitioned data set is partitioned by the second partitioning key.
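The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method: it assumes the "statistical properties" are simple counts of how often each column is used as a lookup or join key, and that the decision rule is a fixed dominance threshold. All names (`WorkloadStats`, `choose_partitioning_key`, `partition`, `min_commands`, `threshold`) and the column names are hypothetical and do not come from the source.

```python
from collections import Counter, defaultdict


class WorkloadStats:
    """Hypothetical statistics: counts of key columns seen in the live workload."""

    def __init__(self):
        self.key_counts = Counter()
        self.total = 0

    def record(self, key_column):
        # Called once per processed command with the column it filtered or joined on.
        self.key_counts[key_column] += 1
        self.total += 1


def choose_partitioning_key(stats, current_key, min_commands=100, threshold=0.6):
    """Return a second partitioning key if one column dominates, else the current key.

    The threshold rule is an assumption made for illustration only.
    """
    if stats.total < min_commands:
        return current_key  # too few observations to justify moving data
    candidate, count = stats.key_counts.most_common(1)[0]
    if candidate != current_key and count / stats.total >= threshold:
        return candidate
    return current_key


def partition(rows, key, num_nodes):
    """Hash-partition rows (dicts) across num_nodes by the given key column."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key]) % num_nodes].append(row)
    return dict(parts)


# Data set initially partitioned by a first key ("order_id").
rows = [{"order_id": i, "customer_id": i % 3} for i in range(6)]
by_first_key = partition(rows, "order_id", 3)

# Statistics maintained while the live workload is processed.
stats = WorkloadStats()
for _ in range(80):
    stats.record("customer_id")
for _ in range(20):
    stats.record("order_id")

second_key = choose_partitioning_key(stats, "order_id")
if second_key != "order_id":
    # Repartitioning would replace by_first_key; replication would keep both copies.
    by_second_key = partition(rows, second_key, 3)
```

In this sketch, repartitioning and replication differ only in whether the copy partitioned by the first key is retained alongside the copy partitioned by the second key.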