Dynamic data partitioning for optimal resource utilization in a parallel data processing system
Abstract:
A method, computer program product, and system for dynamically distributing data for parallel processing in a computing system. A data buffer is allocated to each of a plurality of data partitions, where each data buffer stores data to be processed by its corresponding data partition. Data is distributed to the data buffers in multiple rounds for processing by the data partitions. In each round, the data is distributed based on a determined data processing capacity for each data partition, with a greater amount of data distributed to the data partitions with higher determined processing capacities. Usage of each data buffer is periodically monitored, and the determined data processing capacity of each data partition is re-determined based on its corresponding data buffer usage.
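
The round-based scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `Partition`, `distribute_round`, and `redetermine_capacities` are hypothetical, and the re-determination rule (a partition whose buffer is emptier is assumed to be draining faster and is therefore given a higher capacity) is an assumption, since the abstract states only that capacity is re-determined from buffer usage. The sketch also does not enforce the buffer bound or model the processing itself.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One data partition with its own buffer (illustrative names only)."""
    buffer_size: int
    capacity: float = 1.0  # determined processing capacity, used as a relative weight
    buffer: deque = field(default_factory=deque)

    def usage(self) -> float:
        """Fraction of the buffer currently occupied, clamped to 1.0."""
        return min(1.0, len(self.buffer) / self.buffer_size)


def distribute_round(records, partitions):
    """Distribute one round of records in proportion to each partition's capacity."""
    total = sum(p.capacity for p in partitions)
    last = len(partitions) - 1
    start = 0
    cumulative = 0.0
    for i, p in enumerate(partitions):
        cumulative += p.capacity
        # Cutting at cumulative shares guarantees every record is assigned exactly once.
        end = len(records) if i == last else round(len(records) * cumulative / total)
        p.buffer.extend(records[start:end])
        start = end


def redetermine_capacities(partitions, floor=0.1):
    """Assumed rule: lower buffer usage implies faster draining, hence higher capacity.

    The floor keeps a saturated partition from being starved of data entirely.
    """
    for p in partitions:
        p.capacity = max(floor, 1.0 - p.usage())
```

With two partitions weighted 3.0 and 1.0, a round of eight records would place six in the first buffer and two in the second. Calling `redetermine_capacities` periodically between rounds then shifts later rounds toward whichever partition has drained more of its buffer.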