-
1.
Publication No.: US20160253402A1
Publication date: 2016-09-01
Application No.: US14634199
Filing date: 2015-02-27
Applicant: Oracle International Corporation
Inventor: Boris Klots , Vikas Aggarwal , Nipun Agarwal , John Kowtko , Felix Schmidt , Kantikiran Pasupuleti
IPC: G06F17/30
CPC classification number: G06F17/30584
Abstract: A method and apparatus for adaptive data repartitioning and adaptive data replication is provided. A data set stored in a distributed data processing system is partitioned by a first partitioning key. A live workload comprising a plurality of data processing commands is processed. While processing the live workload, statistical properties of the live workload are maintained. Based on the statistical properties of the live workload with respect to the data set, it is determined to replicate and/or repartition the data set by a second partitioning key. The replicated and/or repartitioned data set is partitioned by the second partitioning key.
-
2.
Publication No.: US09813490B2
Publication date: 2017-11-07
Application No.: US14711617
Filing date: 2015-05-13
Applicant: Oracle International Corporation
Inventor: Sam Idicula , Aarti Basant , Vikas Aggarwal , Stephan Wolf , Nipun Agarwal
IPC: G06F15/16 , H04L29/08 , H04L12/40 , G06F17/30 , H04L12/911
CPC classification number: H04L67/10 , G06F17/30545 , G06F17/30584 , H04L12/40143 , H04L47/828
Abstract: A method, apparatus, and system for efficiently re-partitioning data using scheduled network communication are provided. Given re-partitioning data defining the data blocks to be sent amongst a plurality of server nodes, a corresponding network schedule is determined to send the data blocks in a coordinated manner. The network schedule is divided into time slots, wherein each of the plurality of server nodes can send up to one data block and receive up to one data block in each time slot. By using a greedy selection algorithm that prioritizes by largest senders and largest receivers, a near-optimal schedule can be determined even in the presence of heavy skew. The greedy selection algorithm can be implemented with an O(T*N^2) time complexity, enabling scaling to large multi-node clusters with many server nodes. The network schedule is of particular interest for database execution plans requiring re-partitioning on operators with different keys.
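Illustrative sketch (not taken from the patent text): one possible reading of the slot-based greedy scheduling described in the abstract above. The function name `greedy_schedule`, the `blocks` dictionary layout, and the tie-breaking by node ID are assumptions made for this example; the patented selection rule may differ in detail.

```python
def greedy_schedule(blocks):
    """Build a slot-by-slot transfer schedule.

    blocks[(sender, receiver)] is the number of data blocks still to be sent
    (hypothetical input layout). Returns a list of time slots; each slot is a
    list of (sender, receiver) pairs in which every node sends at most one
    block and receives at most one block.
    """
    remaining = {pair: n for pair, n in blocks.items() if n > 0}
    schedule = []
    while remaining:
        # Recompute how much each node still has to send and to receive.
        send_load, recv_load = {}, {}
        for (s, r), n in remaining.items():
            send_load[s] = send_load.get(s, 0) + n
            recv_load[r] = recv_load.get(r, 0) + n

        busy_receivers = set()
        slot = []
        # Largest senders pick first; each picks its largest free receiver.
        for s in sorted(send_load, key=lambda node: (-send_load[node], node)):
            candidates = [r for r in recv_load
                          if (s, r) in remaining and r not in busy_receivers]
            if not candidates:
                continue  # this sender idles in the current slot
            r = min(candidates, key=lambda node: (-recv_load[node], node))
            slot.append((s, r))
            busy_receivers.add(r)

        # Each scheduled pair transfers exactly one block in this slot.
        for pair in slot:
            remaining[pair] -= 1
            if remaining[pair] == 0:
                del remaining[pair]
        schedule.append(slot)
    return schedule


if __name__ == "__main__":
    demo = {(0, 1): 2, (0, 2): 1, (1, 2): 1}
    for t, slot in enumerate(greedy_schedule(demo)):
        print(t, slot)
```

Each slot costs at most on the order of N^2 work for N nodes, so T slots stay within the O(T*N^2) bound the abstract mentions. Node identifiers are assumed to be mutually comparable (for example, all integers) so that ties can be broken deterministically.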
-
3.
Publication No.: US10223437B2
Publication date: 2019-03-05
Application No.: US14634199
Filing date: 2015-02-27
Applicant: Oracle International Corporation
Inventor: Boris Klots , Vikas Aggarwal , Nipun Agarwal , John Kowtko , Felix Schmidt , Kantikiran Pasupuleti
Abstract: A method and apparatus for adaptive data repartitioning and adaptive data replication is provided. A data set stored in a distributed data processing system is partitioned by a first partitioning key. A live workload comprising a plurality of data processing commands is processed. While processing the live workload, statistical properties of the live workload are maintained. Based on the statistical properties of the live workload with respect to the data set, it is determined to replicate and/or repartition the data set by a second partitioning key. The replicated and/or repartitioned data set is partitioned by the second partitioning key.
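Illustrative sketch (not taken from the patent text): a minimal way to maintain workload statistics while commands run and to decide when a second partitioning key is warranted. The class `WorkloadStats`, the `min_commands` and `ratio` thresholds, and the hash-based `partition` helper are all assumptions for this example; the abstract does not specify which statistics or decision rule are used.

```python
from collections import Counter


class WorkloadStats:
    """Counts how often each column is used as a join/filter key (assumed statistic)."""

    def __init__(self, current_key, min_commands=100, ratio=2.0):
        self.current_key = current_key
        self.min_commands = min_commands  # hypothetical warm-up threshold
        self.ratio = ratio                # hypothetical "clearly better" factor
        self.key_usage = Counter()
        self.total = 0

    def observe(self, key_columns):
        """Record one live command and the columns it keys on."""
        self.total += 1
        self.key_usage.update(set(key_columns))

    def recommend_key(self):
        """Return a second partitioning key, or None if no change is justified."""
        if self.total < self.min_commands or not self.key_usage:
            return None
        best, best_count = self.key_usage.most_common(1)[0]
        current_count = self.key_usage[self.current_key]
        if best != self.current_key and best_count >= self.ratio * max(current_count, 1):
            return best
        return None


def partition(rows, key, num_partitions):
    """Hash-partition rows (dicts) by the given key column.

    Python's built-in hash is used only for illustration; a real system
    would use a hash that is stable across processes.
    """
    parts = [[] for _ in range(num_partitions)]
    for row in rows:
        parts[hash(row[key]) % num_partitions].append(row)
    return parts
```

A caller would invoke `observe` for every command in the live workload and periodically check `recommend_key`; when it returns a column, the data set (or a replica of it) can be passed through `partition` with that column as the new key.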
-
4.
Publication No.: US10263893B2
Publication date: 2019-04-16
Application No.: US15372224
Filing date: 2016-12-07
Applicant: Oracle International Corporation
Inventor: Vikas Aggarwal , Ankur Arora , Sam Idicula , Nipun Agarwal
IPC: H04L12/801 , H04L29/08 , H04L12/805
Abstract: Techniques are provided for using decentralized lock synchronization to increase network throughput. In an embodiment, a first computer sends, to a second computer comprising a lock, a request to acquire the lock. In response to receiving the lock acquisition request, the second computer detects whether the lock is available. If the lock is unavailable, then the second computer replies by sending a denial to the first computer. Otherwise, the second computer sends an exclusive grant of the lock to the first computer. While the first computer has acquired the lock, the first computer sends data to the second computer. Afterwards, the first computer sends a request to release the lock to the second computer. This completes one duty cycle of the lock, and the lock is again available for acquisition.
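Illustrative sketch (not taken from the patent text): the acquire, grant or deny, send, release cycle from the abstract above, simulated in a single process. The `Receiver` class, its method names, and the `transfer` helper are assumptions for this example; a real deployment would exchange these messages over the network and handle concurrency and failures.

```python
class Receiver:
    """Stands in for the 'second computer', which owns the lock."""

    def __init__(self):
        self.holder = None   # ID of the sender currently holding the lock
        self.received = []   # (sender_id, payload) pairs accepted so far

    def acquire(self, sender_id):
        """Handle a lock acquisition request: deny if held, else grant exclusively."""
        if self.holder is not None:
            return False     # denial
        self.holder = sender_id
        return True          # exclusive grant

    def send_data(self, sender_id, payload):
        """Accept data only from the current lock holder."""
        if self.holder != sender_id:
            raise PermissionError("sender does not hold the lock")
        self.received.append((sender_id, payload))

    def release(self, sender_id):
        """Handle a release request; the lock becomes available again."""
        if self.holder == sender_id:
            self.holder = None


def transfer(receiver, sender_id, payload):
    """One duty cycle from the sender's side. Returns False if the lock was denied."""
    if not receiver.acquire(sender_id):
        return False
    try:
        receiver.send_data(sender_id, payload)
    finally:
        receiver.release(sender_id)
    return True
```

Because each receiver arbitrates its own lock, no central coordinator is involved; a denied sender can retry later or try a different receiver in the meantime.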
-
5.
Publication No.: US10146806B2
Publication date: 2018-12-04
Application No.: US14621204
Filing date: 2015-02-12
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Nathan Pemberton , Vikas Aggarwal , Sam Idicula , Nipun Agarwal
IPC: G06F17/30
Abstract: A method, apparatus, and system for determining a data distribution is provided by using an adaptive resolution histogram. In an embodiment, the adaptive resolution histogram is created using a trie, wherein node values in the trie represent frequency distributions and node positions define associated keys or key prefixes. Keys are derived from input data such as database records that are streamed from a record source. These keys may be processed as received to build the trie in parallel with the production of the input data. To provide adaptive resolution, new child nodes may only be created in the trie when a node value is incremented beyond a predetermined threshold. In this manner, the histogram adjusts the allocation of nodes according to the actual distribution of the data. The completed adaptive resolution histogram may be used for various tasks such as partitioning for balanced parallel processing of the input data.
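Illustrative sketch (not taken from the patent text): a trie-based histogram in which a node only gains children once its own count has passed a threshold, following the description in the abstract above. The class names, the character-by-character key split, and the `prefix_counts` helper are assumptions for this example.

```python
class Node:
    def __init__(self):
        self.count = 0       # number of keys counted at this prefix
        self.children = {}   # next character -> Node


class AdaptiveHistogram:
    def __init__(self, threshold):
        self.threshold = threshold
        self.root = Node()

    def add(self, key):
        """Count one streamed key, refining the trie only where data is dense."""
        node = self.root
        node.count += 1
        for ch in key:
            child = node.children.get(ch)
            if child is None:
                # Only split a node after its count exceeds the threshold.
                if node.count <= self.threshold:
                    return
                child = node.children[ch] = Node()
            node = child
            node.count += 1

    def prefix_counts(self):
        """Return {prefix: count} for every node currently in the trie."""
        out = {}
        stack = [("", self.root)]
        while stack:
            prefix, node = stack.pop()
            out[prefix] = node.count
            for ch, child in node.children.items():
                stack.append((prefix + ch, child))
        return out
```

A child's count covers only keys seen after that child was created, so deeper counts are approximate. Frequent prefixes end up with deep, fine-grained subtrees while rare prefixes stay coarse, which is what makes the resulting counts usable for choosing balanced partition boundaries.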
-
6.
Publication No.: US20180159774A1
Publication date: 2018-06-07
Application No.: US15372224
Filing date: 2016-12-07
Applicant: Oracle International Corporation
Inventor: Vikas Aggarwal , Ankur Arora , Sam Idicula , Nipun Agarwal
IPC: H04L12/801 , H04L12/805 , H04L29/08
CPC classification number: H04L47/12 , H04L47/365 , H04L67/32
Abstract: Techniques are provided for using decentralized lock synchronization to increase network throughput. In an embodiment, a first computer sends, to a second computer comprising a lock, a request to acquire the lock. In response to receiving the lock acquisition request, the second computer detects whether the lock is available. If the lock is unavailable, then the second computer replies by sending a denial to the first computer. Otherwise, the second computer sends an exclusive grant of the lock to the first computer. While the first computer has acquired the lock, the first computer sends data to the second computer. Afterwards, the first computer sends a request to release the lock to the second computer. This completes one duty cycle of the lock, and the lock is again available for acquisition.