Publication No.: US10146806B2
Publication Date: 2018-12-04
Application No.: US14621204
Filing Date: 2015-02-12
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Nathan Pemberton, Vikas Aggarwal, Sam Idicula, Nipun Agarwal
IPC: G06F17/30
Abstract: A method, apparatus, and system for determining a data distribution are provided by using an adaptive resolution histogram. In an embodiment, the adaptive resolution histogram is created using a trie, wherein node values in the trie represent frequency distributions and node positions define associated keys or key prefixes. Keys are derived from input data such as database records that are streamed from a record source. These keys may be processed as received to build the trie in parallel with the production of the input data. To provide adaptive resolution, new child nodes may be created in the trie only when a node value is incremented beyond a predetermined threshold. In this manner, the histogram adjusts the allocation of nodes according to the actual distribution of the data. The completed adaptive resolution histogram may be used for various tasks such as partitioning for balanced parallel processing of the input data.
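
The thresholded trie described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `AdaptiveHistogram`, the `threshold` parameter, the per-character key splitting, and the rule that a child is created only when its parent's count already exceeds the threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One trie node: its position encodes a key prefix, its count a frequency."""
    count: int = 0
    children: dict = field(default_factory=dict)


class AdaptiveHistogram:
    """Hypothetical adaptive resolution histogram backed by a character trie."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # assumed "predetermined threshold"
        self.root = Node()

    def add(self, key: str) -> None:
        """Process one streamed key, refining the trie only under busy nodes."""
        node = self.root
        node.count += 1
        for ch in key:
            child = node.children.get(ch)
            if child is None:
                # Adaptive resolution: do not split a node until its own
                # count has been incremented beyond the threshold.
                if node.count <= self.threshold:
                    return
                child = Node()
                node.children[ch] = child
            child.count += 1
            node = child

    def prefix_counts(self) -> dict:
        """Return a mapping of key prefix to observed count for every node."""
        out = {}
        stack = [("", self.root)]
        while stack:
            prefix, node = stack.pop()
            out[prefix] = node.count
            for ch, child in node.children.items():
                stack.append((prefix + ch, child))
        return out


hist = AdaptiveHistogram(threshold=2)
for key in ["aa", "ab", "ac", "ad", "ba"]:
    hist.add(key)
counts = hist.prefix_counts()
```

In this sketch, counts recorded before a child node exists stay with the parent, so a child's count covers only keys seen after the split. Whether the patent redistributes earlier counts is not stated in the abstract.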