-
Publication No.: US12229135B2
Publication Date: 2025-02-18
Application No.: US17699607
Filing Date: 2022-03-21
Applicant: Oracle International Corporation
Inventor: Urvashi Oswal, Jian Wen, Farhan Tauheed, Onur Kocberber, Seema Sundara, Nipun Agarwal
IPC: G06F16/2453, G06F11/34, G06F16/21, G06F16/22, G06F16/27
Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automatic data placement recommendations for partitioning data across multiple nodes in a database system. The system is configured to extract workload-specific features of a database workload running at a database system and dataset-specific features of a database running on the database system. The workload-specific features characterize utilization of the database workload. The dataset-specific features characterize how data is organized within the database. The system identifies a plurality of candidate keys for determining how to partition data stored in the database across nodes. Based at least in part on the workload-specific features, the dataset-specific features, and the plurality of candidate keys, the system generates a set of candidate key combinations for partitioning data. Using a machine learning model, the system determines a particular candidate key combination that optimizes query execution performance benefit based on the workload-specific features and the dataset-specific features. The system then generates data placement commands to allocate the database tables across the nodes.
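Illustrative note (not part of the patent record): the selection step described in the abstract can be sketched as a search over candidate key combinations scored by a predictor. Everything below is an assumption made for illustration: the exhaustive enumeration, the function names `best_key_combination` and `colocated_join_score`, the table and column names, and above all the scoring function, which is a hand-written stand-in for the trained machine learning model and simply rewards joins whose two sides are partitioned on the join columns.

```python
from itertools import product

def best_key_combination(candidate_keys, predict_benefit):
    """Pick one partitioning key per table so the predicted benefit is highest.

    candidate_keys: dict mapping table name -> list of candidate key columns.
    predict_benefit: callable taking {table: key} and returning a score
                     (hypothetical stand-in for the trained model).
    """
    tables = sorted(candidate_keys)
    best, best_score = None, float("-inf")
    # Enumerate every combination of one candidate key per table.
    for keys in product(*(candidate_keys[t] for t in tables)):
        combo = dict(zip(tables, keys))
        score = predict_benefit(combo)
        if score > best_score:
            best, best_score = combo, score
    return best, best_score

# Hypothetical workload feature: joins with their observed frequency.
JOINS = [
    (("orders", "customer_id"), ("customers", "id"), 10),
    (("orders", "product_id"), ("products", "id"), 3),
]

def colocated_join_score(combo):
    """Toy predictor: sum the frequency of joins that need no data movement."""
    return sum(
        freq
        for (t1, c1), (t2, c2), freq in JOINS
        if combo.get(t1) == c1 and combo.get(t2) == c2
    )

CANDIDATES = {
    "orders": ["customer_id", "product_id"],
    "customers": ["id"],
    "products": ["id"],
}

best_combo, best_score = best_key_combination(CANDIDATES, colocated_join_score)
```

In this toy workload the more frequent join dominates, so `orders` is partitioned on `customer_id`. A real system would prune the combination space rather than enumerate it exhaustively, since the number of combinations grows multiplicatively with the number of tables.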
-
Publication No.: US11907250B2
Publication Date: 2024-02-20
Application No.: US17871092
Filing Date: 2022-07-22
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Urvashi Oswal, Marc Jolles, Onur Kocberber, Seema Sundara, Nipun Agarwal
IPC: G06F16/24, G06F16/25, G06F16/21, G06F11/34, G06F16/2458
CPC classification number: G06F16/258, G06F11/3409, G06F16/21, G06F16/2462
Abstract: Techniques are described for executing machine learning models trained for specific operators with feature values that are based on the actual execution of a workload set. The machine learning models generate an estimate of the benefit gain/cost of executing operations on data portions in an alternative encoding format. In an embodiment, such data portions may be sorted based on the estimated benefit. Using cost estimation machine learning models for memory space, the data portions with the greatest benefit that comply with the existing memory space constraints are recommended and/or are automatically encoded into the alternative encoding format.
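Illustrative note (not part of the patent record): the ranking and memory-constrained selection described in the abstract can be sketched as a greedy pass over data portions sorted by estimated benefit. The `Portion` class, its field names, the `recommend` function, and the sample numbers are all hypothetical. In the described techniques the benefit and memory estimates come from trained models; here they are given as plain inputs, and the greedy policy is one simple way to respect a memory budget, not necessarily the one the patent claims.

```python
from dataclasses import dataclass

@dataclass
class Portion:
    name: str
    est_benefit: float  # hypothetical output of a per-operator benefit model
    est_memory: int     # hypothetical output of a memory cost model, in bytes

def recommend(portions, memory_budget):
    """Return names of portions to re-encode, highest estimated benefit first."""
    chosen = []
    remaining = memory_budget
    # Sort by estimated benefit, descending, then take what still fits.
    for p in sorted(portions, key=lambda p: p.est_benefit, reverse=True):
        if p.est_benefit > 0 and p.est_memory <= remaining:
            chosen.append(p.name)
            remaining -= p.est_memory
    return chosen

SAMPLE = [
    Portion("a", 5.0, 60),
    Portion("b", 9.0, 50),
    Portion("c", 1.0, 30),
    Portion("d", -2.0, 10),  # negative benefit: never recommended
]

picked = recommend(SAMPLE, memory_budget=100)
```

With a budget of 100 bytes, portion `b` is taken first, `a` no longer fits, `c` fits in the remaining space, and `d` is skipped because its estimated benefit is negative.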
-
Publication No.: US20230297573A1
Publication Date: 2023-09-21
Application No.: US17699607
Filing Date: 2022-03-21
Applicant: Oracle International Corporation
Inventor: Urvashi Oswal, Jian Wen, Farhan Tauheed, Onur Kocberber, Seema Sundara, Nipun Agarwal
IPC: G06F16/2453, G06F16/21, G06F16/22, G06F11/34, G06F16/27
CPC classification number: G06F16/24544, G06F16/211, G06F16/2282, G06F11/3409, G06F16/278
Abstract: Embodiments implement a prediction-driven, rather than a trial-driven, approach to automatic data placement recommendations for partitioning data across multiple nodes in a database system. The system is configured to extract workload-specific features of a database workload running at a database system and dataset-specific features of a database running on the database system. The workload-specific features characterize utilization of the database workload. The dataset-specific features characterize how data is organized within the database. The system identifies a plurality of candidate keys for determining how to partition data stored in the database across nodes. Based at least in part on the workload-specific features, the dataset-specific features, and the plurality of candidate keys, the system generates a set of candidate key combinations for partitioning data. Using a machine learning model, the system determines a particular candidate key combination that optimizes query execution performance benefit based on the workload-specific features and the dataset-specific features. The system then generates data placement commands to allocate the database tables across the nodes.
-