-
公开(公告)号:US11544244B2
公开(公告)日:2023-01-03
申请号:US17654296
申请日:2022-03-10
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: Disclosed herein are embodiments of systems and methods for selecting partitions for reclustering based on distribution of overlapping partitions. In an example, a database platform makes a determination to at least partially recluster a database table that includes data stored across a plurality of partitions. The database platform responsively selects a subset of the partitions. The selecting of the subset includes identifying a point on a domain of a clustering key that corresponds to a local maximum of overlapping partitions, and also includes selecting the subset from among a group of overlapping partitions. The group includes at least one partition that overlaps the identified point on the domain of the clustering key. Each partition in the selected subset is above a reduction goal of overlapping partitions. The database platform at least partially reclusters the selected subset based on the clustering key.
-
公开(公告)号:US11379492B2
公开(公告)日:2022-07-05
申请号:US17477663
申请日:2021-09-17
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
公开(公告)号:US20200026695A1
公开(公告)日:2020-01-23
申请号:US16514877
申请日:2019-07-17
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: Automatic clustering of a database table is disclosed. A method for automatic clustering of a database table includes receiving an indication that a data modification task has been executed on a table and determining whether the table is sufficiently clustered. The method includes, in response to determining the table is not sufficiently clustered, selecting one or more micro-partitions of the table to be reclustered. The method includes assigning each of the one or more micro-partitions to an execution node to be reclustered.
-
公开(公告)号:US20230071465A1
公开(公告)日:2023-03-09
申请号:US18050255
申请日:2022-10-27
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
公开(公告)号:US11442917B2
公开(公告)日:2022-09-13
申请号:US17243795
申请日:2021-04-29
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: Disclosed herein are systems and methods for incremental reclustering of database tables based on local maxima of partition overlap. In an embodiment, a database platform makes a determination, based on one or more incremental-reclustering criteria, to incrementally recluster a database table, which has a clustering key and which is stored across a plurality of partitions. In response to making the determination, the database platform selects a subset of the partitions, and at least incrementally reclusters the selected subset. The selecting of the subset includes identifying a local maximum of a quantity of overlapping partitions in the plurality of partitions with respect to a domain of the clustering key of the table, where the overlapping partitions overlap with respect to the clustering key.
-
公开(公告)号:US20220067016A1
公开(公告)日:2022-03-03
申请号:US17511064
申请日:2021-10-26
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: The subject technology determines whether a table is sufficiently clustered. The subject technology in response to determining the table is not sufficiently clustered, selects one or more micro-partitions of the table to be reclustered. The subject technology constructs a data structure for the table. The subject technology extracts minimum and maximum endpoints for each micro-partition in the data structure. The subject technology sorts each of one or more peaks in the data structure based on height. The subject technology sorts overlapping micro-partitions based on width. The subject technology selects based on which micro-partitions are within the tallest peaks of the one or more peaks and further based on which of the overlapping micro-partitions have the widest widths.
-
公开(公告)号:US20210397615A1
公开(公告)日:2021-12-23
申请号:US17462699
申请日:2021-08-31
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
公开(公告)号:US11138214B2
公开(公告)日:2021-10-05
申请号:US16778954
申请日:2020-01-31
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
公开(公告)号:US11514064B2
公开(公告)日:2022-11-29
申请号:US17663248
申请日:2022-05-13
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/24 , G06F16/2458 , G06F16/2453 , G06F16/2455 , G06F9/50
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
公开(公告)号:US20220269676A1
公开(公告)日:2022-08-25
申请号:US17663248
申请日:2022-05-13
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
-
-
-
-
-
-
-
-