-
Publication number: US20230195729A1
Publication date: 2023-06-22
Application number: US17804770
Application date: 2022-05-31
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/2453
CPC classification number: G06F16/24545, G06F16/24537, G06F16/24532
Abstract: Various embodiments provide for executing sub-plans in parallel using a plurality of execution nodes, which can be part of a data platform. In particular, various embodiments identify sub-plans (e.g., fragments or portions of one or more child operators) of a root operator in a query plan such that the identified sub-plans are candidates for execution on a single execution node, determine a cost estimate for causing the candidate sub-plans to be executed in parallel using multiple execution nodes, and cause the candidate sub-plans to be executed in parallel based on the cost estimate.
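Illustrative sketch (not taken from the patent): the abstract describes comparing a cost estimate before running candidate sub-plans in parallel. The Python below shows one way such a decision could look. The `SubPlan` class, the `should_parallelize` function, and the `per_row_cost` and `dispatch_overhead` parameters are all hypothetical names and values chosen for this example; the patent's actual cost model is not given in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SubPlan:
    name: str
    est_rows: int  # hypothetical cardinality estimate for this sub-plan


def should_parallelize(sub_plans, per_row_cost=1.0, dispatch_overhead=500.0):
    """Return True if the estimated parallel cost is below the single-node cost.

    Assumed cost model: on one node the sub-plans run back to back, so their
    costs add up; across nodes the slowest sub-plan dominates, plus a fixed
    dispatch overhead per sub-plan.
    """
    if len(sub_plans) < 2:
        return False  # nothing to run side by side
    single_node = sum(p.est_rows for p in sub_plans) * per_row_cost
    parallel = (
        max(p.est_rows for p in sub_plans) * per_row_cost
        + dispatch_overhead * len(sub_plans)
    )
    return parallel < single_node


large = [SubPlan("scan_a", 10_000), SubPlan("scan_b", 12_000)]
small = [SubPlan("scan_c", 100), SubPlan("scan_d", 200)]
print(should_parallelize(large), should_parallelize(small))
```

With these assumed numbers, the large sub-plans are worth distributing while the small ones are not, because the dispatch overhead outweighs the saved work.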
-
Publication number: US11379480B1
Publication date: 2022-07-05
Application number: US17647629
Application date: 2022-01-11
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/245, G06F16/2453
Abstract: Sub-plans are executed in parallel using a plurality of execution nodes, which can be part of a data platform. In particular, sub-plans (e.g., fragments or portions of one or more child operators) of a root operator are identified in a query plan such that the identified sub-plans are candidates for execution on a single execution node, a cost estimate is determined for causing the candidate sub-plans to be executed in parallel using multiple execution nodes, and the candidate sub-plans are caused to be executed in parallel based on the cost estimate.
-
Publication number: US11188563B2
Publication date: 2021-11-30
Application number: US17237340
Application date: 2021-04-22
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/00, G06F16/27, G06F16/2455
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
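Illustrative sketch (not taken from the patent): the stop-operator behaviour described in this abstract can be modelled in a few lines of Python. The `StopOperator` class and its `on_row` and `merge_remote` methods are hypothetical names for this example. The sketch assumes that remote counts only grow, so merging takes the maximum per node, and it leaves out how buffered rows are later released.

```python
class StopOperator:
    """Toy per-node stop operator for a distributed LIMIT query."""

    def __init__(self, node_id, local_limit, global_limit):
        self.node_id = node_id
        self.local_limit = local_limit
        self.global_limit = global_limit
        # Distributed row count data: rows output so far, keyed by node.
        self.counts = {node_id: 0}
        self.buffer = []
        self.stopped = False

    def _check_stop(self):
        # Stopping condition: all known nodes together reached the global limit.
        if sum(self.counts.values()) >= self.global_limit:
            self.stopped = True

    def on_row(self, row):
        """Handle a locally produced row; return the rows emitted downstream."""
        if self.stopped:
            return []
        if self.counts[self.node_id] >= self.local_limit:
            self.buffer.append(row)  # local limit satisfied: hold the row back
            return []
        self.counts[self.node_id] += 1
        self._check_stop()
        return [row]

    def merge_remote(self, remote_counts):
        """Fold in row counts reported by another execution node."""
        for node, count in remote_counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)
        self._check_stop()


op = StopOperator("n1", local_limit=2, global_limit=5)
emitted = [op.on_row(r) for r in ("a", "b", "c")]
op.merge_remote({"n2": 3})
print(emitted, op.buffer, op.stopped)
```

In this run the third row is buffered because the local limit of 2 is already satisfied, and the node stops once the remote node reports 3 rows, which brings the combined count to the global limit of 5.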
-
Publication number: US12153602B2
Publication date: 2024-11-26
Application number: US17815389
Application date: 2022-07-27
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/00, G06F16/2455, G06F16/27
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
-
Publication number: US20220382782A1
Publication date: 2022-12-01
Application number: US17815389
Application date: 2022-07-27
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/27, G06F16/2455
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
-
Publication number: US11436253B2
Publication date: 2022-09-06
Application number: US17517935
Application date: 2021-11-03
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/00, G06F16/27, G06F16/2455
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
-
Publication number: US20220058206A1
Publication date: 2022-02-24
Application number: US17517935
Application date: 2021-11-03
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/27, G06F16/2455
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
-
Publication number: US20210303593A1
Publication date: 2021-09-30
Application number: US17237340
Application date: 2021-04-22
Applicant: Snowflake Inc.
Inventor: Sebastian Breß, Moritz Eyssen, Max Heimel
IPC: G06F16/27, G06F16/2455
Abstract: A global and local row count limit associated with a limit query are received by a stop operator of a first execution node among a set of execution nodes that are assigned to process the limit query. Local distributed row count data is generated based on a local row count corresponding to a number of rows output by the first execution node in processing the query. Based on determining the local row count satisfies the local limit, the first execution node buffers rows produced in processing the query. The local distributed row count data is updated based on remote distributed row count data received from a second execution node. A stopping condition is detected based on determining the global limit is satisfied based on updated local distributed row count data, and query processing by the first execution node is terminated based on detecting the stopping condition.
-
Publication number: US12235833B2
Publication date: 2025-02-25
Application number: US18415826
Application date: 2024-01-18
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Moritz Eyssen, Max Heimel, Lishi Jiang, Alexander Miller
IPC: G06F16/23, G06F9/46, G06F9/52, G06F16/2455
Abstract: The subject technology receives, at a first execution node, a first transaction, the first transaction to be executed on linearizable storage. The subject technology determines whether the first execution node corresponds to a rank indicating a leader worker. The subject technology, in response to the first execution node corresponding to the rank indicating the leader worker, performs, by the first execution node, an initialization process for executing the first transaction. The subject technology broadcasts a first read timestamp associated with the first transaction to a set of execution nodes, the set of execution nodes being different than the first execution node. The subject technology executes, by the first execution node, at least a first operation from the first transaction.
-
Publication number: US20240419663A1
Publication date: 2024-12-19
Application number: US18819649
Application date: 2024-08-29
Applicant: Snowflake Inc.
Inventor: Xinzhu Cai, Bowei Chen, Bjoern Daase, Moritz Eyssen, Florian Andreas Funke
IPC: G06F16/2453, G06F16/22
Abstract: Provided herein are systems, methods, and computer-storage media for managing data skew in hash join operations. A skew manager partitions build-side row data into multiple sets corresponding to hash-join-build (HJB) instances based on hash values. The skew manager detects skew in a build-side row set associated with a first HJB instance by analyzing the number of rows. Upon detecting skew, the skew manager redirects data rows to at least a second HJB instance. The method involves configuring skew caches, generating histograms, and detecting frequent hash values to identify skew. It also includes communicating skew notifications, broadcasting probe-side row data, and adjusting partitioning of probe-side data. The disclosed techniques further include buffering build-side row sets in streams and performing join operations based on these streams, enhancing efficiency in distributed computing environments.
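Illustrative sketch (not taken from the patent): the core idea of detecting skew by row count and redirecting build-side rows can be shown with a small Python routine. The function name `route_build_rows`, the fixed `skew_threshold`, and the least-loaded redirect policy are assumptions made for this example; the abstract also mentions skew caches, histograms, and skew notifications, none of which are modelled here.

```python
def route_build_rows(keys, num_instances, skew_threshold):
    """Assign build-side rows to hash-join-build (HJB) instances.

    Each row goes to its hash partition until that instance holds
    skew_threshold rows; later rows for it are redirected to the
    currently least-loaded instance. Integer keys are assumed so that
    Python's hash() is deterministic across runs.
    """
    rows_per_instance = [0] * num_instances
    routed = []
    for key in keys:
        home = hash(key) % num_instances
        target = home
        if rows_per_instance[home] >= skew_threshold:
            # Skew detected on the home instance: redirect this row.
            target = min(range(num_instances), key=rows_per_instance.__getitem__)
        rows_per_instance[target] += 1
        routed.append((key, target))
    return routed, rows_per_instance


# Six rows that all share one hot key, spread over two HJB instances.
routed, counts = route_build_rows([7] * 6, num_instances=2, skew_threshold=3)
print(counts)
```

Once rows for the same key live on more than one instance, the matching probe-side rows have to reach every one of those instances, which is consistent with the abstract's mention of broadcasting probe-side row data.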