-
Publication number: US20220035814A1
Publication date: 2022-02-03
Application number: US17502685
Filing date: 2021-10-15
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski
IPC: G06F16/2453, G06F16/22, G06F16/2455
Abstract: Systems, methods, and devices for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
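As an illustration only, the skew-detection step could be sketched as follows. The abstract does not name the sketch data structure; a Count-Min sketch is assumed here, and `CountMinSketch`, `find_frequent_build_rows`, and the `threshold` parameter are hypothetical names. The asynchronous distribution to remote servers is not modeled.

```python
import hashlib


class CountMinSketch:
    """Small fixed-size frequency sketch (approximate, never under-counts)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _slots(self, key):
        # One independent hash per row, derived by salting BLAKE2b.
        for i in range(self.depth):
            digest = hashlib.blake2b(repr(key).encode(), salt=bytes([i])).digest()
            yield i, int.from_bytes(digest[:8], "big") % self.width

    def add(self, key):
        for i, j in self._slots(key):
            self.rows[i][j] += 1

    def estimate(self, key):
        return min(self.rows[i][j] for i, j in self._slots(key))


def find_frequent_build_rows(build_rows, probe_keys, threshold):
    """Return build-side rows whose join key is frequent on the probe side.

    Assumption: the join key is the first element of each build row.
    """
    sketch = CountMinSketch()
    hot_keys = set()
    for key in probe_keys:  # probe phase: count keys as they stream by
        sketch.add(key)
        if sketch.estimate(key) >= threshold:
            hot_keys.add(key)
    return [row for row in build_rows if row[0] in hot_keys]
```

In a distributed setting, the rows returned by `find_frequent_build_rows` would be the candidates to replicate to other servers, so that probe rows carrying a hot key no longer have to be sent to a single node.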
-
Publication number: US20210326340A1
Publication date: 2021-10-21
Application number: US17327521
Filing date: 2021-05-21
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
IPC: G06F16/2453, G06F16/2455
Abstract: Joining data using a disjunctive operator is described. An example computer-implemented method can include receiving a query that includes a first disjunctive predicate involving a first table and a second table. The method may also include determining a first set of rows from the first table and generating a filter from the first set of rows. The method may further include applying the filter to the second table to generate a second set of rows. Additionally, the method may include joining the first set of rows and the second set of rows using a first disjunctive operator of the first disjunctive predicate to generate a first results set.
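A minimal sketch of the filter-then-join flow, assuming a predicate of the form `a.x = b.x OR a.y = b.y` and rows represented as dictionaries. The function name, the column names, and the use of exact value sets as the filter are illustrative assumptions, not details taken from the patent.

```python
def disjunctive_join(first_rows, second_table):
    """Join on (a.x == b.x) OR (a.y == b.y), pre-filtering the second table."""
    # Filter built from the first set of rows: one value set per disjunct.
    xs = {a["x"] for a in first_rows}
    ys = {a["y"] for a in first_rows}

    # Apply the filter: a row that matches neither set cannot join.
    second_rows = [b for b in second_table if b["x"] in xs or b["y"] in ys]

    # Evaluate the disjunctive predicate on the reduced input.
    return [
        (a, b)
        for a in first_rows
        for b in second_rows
        if a["x"] == b["x"] or a["y"] == b["y"]
    ]
```

The filter never discards a row that could join, so the result is the same as joining against the full second table; it only shrinks the input to the more expensive pairwise comparison.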
-
Publication number: US20210286816A1
Publication date: 2021-09-16
Application number: US17105406
Filing date: 2020-11-25
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
IPC: G06F16/2453, G06F16/2455
Abstract: Joining data using a disjunctive operator is described. An example computer-implemented method can include receiving a query that includes a first disjunctive predicate involving a first table and a second table. The method may also include determining a first set of rows from the first table and generating a filter from the first set of rows. The method may further include applying the filter to the second table to generate a second set of rows. Additionally, the method may include joining the first set of rows and the second set of rows using a first disjunctive operator of the first disjunctive predicate to generate a first results set.
-
Publication number: US20210089560A1
Publication date: 2021-03-25
Application number: US17118201
Filing date: 2020-12-10
Applicant: SNOWFLAKE INC.
Inventor: Florian Andreas Funke, Peter Povinec, Thierry Cruanes, Benoit Dageville
IPC: G06F16/28, H04L29/08, G06F16/2455, H04L12/24
Abstract: A method for a multi-cluster warehouse includes allocating a plurality of compute clusters as part of a virtual warehouse. The compute clusters are used to access and perform queries against one or more databases in one or more cloud storage resources. The method includes providing queries for the virtual warehouse to each of the plurality of compute clusters. Each of the plurality of compute clusters of the virtual warehouse receives a plurality of queries so that the computing load is spread across the different clusters. The method also includes dynamically adding compute clusters to and removing compute clusters from the virtual warehouse as needed based on a workload of the plurality of compute clusters.
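The load-spreading and scaling behavior can be illustrated with a toy model. Everything below is an assumption made for the sketch: the `VirtualWarehouse` class, the per-cluster capacity `queries_per_cluster`, the least-loaded placement rule, and the rule of dropping idle trailing clusters are hypothetical and not taken from the patent.

```python
class VirtualWarehouse:
    """Toy model: clusters are lists of the queries they are running."""

    def __init__(self, min_clusters=1, max_clusters=4, queries_per_cluster=2):
        self.min_clusters = min_clusters
        self.max_clusters = max_clusters
        self.queries_per_cluster = queries_per_cluster
        self.clusters = [[] for _ in range(min_clusters)]

    def submit(self, query):
        """Place a query on the least-loaded cluster and return its index."""
        saturated = all(len(c) >= self.queries_per_cluster for c in self.clusters)
        if saturated and len(self.clusters) < self.max_clusters:
            self.clusters.append([])  # scale out: add a compute cluster
        target = min(range(len(self.clusters)), key=lambda i: len(self.clusters[i]))
        self.clusters[target].append(query)
        return target

    def finish(self, cluster_id, query):
        """Mark a query as done and remove idle trailing clusters."""
        self.clusters[cluster_id].remove(query)
        while len(self.clusters) > self.min_clusters and not self.clusters[-1]:
            self.clusters.pop()  # scale in: remove an idle compute cluster
```

Submitting five queries to a fresh warehouse with the default settings grows it from one cluster to three; finishing the query on the last cluster shrinks it back to two.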
-
Publication number: US12164514B2
Publication date: 2024-12-10
Application number: US18109099
Filing date: 2023-02-13
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
IPC: G06F16/00, G06F16/2453, G06F16/2455
Abstract: Joining data using a disjunctive operator is described. An example computer-implemented method can include generating, with a processing device, a query plan for a query, the query comprising a join operator expression for a disjunctive predicate, wherein the join operator expression includes a conjunctive predicate and a disjunctive operator. The method may further include generating a bloom filter for the disjunctive operator. Additionally, the method may include generating a result set as a result of evaluating the join operator expression using the disjunctive operator and bloom filter for the disjunctive predicate.
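One way to picture the Bloom-filter step is the sketch below. It assumes a join predicate of the form `a.k = b.k AND (a.x = b.x OR a.y = b.y)` with one Bloom filter per disjunct; the `BloomFilter` class, its sizing, and the nested-loop evaluation are illustrative choices, not the patented query plan.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: false positives possible, no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, value):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def bloom_disjunctive_join(build_rows, probe_rows):
    """Join on a.k == b.k AND (a.x == b.x OR a.y == b.y)."""
    bloom_x, bloom_y = BloomFilter(), BloomFilter()
    for a in build_rows:
        bloom_x.add(a["x"])
        bloom_y.add(a["y"])

    result = []
    for b in probe_rows:
        # Skip probe rows that cannot satisfy either disjunct.
        if not (bloom_x.might_contain(b["x"]) or bloom_y.might_contain(b["y"])):
            continue
        for a in build_rows:
            if a["k"] == b["k"] and (a["x"] == b["x"] or a["y"] == b["y"]):
                result.append((a, b))
    return result
```

Because a Bloom filter can report false positives but never false negatives, the full predicate is still evaluated for every surviving probe row, and the join result stays exact.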
-
Publication number: US12056123B2
Publication date: 2024-08-06
Application number: US18099866
Filing date: 2023-01-20
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Florian Andreas Funke, Guangyan Hu, Jiaqi Yan
IPC: G06F7/00, G06F16/00, G06F16/22, G06F16/2453, G06F16/2455
CPC classification number: G06F16/24537, G06F16/2255, G06F16/24556
Abstract: Joining data using a disjunctive operator using a lookup table is described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. For each row in a probe-side table, the method may further include looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. Additionally, the method may also include returning the results set.
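The per-predicate lookup tables could look roughly like this. The sketch assumes equality predicates on named columns, dictionary rows, and one interpretation of "a match": some build-side row satisfies every conjunctive predicate and at least one disjunctive predicate. `build_lookup` and `lookup_join` are hypothetical names.

```python
from collections import defaultdict


def build_lookup(build_rows, column):
    """Map each value of `column` to the ids of build rows holding it."""
    table = defaultdict(set)
    for row_id, row in enumerate(build_rows):
        table[row[column]].add(row_id)
    return table


def lookup_join(build_rows, probe_rows, conj_cols, disj_cols):
    """Return probe rows that match some build row under the predicates."""
    tables = {col: build_lookup(build_rows, col) for col in conj_cols + disj_cols}
    results = []
    for probe in probe_rows:
        # Conjunctive predicates: intersect the candidate build rows.
        matches = set(range(len(build_rows)))
        for col in conj_cols:
            matches &= tables[col].get(probe[col], set())
        # Disjunctive predicates: union the candidates, then intersect.
        if disj_cols:
            any_disjunct = set()
            for col in disj_cols:
                any_disjunct |= tables[col].get(probe[col], set())
            matches &= any_disjunct
        if matches:
            results.append(probe)
    return results
```

Each probe row costs one dictionary lookup per predicate plus set operations, rather than a comparison against every build row.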
-
Publication number: US11983198B2
Publication date: 2024-05-14
Application number: US18139809
Filing date: 2023-04-26
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Florian Andreas Funke, Peter Povinec
IPC: G06F16/28, G06F9/50, G06F16/2455, H04L41/0896, H04L41/5025, H04L43/0817, H04L67/1008, H04L67/1097
CPC classification number: G06F16/283, G06F9/5072, G06F16/2455, H04L41/0896, H04L41/5025, H04L67/1008, H04L67/1097, H04L43/0817
Abstract: A method implementing a fault-tolerant data warehouse using availability zones includes allocating a plurality of processing units to a data warehouse, the processing units located in different availability zones, an availability zone comprising one or more data centers. The method further includes routing a query to a processing unit within the data warehouse, the query having a common session identifier with a query previously provided to the processing unit, the processing unit determined to be caching a data segment associated with a cloud storage resource independent of the plurality of processing units. The method further includes, as a result of monitoring a number of queries running at an input degree of parallelism, determining that the processing capacity of the processing units has reached a threshold; and changing a total number of processing units using the input degree of parallelism and the number of queries.
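A rough sketch of the routing and resizing ideas follows. The abstract does not give the routing policy or the sizing formula, so both are assumptions here: `route_query` prefers the unit that last served the session if it still caches the segment, and `target_unit_count` uses a hypothetical rule of one processing unit per `degree_of_parallelism` running queries.

```python
import math


def route_query(session_id, segment, units, session_map):
    """Pick a processing unit for a query (illustrative policy only)."""
    # Prefer the unit that served this session before, if it caches the segment.
    previous = session_map.get(session_id)
    if previous is not None and segment in units[previous]["cache"]:
        return previous
    # Otherwise use any unit that caches the segment.
    for name, unit in units.items():
        if segment in unit["cache"]:
            session_map[session_id] = name
            return name
    # Fall back to the unit with the fewest running queries.
    name = min(units, key=lambda n: units[n]["running"])
    session_map[session_id] = name
    return name


def target_unit_count(running_queries, degree_of_parallelism):
    """Hypothetical sizing rule: one unit per `degree_of_parallelism` queries."""
    return max(1, math.ceil(running_queries / degree_of_parallelism))
```

For example, with nine running queries and a degree of parallelism of four, this sizing rule asks for three processing units.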
-
Publication number: US11971856B2
Publication date: 2024-04-30
Application number: US16779366
Filing date: 2020-01-31
Applicant: Snowflake Inc.
Inventor: Selcuk Aya, Bowei Chen, Florian Andreas Funke
IPC: G06F16/174, G06F16/22, G06F16/27
CPC classification number: G06F16/1744, G06F16/221, G06F16/27
Abstract: Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decompressed. The responsive data is returned in response to the database query. When a query is run on a table that is compressed using dictionary compression, the uncompressed data may be returned along with the dictionary look-up values. The recipient of the data may use the dictionary look-up values for memoization, reducing the amount of computation required to process the returned data.
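The two ideas in this abstract, selective decompression and memoization over dictionary codes, can be sketched as below. The storage layout is an assumption made for illustration: each column is split into fixed-size chunks that are compressed independently with `zlib`, and `CHUNK`, `compress_column`, `filtered_scan`, and `apply_memoized` are hypothetical names.

```python
import json
import zlib

CHUNK = 2  # rows per compressed chunk (tiny, for illustration)


def compress_column(values):
    """Compress a column as a list of independently compressed chunks."""
    return [
        zlib.compress(json.dumps(values[i:i + CHUNK]).encode())
        for i in range(0, len(values), CHUNK)
    ]


def load_chunk(blob):
    return json.loads(zlib.decompress(blob))


def filtered_scan(partition, filter_col, predicate, output_col):
    """Decompress the filter column, then only the output chunks with hits."""
    out, decompressed = [], 0
    for idx, blob in enumerate(partition[filter_col]):
        hits = [i for i, v in enumerate(load_chunk(blob)) if predicate(v)]
        if not hits:
            continue  # the matching output chunk stays compressed
        values = load_chunk(partition[output_col][idx])
        decompressed += 1
        out.extend(values[i] for i in hits)
    return out, decompressed


def apply_memoized(codes, dictionary, fn):
    """Apply `fn` once per distinct dictionary code instead of once per row."""
    cache, out = {}, []
    for code in codes:
        if code not in cache:
            cache[code] = fn(dictionary[code])
        out.append(cache[code])
    return out
```

`filtered_scan` returns the matching output values together with the number of output chunks it had to decompress, which makes the saving visible. `apply_memoized` shows why returning dictionary codes alongside the values helps the recipient: a costly function runs once per distinct code.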
-
Publication number: US11868352B2
Publication date: 2024-01-09
Application number: US18073464
Filing date: 2022-12-01
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke, Megha Thakkar
IPC: G06F16/24, G06F16/2455
CPC classification number: G06F16/2456, G06F16/24554
Abstract: A method includes determining that an amount of available space in a first memory used by a set of relational queries is insufficient for a query, wherein the query comprises a join operation. A first partition of a set of partitions is identified, wherein the first partition possesses a smallest available probe memory size of the set of partitions and a build memory size greater than or equal to a threshold memory size, wherein the threshold memory size is a percentage of a maximum build memory size, and the largest partition of the set of partitions has the maximum build memory size. The first partition is copied from the first memory to a second memory.
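The partition-selection rule can be written down compactly. This is one reading of the abstract: among partitions whose build memory size is at least a given percentage of the largest build size, pick the one with the smallest probe memory size. The field names and the default `threshold_pct` are assumptions for the sketch.

```python
def choose_partition_to_spill(partitions, threshold_pct=0.5):
    """Pick the partition to copy from the first memory to the second.

    Assumes each partition is a dict with "build_bytes" and "probe_bytes",
    and that 0 <= threshold_pct <= 1 so the largest partition always qualifies.
    """
    max_build = max(p["build_bytes"] for p in partitions)
    threshold = threshold_pct * max_build
    # Only partitions with a large enough build side are worth spilling.
    candidates = [p for p in partitions if p["build_bytes"] >= threshold]
    # Among those, prefer the one holding the least probe-side memory.
    return min(candidates, key=lambda p: p["probe_bytes"])
```

The threshold keeps the choice away from partitions whose build side is too small to free a meaningful amount of memory, even if their probe side is the smallest.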
-
Publication number: US11630850B2
Publication date: 2023-04-18
Application number: US17116625
Filing date: 2020-12-09
Applicant: SNOWFLAKE INC.
Inventor: Florian Andreas Funke, Peter Povinec, Thierry Cruanes, Benoit Dageville
IPC: G06F16/28, H04L67/1097, G06F16/2455, H04L41/0896, H04L67/1008, H04L41/5025, G06F9/50, H04L43/0817
Abstract: A method for a multi-cluster warehouse includes allocating a plurality of compute clusters as part of a virtual warehouse. The compute clusters are used to access and perform queries against one or more databases in one or more cloud storage resources. The method includes providing queries for the virtual warehouse to each of the plurality of compute clusters. Each of the plurality of compute clusters of the virtual warehouse receives a plurality of queries so that the computing load is spread across the different clusters. The method also includes dynamically adding compute clusters to and removing compute clusters from the virtual warehouse as needed based on a workload of the plurality of compute clusters.