-
Publication No.: US11347738B2
Publication Date: 2022-05-31
Application No.: US17502685
Application Date: 2021-10-15
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke , Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/20 , G06F16/2453 , G06F16/22 , G06F16/2455
Abstract: Systems, methods, and devices for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
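The skew-detection step in this abstract can be illustrated with a small Python sketch. The abstract does not say which sketch data structure is used; the Misra-Gries summary below is an assumption chosen for illustration, and the names `misra_gries`, `frequent_build_rows`, and the parameter `k` are hypothetical. Asynchronous distribution to remote servers is not modeled.

```python
def misra_gries(keys, k):
    """Return candidate frequent keys using at most k - 1 counters.

    Any key occurring more than len(keys) / k times is guaranteed to be
    among the returned candidates; some infrequent keys may appear too.
    """
    counters = {}
    for key in keys:
        if key in counters:
            counters[key] += 1
        elif len(counters) < k - 1:
            counters[key] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


def frequent_build_rows(probe_keys, build_rows, k=4):
    """Pick build-side rows whose join key is a frequent probe-side key.

    build_rows is a list of (join_key, payload) tuples (hypothetical layout).
    """
    candidates = misra_gries(probe_keys, k)
    return [row for row in build_rows if row[0] in candidates]


# Probe side where key 7 dominates the stream.
probe_keys = [7] * 50 + list(range(100, 120))
build_rows = [(7, "a"), (100, "b"), (8, "c")]
hot_rows = frequent_build_rows(probe_keys, build_rows)
```

In a real engine the rows in `hot_rows` would be the ones broadcast to other servers so that probe rows with the hot key no longer need to be sent to a single overloaded node.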
-
Publication No.: US11347714B2
Publication Date: 2022-05-31
Application No.: US16182112
Application Date: 2018-11-06
Applicant: Snowflake Inc.
Inventor: Istvan Cseri , Torsten Grabs , Benoit Dageville
IPC: G06F17/30 , G06F16/23 , G06F16/27 , G06F16/2455
Abstract: Systems, methods, and devices for tracking changes to database data are disclosed. A method includes determining a change to be executed on a micro-partition of a table of a database and executing the change on the table by generating a new micro-partition that embodies the change. The method includes updating a table history that includes a log of changes made to the table, wherein each change in the log of changes includes a timestamp, and wherein updating the table history includes inserting the change into the log of changes.
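A minimal Python sketch of the idea in this abstract follows: a change never mutates a micro-partition in place but produces a new one, and the change is appended to a timestamped table history. The class and field names (`MicroPartition`, `Table`, `apply_change`, `history`) are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MicroPartition:
    """An immutable group of rows."""
    rows: tuple


@dataclass
class Table:
    partitions: list = field(default_factory=list)
    history: list = field(default_factory=list)  # log of changes

    def apply_change(self, index, new_rows, description):
        """Replace partition `index` with a new partition and log the change."""
        old = self.partitions[index]
        new = MicroPartition(tuple(new_rows))
        self.partitions[index] = new
        self.history.append({
            "timestamp": datetime.now(timezone.utc),
            "description": description,
            "removed": old,
            "added": new,
        })
        return new


table = Table([MicroPartition((1, 2))])
original = table.partitions[0]
table.apply_change(0, [1, 2, 3], "insert row 3")
```

Because the old partition object is left untouched, the history entry can reference both versions, which is what makes it possible to answer "what changed and when" from the log alone.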
-
Publication No.: US20220156281A1
Publication Date: 2022-05-19
Application No.: US17665262
Application Date: 2022-02-04
Applicant: SNOWFLAKE INC.
Inventor: Benoit Dageville , Thierry Cruanes , Marcin Zukowski
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: A method and apparatus for managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors is to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
-
Publication No.: US11334604B2
Publication Date: 2022-05-17
Application No.: US16746673
Application Date: 2020-01-17
Applicant: Snowflake Inc.
Inventor: Pui Kei Johnston Chu , Benoit Dageville , Matthew Glickman , Christian Kleinerman , Prasanna Krishnan , Justin Langseth
IPC: G06F16/00 , G06F16/28 , G06F9/50 , G06F16/2457 , G06F16/2458
Abstract: Providing a private data exchange is described. An example computer-implemented method can include providing a data exchange by a cloud computing service on behalf of an entity. The data exchange may comprise several data listings provided by one or more data providers. The data listings reference one or more data sets stored in a data storage platform associated with the cloud computing service. The method may also include designating a data exchange administrator account of the data exchange. The data exchange administrator account may be associated with the entity and may be capable of: granting and denying requests from data consumers to access the data exchange; and granting and denying requests from data providers to publish data listings on the data exchange.
-
Publication No.: US11334597B2
Publication Date: 2022-05-17
Application No.: US16945095
Application Date: 2020-07-31
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F15/167 , G06F16/27 , H04L67/568 , G06F9/50 , G06F16/14 , G06F16/182 , G06F16/21 , G06F16/22 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/2453 , G06F16/951 , G06F16/9535 , H04L67/1095 , H04L67/1097 , G06F9/48
Abstract: Example resource management systems and methods are described. In one implementation, a resource manager is configured to manage data processing tasks associated with multiple data elements. An execution platform is coupled to the resource manager and includes multiple execution nodes configured to store data retrieved from multiple remote storage devices. Each execution node includes a cache and a processor, where the cache and processor are independent of the remote storage devices. A metadata manager is configured to access metadata associated with at least a portion of the multiple data elements.
-
Publication No.: US20220138224A1
Publication Date: 2022-05-05
Application No.: US17573550
Application Date: 2022-01-11
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Allison Waingold Lee
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: Embodiments of the present disclosure relate to systems and methods for executing queries on a database platform. A processing device may execute a first operator in a query plan to process a set of data and generate an intermediate result of a query. The intermediate result of the first operator may be pushed, during execution of the query plan, to a plurality of secondary operators as the intermediate result is generated. Each of the plurality of secondary operators may be initiated to concurrently process the intermediate result to generate a plurality of second results, and a timing of processing of the intermediate result by one or more of the plurality of secondary operators is adjusted to coordinate the generation of the plurality of second results. The processing device may execute a final operation on the plurality of second results to generate a final result.
-
Publication No.: US11321325B2
Publication Date: 2022-05-03
Application No.: US17388160
Application Date: 2021-07-29
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Benoit Dageville , Ismail Oukid , Stefan Richter
IPC: G06F16/24 , G06F16/2455 , G06F16/9035 , G06F16/28 , G06F16/22 , G06F17/18
Abstract: A query directed at a source table organized into a set of batch units is received. The query includes a pattern matching predicate that specifies a search pattern. A set of N-grams are generated based on the search pattern. A pruning index associated with the source table is accessed. The pruning index comprises a set of filters that index distinct N-grams in each column of the source table. The pruning index is used to identify a subset of batch units to scan for matching data based on the set of N-grams generated for the search pattern. The query is processed by scanning the subset of batch units.
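The N-gram pruning described in this abstract can be sketched in Python as follows. The abstract speaks of filters; for simplicity this sketch stores exact sets of N-grams per batch unit instead, which is an assumption, as are the function names and the choice of `n=3`.

```python
def ngrams(text, n=3):
    """Return the set of all length-n substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_pruning_index(batch_units, n=3):
    """Build one set of N-grams per batch unit (a list of string values)."""
    index = []
    for unit in batch_units:
        grams = set()
        for value in unit:
            grams |= ngrams(value, n)
        index.append(grams)
    return index


def units_to_scan(index, pattern, n=3):
    """Return indices of batch units that may contain the search pattern.

    A unit is kept only if it contains every N-gram of the pattern.
    Patterns shorter than n yield no N-grams, so every unit is kept.
    """
    needed = ngrams(pattern, n)
    return [i for i, grams in enumerate(index) if needed <= grams]


units = [["snowflake", "warehouse"], ["database", "query"]]
index = build_pruning_index(units)
```

The check is a necessary condition only: a unit that passes may still hold no matching value (its N-grams could come from different strings), so the surviving units still have to be scanned. Units that fail the check can be skipped safely.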
-
Publication No.: US20220129480A1
Publication Date: 2022-04-28
Application No.: US17647123
Application Date: 2022-01-05
Applicant: Snowflake Inc.
Inventor: Ashish Motivala , Benoit Dageville
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: Example systems and methods for cloning catalog objects are described. In one implementation, a method identifies an original catalog object associated with data and creates a duplicate copy of the original catalog object without copying the data itself. The method allows access to the data using the duplicate catalog object and supports modifying the data associated with the original catalog object independently of the duplicate catalog object. The duplicate catalog object can be deleted upon completion of modifying the data associated with the original catalog object.
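As a rough Python illustration of cloning a catalog object without copying its data, the sketch below models a catalog object as a list of references to immutable data files. The `CatalogObject` class, its methods, and the tuple-based stand-in for data files are all hypothetical.

```python
class CatalogObject:
    """A named catalog entry that references immutable data files."""

    def __init__(self, name, file_refs):
        self.name = name
        # Copy the list of references, not the files they point to.
        self.file_refs = list(file_refs)

    def clone(self, new_name):
        """Create a duplicate that shares the same underlying data files."""
        return CatalogObject(new_name, self.file_refs)

    def rewrite_file(self, position, new_file):
        """Modify data by pointing at a new file; old files stay untouched."""
        self.file_refs[position] = new_file


file_a = ("row 1", "row 2")   # stand-ins for immutable data files
file_b = ("row 3",)
original = CatalogObject("orders", [file_a, file_b])
duplicate = original.clone("orders_clone")
original.rewrite_file(0, ("row 1", "row 2 (updated)"))
```

After the rewrite, `duplicate` still references `file_a`, so the two objects can diverge independently while unchanged files remain shared.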
-
Publication No.: US20220129478A1
Publication Date: 2022-04-28
Application No.: US17568542
Application Date: 2022-01-04
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: A method and apparatus for managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors is to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
-
Publication No.: US11308089B2
Publication Date: 2022-04-19
Application No.: US17358154
Application Date: 2021-06-25
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Benoit Dageville , Ismail Oukid , Stefan Richter
IPC: G06F16/24 , G06F16/2455 , G06F16/9035 , G06F16/28 , G06F17/18 , G06F16/22
Abstract: A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the plurality of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.