-
Publication No.: US20220114194A1
Publication date: 2022-04-14
Application No.: US17645275
Filing date: 2021-12-20
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Benoit Dageville , Allison Waingold Lee
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: A system and method for managing data storage and data access with querying data in a distributed system without buffering the results on intermediate operations in disk storage.
-
Publication No.: US20220083684A1
Publication date: 2022-03-17
Application No.: US17537312
Filing date: 2021-11-29
Applicant: Snowflake Inc.
Inventor: Benoit Dageville , Peter Povinec , Philipp Thomas Unterbrunner , Martin Hentschel
Abstract: A method for encrypting database files includes generating a mapping for a plurality of encrypted files. A first encrypted file of the plurality of encrypted files is encrypted with a first encryption key. The method includes generating a second encrypted file by re-encrypting, for a period of time, data in the first encrypted file using a second encryption key. The first encrypted file remains accessible to one or more queries during the period of time. The method includes updating the mapping to associate the second encrypted file with the first encrypted file. The mapping is updated after the second encrypted file has been generated. The method includes preventing a query from accessing the first encrypted file after the second encrypted file has been generated.
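The re-encryption flow in this abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `FileStore` class, its method names, and the XOR stand-in cipher are all assumptions made for the example (XOR is not secure and only keeps the sketch dependency-free). It shows the ordering the abstract describes: the second file is generated while the first stays readable, the mapping is updated afterwards, and the first file then becomes unreachable.

```python
from dataclasses import dataclass, field


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in for a real cipher. XOR is symmetric and NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass
class FileStore:
    keys: dict = field(default_factory=dict)     # key id -> key bytes
    files: dict = field(default_factory=dict)    # file name -> (key id, ciphertext)
    mapping: dict = field(default_factory=dict)  # logical name -> current file name

    def write(self, logical: str, name: str, key_id: str, plaintext: bytes) -> None:
        self.files[name] = (key_id, toy_cipher(plaintext, self.keys[key_id]))
        self.mapping[logical] = name

    def read(self, logical: str) -> bytes:
        """A query resolves the logical name through the mapping, then decrypts."""
        key_id, ciphertext = self.files[self.mapping[logical]]
        return toy_cipher(ciphertext, self.keys[key_id])

    def rekey(self, logical: str, new_name: str, new_key_id: str) -> None:
        old_name = self.mapping[logical]
        old_key_id, old_ciphertext = self.files[old_name]
        plaintext = toy_cipher(old_ciphertext, self.keys[old_key_id])
        # 1. Generate the second file; the mapping still points at the first,
        #    so concurrent reads keep working during this period.
        self.files[new_name] = (new_key_id, toy_cipher(plaintext, self.keys[new_key_id]))
        # 2. Only after the second file exists, update the mapping.
        self.mapping[logical] = new_name
        # 3. Prevent further access to the first file.
        del self.files[old_name]


store = FileStore(keys={"k1": b"alpha", "k2": b"bravo"})
store.write("orders", "orders.f1", "k1", b"hello")
store.rekey("orders", "orders.f2", "k2")
```

A production system would use an authenticated cipher and would delay step 3 until in-flight queries on the old file have finished; the sketch collapses that into a single call.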
-
Publication No.: US20220075776A1
Publication date: 2022-03-10
Application No.: US17455798
Filing date: 2021-11-19
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
Abstract: Disclosed herein are systems and methods for pruning external data. In an embodiment, a database platform receives a query directed at least in part to external data in an external table on an external data storage platform. The external table includes partitions that correspond to storage locations in a source directory of the external data storage platform. The storage locations contain files that contain the external data. The database platform identifies, from external-table metadata that is stored by the database platform and that maps the partitions of the external table to the storage locations in the source directory, a subset of the partitions as including data that potentially satisfies the query. The database platform identifies data that satisfies the query by scanning the identified subset of the partitions, and responds to the query at least in part with the identified data that satisfies the query.
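A small sketch may help make the pruning step concrete. It assumes, purely for illustration, that each partition carries a single string key derived from its directory path (here a date) and that the query filters on a range of that key; the `Partition` class, `prune`, `run_query`, and the sample paths are hypothetical names, not taken from the patent. The metadata lookup selects candidate partitions without touching any files, and only those partitions are then scanned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    key: str       # partition value, assumed here to be a date taken from the path
    location: str  # storage location in the source directory


def prune(partitions, lo, hi):
    """Metadata-only step: keep partitions that may hold rows with lo <= key <= hi."""
    return [p for p in partitions if lo <= p.key <= hi]


def run_query(partitions, read_location, lo, hi, predicate):
    """Scan only the surviving partitions and apply the row-level predicate."""
    rows = []
    for part in prune(partitions, lo, hi):
        rows.extend(r for r in read_location(part.location) if predicate(r))
    return rows


# Hypothetical external data: storage location -> rows found in its files.
FILES = {
    "logs/2021-11-01/": [{"user": "a", "ms": 120}],
    "logs/2021-11-02/": [{"user": "b", "ms": 900}, {"user": "c", "ms": 40}],
    "logs/2021-11-03/": [{"user": "d", "ms": 700}],
}
# External-table metadata kept by the database platform: partition -> location.
PARTITIONS = [Partition(key=loc.split("/")[1], location=loc) for loc in FILES]

slow = run_query(PARTITIONS, FILES.__getitem__, "2021-11-02", "2021-11-03",
                 lambda row: row["ms"] > 500)
```

Here the first location is never read, because its partition key falls outside the requested range.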
-
Publication No.: US11269921B2
Publication date: 2022-03-08
Application No.: US17378574
Filing date: 2021-07-16
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/27 , G06F16/951 , G06F16/182 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097 , H04L29/08
Abstract: A method and apparatus managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
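The monitor-and-update loop in this abstract can be illustrated with a simple threshold rule. The abstract does not say how the new processor count is derived from utilization, so the thresholds, the one-at-a-time step, and the function name `resize` below are assumptions chosen only to show the shape of the control loop.

```python
def resize(current: int, utilization: float, pool_size: int,
           low: float = 0.3, high: float = 0.8) -> int:
    """Return the updated processor count for a set of queries.

    current     -- processors presently provisioned for the query set
    utilization -- observed utilization of those processors, 0.0 to 1.0
    pool_size   -- size of the larger plurality the set is drawn from
    low, high   -- hypothetical thresholds, not taken from the patent
    """
    if utilization > high and current < pool_size:
        return current + 1   # busy: provision one more processor
    if utilization < low and current > 1:
        return current - 1   # idle: release one processor
    return current           # within band, or already at a bound


# One monitoring pass per sample; the query set keeps running on the new count.
count = 4
for sample in (0.92, 0.95, 0.50, 0.10):
    count = resize(count, sample, pool_size=8)
```

A real scheduler would likely smooth the utilization signal and scale in larger steps; the sketch keeps a single sample and a single step to stay readable.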
-
Publication No.: US11269868B2
Publication date: 2022-03-08
Application No.: US17219854
Filing date: 2021-03-31
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
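The notification-driven refresh described above can be sketched as a small metadata object that reacts to the three modification types the abstract names (addition, deletion, update of a file). The class name, the notification dictionary layout, and the choice of file size as the tracked metadata are all hypothetical; the abstract only states that metadata is refreshed in response to such notifications. Note that the sketch never writes to the external storage, consistent with the read-only access the abstract describes.

```python
class ExternalTableMetadata:
    """Database-side metadata about files backing a read-only external table."""

    def __init__(self):
        self.files = {}  # file path -> {"size": bytes}

    def refresh(self, notification: dict) -> None:
        """Apply one change notification from the external storage platform."""
        kind = notification["kind"]
        path = notification["path"]
        if kind in ("added", "updated"):
            self.files[path] = {"size": notification["size"]}
        elif kind == "deleted":
            # Ignore deletes for files that were never registered.
            self.files.pop(path, None)
        else:
            raise ValueError(f"unknown notification kind: {kind}")

    def total_bytes(self) -> int:
        return sum(entry["size"] for entry in self.files.values())


meta = ExternalTableMetadata()
for event in (
    {"kind": "added", "path": "data/a.parquet", "size": 100},
    {"kind": "added", "path": "data/b.parquet", "size": 250},
    {"kind": "updated", "path": "data/a.parquet", "size": 130},
    {"kind": "deleted", "path": "data/b.parquet"},
):
    meta.refresh(event)
```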
-
Publication No.: US20210390083A1
Publication date: 2021-12-16
Application No.: US17459334
Filing date: 2021-08-27
Applicant: SNOWFLAKE INC.
Inventor: Pui Kei Johnston Chu , Benoit Dageville , Shreyas Narendra Desai , German Alberto Gil Echeverri , Prasanna Krishnan , Vishnu Dutt Paladugu , Bowen Zhang
IPC: G06F16/182 , G06F16/11 , G06F16/17 , G06F9/54
Abstract: Provided herein are systems and methods for an efficient method of replicating share objects to remote deployments. For example, the method may comprise modifying a share object of a first account of a data exchange into a global object wherein the share object includes grant metadata indicating share grants to a set of objects of a database. The method may further comprise creating, in a second account of the data exchange, a local replica of the share object on the remote deployment based on the global object, wherein the second account is located in a remote deployment. The set of objects of the database may be replicated to a local database replica on the remote deployment and the share grants may be replicated to the local replica of the share object.
-
Publication No.: US11176136B2
Publication date: 2021-11-16
Application No.: US17249794
Filing date: 2021-03-12
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke , Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/20 , G06F16/2453 , G06F16/22 , G06F16/2455
Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
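To illustrate the skew-detection idea, the sketch below uses a Misra-Gries frequent-items summary as the "lightweight sketch data structure". That choice is an assumption: the abstract does not name a specific sketch, and the function names, the `min_count` threshold, and the sample data are hypothetical. The code finds frequent probe-side join keys in bounded memory, then selects the build-side rows with matching keys, which are the rows the abstract says would be distributed to remote servers (the asynchronous distribution itself is not modeled).

```python
def misra_gries(stream, k):
    """Misra-Gries summary holding at most k - 1 counters.

    Each counter underestimates the true frequency by at most len(stream) / k.
    """
    counters = {}
    for key in stream:
        if key in counters:
            counters[key] += 1
        elif len(counters) < k - 1:
            counters[key] = 1
        else:
            # No free slot: decrement every counter and drop those that hit zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


def frequent_probe_keys(probe_keys, k=4, min_count=10):
    """Probe-side join keys whose sketch counter reaches a hypothetical threshold."""
    return {key for key, n in misra_gries(probe_keys, k).items() if n >= min_count}


def rows_to_distribute(build_rows, hot_keys):
    """Build-side rows whose join key matches a frequent probe-side key."""
    return [row for row in build_rows if row[0] in hot_keys]


probe_side = [7] * 50 + list(range(100, 120))   # join key 7 is heavily skewed
build_side = [(7, "a"), (8, "b"), (7, "c"), (100, "d")]

hot = frequent_probe_keys(probe_side)
to_send = rows_to_distribute(build_side, hot)
```

Because the summary only underestimates counts, any key that passes the threshold is genuinely frequent; rare keys that happen to remain in the summary carry small counters and are filtered out by `min_count`.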
-
Publication No.: US20210344747A1
Publication date: 2021-11-04
Application No.: US17378562
Filing date: 2021-07-16
Applicant: SNOWFLAKE INC.
Inventor: Pui Kei Johnston Chu , Benoit Dageville , Matthew Glickman , Christian Kleinerman , Prasanna Krishnan , Justin Langseth
Abstract: Sharing data in a data exchange across multiple cloud computing platforms and/or cloud computing platform regions is described. An example computer-implemented method can include receiving data sharing information from a data provider for sharing a data set in a data exchange from a first cloud computing entity to a set of second cloud computing entities. In response to receiving the data sharing information, the method may also include creating an account with each of the set of second cloud computing entities. The method may also further include sharing the data set from the first cloud computing entity with the set of second cloud computing entities using at least the corresponding account of that second cloud computing entity.
-
Publication No.: US11157516B2
Publication date: 2021-10-26
Application No.: US17141220
Filing date: 2021-01-04
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/27 , G06F9/50 , G06F9/48 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , H04L29/08
Abstract: A method and apparatus managing a set of processors for a set of queries is described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
-
Publication No.: US20210326341A1
Publication date: 2021-10-21
Application No.: US17364752
Filing date: 2021-06-30
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke , Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/2453 , G06F16/22 , G06F16/2455
Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.