-
Publication No.: US11106678B2
Publication Date: 2021-08-31
Application No.: US17086279
Filing Date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Benoit Dageville, Yi Fang, Martin Hentschel, Ashish Motivala, Spyridon Triantafyllis, Yizhi Zhu
IPC: G06F16/2455, G06F16/23, G06F16/2457, G06F16/22, G06F16/2458, G06F16/27
Abstract: The subject technology receives first metadata corresponding to a set of micro-partitions. The subject technology stores a first data structure and a second data structure in storage as a first file and a second file, the first data structure including the first metadata and the second data structure including second metadata, the first metadata corresponding to a set of micro-partitions, the second metadata for a grouping of the first metadata, the second data structure including information associating the second metadata to the first metadata. The subject technology stores third metadata for a table, the third metadata comprising: cumulative table metadata comprising global information about a plurality of micro-partitions of the table, the cumulative table metadata being stored in a metadata micro-partition associated with the table.
-
Publication No.: US20210256153A1
Publication Date: 2021-08-19
Application No.: US17228379
Filing Date: 2021-04-12
Applicant: Snowflake Inc.
Inventor: Benoit Dageville, Peter Povinec, Philipp Thomas Unterbrunner, Martin Hentschel
Abstract: A method for encrypting database data includes generating an encryption key for a first file stored in a data store, wherein a table in a database comprises an entry pointing to the first file. The method includes generating a second file by encrypting the data of the first file in the data store using the encryption key without modifying the first file. The method includes, in response to generating the second file, modifying the entry in the table to point to the second file, wherein the modification of the entry is performed atomically. A process for rekeying from the first file to the second file may happen in the background without blocking, interfering, or otherwise obstructing user interaction with a database system.
-
Publication No.: US20210248160A1
Publication Date: 2021-08-12
Application No.: US17244578
Filing Date: 2021-04-29
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Marcin Zukowski
IPC: G06F16/27, G06F9/50, G06F16/14, G06F16/21, G06F16/22, G06F16/951, G06F16/182, G06F16/23, G06F16/2455, G06F16/2458, G06F16/9535, G06F16/2453, G06F9/48, H04L29/08
Abstract: A method and apparatus for managing a set of processors for a set of queries is described. In an exemplary embodiment, a device provisions a set of computing resources of a database system, the set of computing resources to process a set of queries of the database system, and determines a utilization of the set of computing resources during processing of the set of queries. The device further updates the set of computing resources based on the utilization of the set of computing resources by the set of queries. Updating the set of computing resources includes updating a number of processors and a set of storage resources to process the set of queries of the database system, the set of storage resources being shared by each of the processors. The device then processes the set of queries using the set of computing resources as updated.
-
Publication No.: US11086875B2
Publication Date: 2021-08-10
Application No.: US17161115
Filing Date: 2021-01-28
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Ismail Oukid, Stefan Richter
IPC: G06F16/24, G06F16/2455, G06F16/22, G06F17/18, G06F16/28, G06F16/9035
Abstract: A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the plurality of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.
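Illustrative sketch (not taken from the patent): the pruning step described in this abstract can be pictured as one filter per micro-partition and column, consulted before any data is scanned. The names `build_pruning_index` and `prune` are hypothetical, and exact Python sets stand in for whatever filter structure the patent actually claims.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def build_pruning_index(partitions: List[List[Row]]) -> List[Dict[str, Set[Any]]]:
    """Build one filter per micro-partition and column (assumption: exact sets)."""
    index: List[Dict[str, Set[Any]]] = []
    for rows in partitions:
        filters: Dict[str, Set[Any]] = {}
        for row in rows:
            for column, value in row.items():
                # Record each distinct value seen in this column of this partition.
                filters.setdefault(column, set()).add(value)
        index.append(filters)
    return index


def prune(index: List[Dict[str, Set[Any]]], column: str, value: Any) -> List[int]:
    """Return the positions of micro-partitions that may hold `column == value`."""
    return [i for i, filters in enumerate(index) if value in filters.get(column, set())]


# Three hypothetical micro-partitions of a source table.
partitions = [
    [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}],
    [{"id": 3, "city": "Lima"}],
    [{"id": 4, "city": "Rome"}],
]
index = build_pruning_index(partitions)
to_scan = prune(index, "city", "Rome")  # only these partitions need scanning
```

Here only the first and third micro-partitions are selected for the predicate `city = 'Rome'`; the second is skipped without being read.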
-
Publication No.: US11080257B2
Publication Date: 2021-08-03
Application No.: US16410695
Filing Date: 2019-05-13
Applicant: Snowflake Inc.
Inventor: Istvan Cseri, Torsten Grabs, Thierry Cruanes, Subramanian Muralidhar, Benoit Dageville
Abstract: Systems, methods, and devices for storing database data in journal tables comprising a snapshot and a log table. A method includes defining a journal table comprising a snapshot and a log table, the snapshot comprising an up-to-date representation of data in the journal table at a point in time. The method includes assigning a timestamp to the snapshot indicating when the snapshot was generated. The method includes receiving a request to execute a transaction on the journal table to modify the data in the journal table, the transaction comprising one or more of an insert, a delete, an update, or a merge. The method includes inserting a new row into the log table in lieu of executing the transaction on the snapshot of the journal table, the new row comprising an indication of a change requested to be made to the journal table based on the transaction.
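Illustrative sketch (not taken from the patent): a journal table as described here pairs a timestamped snapshot with an append-only log, and a read replays the log over the snapshot. The class `JournalTable`, its field names, and the integer timestamps are assumptions made for this example; merge operations are omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

LogEntry = Tuple[int, str, int, Optional[Dict[str, Any]]]  # (timestamp, op, key, row)


@dataclass
class JournalTable:
    snapshot: Dict[int, Dict[str, Any]]  # state of the table at snapshot_ts
    snapshot_ts: int                     # when the snapshot was generated
    log: List[LogEntry] = field(default_factory=list)

    def apply(self, ts: int, op: str, key: int, row: Optional[Dict[str, Any]] = None) -> None:
        """Record the requested change as a new log row; the snapshot is untouched."""
        self.log.append((ts, op, key, row))

    def read(self) -> Dict[int, Dict[str, Any]]:
        """Return the current view: snapshot plus all log rows newer than it."""
        view = {k: dict(v) for k, v in self.snapshot.items()}
        for ts, op, key, row in sorted(self.log, key=lambda entry: entry[0]):
            if ts <= self.snapshot_ts:
                continue  # already reflected in the snapshot
            if op == "delete":
                view.pop(key, None)
            elif op == "insert":
                view[key] = dict(row or {})
            elif op == "update":
                view.setdefault(key, {}).update(row or {})
        return view


table = JournalTable(snapshot={1: {"name": "a"}}, snapshot_ts=10)
table.apply(11, "insert", 2, {"name": "b"})
table.apply(12, "update", 1, {"name": "c"})
table.apply(13, "delete", 2)
current = table.read()
```

Writes stay cheap because they only append to the log; the cost of combining log and snapshot is paid at read time or when a new snapshot is generated.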
-
Publication No.: US20210200769A1
Publication Date: 2021-07-01
Application No.: US17249794
Filing Date: 2021-03-12
Applicant: Snowflake Inc.
Inventor: Florian Andreas Funke, Thierry Cruanes, Benoit Dageville, Marcin Zukowski
IPC: G06F16/2453, G06F16/22, G06F16/2455
Abstract: Systems, methods, and devices for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
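Illustrative sketch (not taken from the patent): the abstract does not say which sketch data structure is used. As a stand-in, the Misra-Gries summary below finds candidate frequent probe-side join keys in a single pass with bounded memory. The function name `frequent_key_candidates` and the sample key stream are hypothetical.

```python
from typing import Dict, Hashable, Iterable


def frequent_key_candidates(keys: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Misra-Gries summary holding at most k - 1 counters.

    Any key occurring in more than 1/k of the stream is guaranteed to be
    among the returned candidates; the counts are lower bounds, not exact.
    """
    counters: Dict[Hashable, int] = {}
    for key in keys:
        if key in counters:
            counters[key] += 1
        elif len(counters) < k - 1:
            counters[key] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


# Hypothetical probe-side key stream: key 7 is heavily skewed.
probe_keys = [7] * 60 + list(range(100, 140))
sketch = frequent_key_candidates(probe_keys, k=4)
hottest = max(sketch, key=sketch.get)
```

A join operator could treat the returned keys as skew candidates and look up the matching build-side rows for broadcast to other servers, as the abstract outlines.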
-
Publication No.: US11048687B2
Publication Date: 2021-06-29
Application No.: US17086245
Filing Date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Benoit Dageville, Martin Hentschel, William Waddington
IPC: G06F16/00, G06F16/23, G06F21/60, G06F16/22, G06F16/2455
Abstract: The subject technology generates and stores a new version set of one or more table-metadata files, the new version set of one or more table-metadata files comprising table metadata for a new version of a database table. The subject technology determines that a plurality of table-metadata files are not included in a cache. The subject technology downloads, in parallel, the plurality of table-metadata files from immutable storage. The subject technology stores, in the cache, the plurality of table-metadata files. The subject technology reads, among the plurality of table-metadata files, a first table-metadata file before a second table-metadata file has been fully downloaded, the plurality of table-metadata files comprising at least the first table-metadata file and the second table-metadata file.
-
Publication No.: US11030186B2
Publication Date: 2021-06-08
Application No.: US16662645
Filing Date: 2019-10-24
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Prasanna Rajaperumal, Jiaqi Yan
Abstract: Systems, methods, and devices for incrementally refreshing a materialized view are disclosed. A method includes generating a materialized view based on a source table. The method includes merging the source table and the materialized view to generate a merged table to identify whether an update has been executed on the source table that is not reflected in the materialized view. The method includes, in response to detecting an update made to the source table that is not reflected in the materialized view, applying the update to the materialized view.
-
Publication No.: US10997179B1
Publication Date: 2021-05-04
Application No.: US17086228
Filing Date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Ismail Oukid, Stefan Richter
IPC: G06F16/24, G06F16/2455, G06F16/9035, G06F16/28, G06F16/22, G06F17/18
Abstract: A query directed at a source table organized into a set of batch units is received. The query includes a pattern matching predicate that specifies a search pattern. A set of N-grams are generated based on the search pattern. A pruning index associated with the source table is accessed. The pruning index comprises a set of filters that index distinct N-grams in each column of the source table. The pruning index is used to identify a subset of batch units to scan for matching data based on the set of N-grams generated for the search pattern. The query is processed by scanning the subset of batch units.
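Illustrative sketch (not taken from the patent): for a substring pattern, the N-gram pruning described here amounts to splitting the search pattern into N-grams and keeping only the batch units whose filter contains all of them. The helper names, the choice of N = 3, the single string column, and the use of exact sets as filters are all assumptions for this example.

```python
from typing import List, Set


def ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the distinct N-grams of `text` (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_ngram_index(batch_units: List[List[str]], n: int = 3) -> List[Set[str]]:
    """Build one filter per batch unit over a single string column."""
    index: List[Set[str]] = []
    for values in batch_units:
        grams: Set[str] = set()
        for value in values:
            grams |= ngrams(value, n)
        index.append(grams)
    return index


def units_to_scan(index: List[Set[str]], pattern: str, n: int = 3) -> List[int]:
    """Return batch units that may contain `pattern` as a substring."""
    wanted = ngrams(pattern, n)
    if not wanted:
        # Pattern shorter than n: the index cannot rule anything out.
        return list(range(len(index)))
    return [i for i, grams in enumerate(index) if wanted <= grams]


batch_units = [["snowflake", "database"], ["warehouse"], ["snowball"]]
index = build_ngram_index(batch_units)
snow_units = units_to_scan(index, "snow")
flake_units = units_to_scan(index, "flake")
```

The selected units still have to be scanned: a unit can contain every N-gram of the pattern without containing the pattern itself, so the index only rules units out.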
-
Publication No.: US10997163B2
Publication Date: 2021-05-04
Application No.: US16943251
Filing Date: 2020-07-30
Applicant: Snowflake Inc.
Inventor: Benoit Dageville, Varun Ganesh, Jiansheng Huang, Jiaxing Liang, Haowei Yu, Scott Ziegler
Abstract: The subject technology obtains, at a data system, an ingest request to ingest one or more files into a table. The subject technology, after obtaining the ingest request and prior to the ingesting of the one or more files, persists the one or more files in a first file queue that corresponds to the table, the first file queue further corresponding to a client account, and the data system further comprising a second file queue that corresponds to both a second client account and a second table. The subject technology ingests, by one or more execution nodes, the one or more files into one or more micro-partitions of the table, each of the one or more micro-partitions comprising contiguous units of storage of a storage device.
-