-
公开(公告)号:US20220391357A1
公开(公告)日:2022-12-08
申请号:US17522276
申请日:2021-11-09
Applicant: Snowflake Inc.
Inventor: Elliott Brossard , Sukruth Komarla Sukumar , Isaac Kunen , Ju-yi Kuo , Jonathan Lee Leang , Edward Ma , Schuyler James Manchester , Polita Paulus , Saurin Shah , Igor Zinkovsky
IPC: G06F16/182 , G06F9/54 , G06F16/13 , G06F16/176 , G06F16/14
Abstract: A file access system for user defined functions (UDFs) can be implemented on a distributed database system. The system can store UDF signatures and interfaces (e.g., classes, sub-classes) that can be called by other users. Upon a UDF being called, one or more interface objects (e.g., InputStream) can be created and requests transferred to a execution node via a network channel. The execution node can implement multiple threads that are authorized and download file data from a staging location (e.g., internal stage, external stage) concurrently.
-
公开(公告)号:US11354316B2
公开(公告)日:2022-06-07
申请号:US17561222
申请日:2021-12-23
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
IPC: G06F16/00 , G06F16/2455 , G06F16/23 , G06F16/22
Abstract: Disclosed herein are systems and methods for selective scanning of external partitions. In an embodiment, a database platform receives a query directed at least in part to an external table stored on an external data storage platform. The external table is partitioned into partitions corresponding to storage locations in the external data storage platform. The database platform prunes, using external-table metadata that is stored by the database platform and that maps the partitions of the external table to the storage locations in the external data storage platform, those partitions that do not potentially contain data that satisfies the query. The database platform identifies data that satisfies the query by scanning any one or more of the partitions of the external table that were not pruned, and responds to the query at least in part with the identified data that satisfies the query.
-
公开(公告)号:US20220121683A1
公开(公告)日:2022-04-21
申请号:US17463313
申请日:2021-08-31
Applicant: Snowflake Inc.
Inventor: Vasile Paraschiv , Saurin Shah , Marianne Shaw , Nileema Shingte
IPC: G06F16/27 , G06F16/2455 , G06F16/22 , G06F9/30 , G06F16/28
Abstract: A database export system exports data using a plurality of nodes that process the data to generate structured result files that are partitioned by an export parameter in an export request. The database export system distributes the data and merges the files to avoid small file creation and increase processing speed via parallelism. The database export system generates the result files of a specified maximum size in a final format, where the files are processed merged in a temporary file format. The parallel processing is optimized and constrained per the amount of processing nodes, available memory, requested final file sizes, and operation based ordering to complete data exports in a scalable multi-stage approach.
-
公开(公告)号:US20220114180A1
公开(公告)日:2022-04-14
申请号:US17561222
申请日:2021-12-23
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
IPC: G06F16/2455 , G06F16/22 , G06F16/23
Abstract: Disclosed herein are systems and methods for selective scanning of external partitions. In an embodiment, a database platform receives a query directed at least in part to an external table stored on an external data storage platform. The external table is partitioned into partitions corresponding to storage locations in the external data storage platform. The database platform prunes, using external-table metadata that is stored by the database platform and that maps the partitions of the external table to the storage locations in the external data storage platform, those partitions that do not potentially contain data that satisfies the query. The database platform identifies data that satisfies the query by scanning any one or more of the partitions of the external table that were not pruned, and responds to the query at least in part with the identified data that satisfies the query.
-
公开(公告)号:US20220027368A1
公开(公告)日:2022-01-27
申请号:US17498382
申请日:2021-10-11
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
IPC: G06F16/2453
Abstract: Disclosed herein are systems and methods for processing queries over external tables. In an embodiment, a database platform receives a query directed at least to data in an external table stored in a storage platform that is external to the database platform. The database platform uses metadata that summarizes the data in the external table to identify one or more partitions of the external table as potentially including data satisfying the query, and generates a query plan that includes a plurality of discrete subtasks that collectively include instructions to scan the identified one or more partitions of the external table for data satisfying the query. The database platform assigns, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform, and refreshes the metadata in response to a threshold number of modifications being made to the external table.
-
公开(公告)号:US20210406311A1
公开(公告)日:2021-12-30
申请号:US17463325
申请日:2021-08-31
Applicant: Snowflake Inc.
Inventor: Elliott Brossard , Sukruth Komarla Sukumar , Isaac Kunen , Ju-yi Kuo , Jonathan Lee Leang , Edward Ma , Schuyler James Manchester , Polita Paulus , Saurin Shah , Igor Zinkovsky
IPC: G06F16/901 , G06F16/955 , G06F16/2455 , G06F16/22 , G06F16/908
Abstract: A file access system for user defined functions (UDFs) can be implemented on a distributed database system. The system can store UDF interfaces and file reference objects that can be called by other users. Upon a UDF being called, files on a stage, one or more interface objects (e.g., InputStream), and file reference objects can be implemented by execution nodes of the distributed database system. The execution nodes can implement multiple threads that are authenticated and can download file data from a staging location concurrently.
-
公开(公告)号:US11194795B2
公开(公告)日:2021-12-07
申请号:US16385837
申请日:2019-04-16
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
-
公开(公告)号:US20210216541A1
公开(公告)日:2021-07-15
申请号:US17219854
申请日:2021-03-31
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
-
公开(公告)号:US20200334242A1
公开(公告)日:2020-10-22
申请号:US16842942
申请日:2020-04-08
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
-
公开(公告)号:US20200334241A1
公开(公告)日:2020-10-22
申请号:US16841831
申请日:2020-04-07
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
IPC: G06F16/242 , G06F3/06 , G06F16/23 , G06F16/25 , G06F16/2453
Abstract: Systems, methods, and devices for querying over an external table are disclosed. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes receiving a query comprising a predicate, the query directed at least to data in the external table. The method includes determining, based on metadata, one or more partitions in the external table comprising data satisfying the predicate. The method includes pruning, based on the metadata, all partitions in the external table that do not comprise any data satisfying the predicate. The method includes generating a query plan comprising a plurality of discrete subtasks. The method includes assigning, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform.
-
-
-
-
-
-
-
-
-