-
11.
Publication No.: US20240427755A1
Publication date: 2024-12-26
Application No.: US18517744
Filing date: 2023-11-22
Applicant: Snowflake Inc.
Inventor: Selcuk Aya, Thierry Cruanes, Istvan Cseri, Benoit Dageville, Marcia Feitel, Steven P. Herbert, Dennis Huo, Xinglian Liu, Nithin Mahesh, James Malone, Subramanian Muralidhar, Muthunagappan Muthuraman, Ronald Lee Ortloff, Polita Paulus, Marianne Shaw, Nileema Shingte, Wai Sing Wong, Jiaqi Yan
IPC: G06F16/22, G06F16/215, G06F16/2457
Abstract: The subject technology provides embodiments for supporting a unified table which may be a managed table or an unmanaged table. Managed tables are those where the subject technology manages the metastore/catalog for the table, whereas unmanaged tables are tables where an external catalog controls the table and the subject technology integrates with that catalog to work with the table, but does not assume control of the table.
-
12.
Publication No.: US12050582B1
Publication date: 2024-07-30
Application No.: US18498463
Filing date: 2023-10-31
Applicant: Snowflake Inc.
Inventor: Selcuk Aya, Thierry Cruanes, Istvan Cseri, Benoit Dageville, Marcia Feitel, Steven P. Herbert, Dennis Huo, Xinglian Liu, Nithin Mahesh, James Malone, Subramanian Muralidhar, Muthunagappan Muthuraman, Ronald Lee Ortloff, Polita Paulus, Marianne Shaw, Nileema Shingte, Wai Sing Wong, Jiaqi Yan
IPC: G06F16/22, G06F16/215, G06F16/2457
CPC classification number: G06F16/2282, G06F16/215, G06F16/24573
Abstract: The subject technology provides embodiments for supporting a unified table which may be a managed table or an unmanaged table. Managed tables are those where the subject technology manages the metastore/catalog for the table, whereas unmanaged tables are tables where an external catalog controls the table and the subject technology integrates with that catalog to work with the table, but does not assume control of the table.
-
Publication No.: US11960505B2
Publication date: 2024-04-16
Application No.: US17664144
Filing date: 2022-05-19
Applicant: Snowflake Inc.
Inventor: Vasile Paraschiv, Saurin Shah, Marianne Shaw, Nileema Shingte
IPC: G06F16/27, G06F9/30, G06F16/13, G06F16/182, G06F16/22, G06F16/2455, G06F16/28, G06F16/11, G06F16/25
CPC classification number: G06F16/278, G06F9/3009, G06F16/137, G06F16/182, G06F16/2282, G06F16/24554, G06F16/283, G06F16/116, G06F16/254
Abstract: A database export system exports data using a plurality of nodes that process the data to generate structured result files that are partitioned by an export parameter in an export request. The database export system distributes the data and merges the files to avoid small file creation and increase processing speed via parallelism. The database export system generates the result files of a specified maximum size in a final format, where the files are processed and merged in a temporary file format. The parallel processing is optimized and constrained per the number of processing nodes, available memory, requested final file sizes, and operation-based ordering to complete data exports in a scalable multi-stage approach.
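As a rough illustration of the partitioned export this abstract describes, the sketch below groups rows by an export parameter and cuts each group into result files capped at a maximum size. It is a single-process toy, not the multi-node method claimed in the patent; `export_partitioned`, `max_rows_per_file`, the row layout, and the file naming are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def export_partitioned(rows, partition_key, max_rows_per_file):
    """Group rows by partition_key, then split each group into files
    holding at most max_rows_per_file rows (hypothetical size limit)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    files = {}
    for value, group in groups.items():
        # Merge a partition's rows first, then cut files of the maximum size,
        # so only the last file per partition can be smaller than the limit.
        for start in range(0, len(group), max_rows_per_file):
            index = start // max_rows_per_file
            name = f"{partition_key}={value}/part-{index:04d}"
            files[name] = group[start:start + max_rows_per_file]
    return files


rows = [{"region": "eu", "id": i} for i in range(5)] + [{"region": "us", "id": 9}]
result = export_partitioned(rows, "region", max_rows_per_file=2)
```

Merging before splitting is what keeps the small-file count low here: each partition ends with at most one undersized file.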
-
Publication No.: US11675780B2
Publication date: 2023-06-13
Application No.: US17650462
Filing date: 2022-02-09
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Benoit Dageville, Thierry Cruanes, Nileema Shingte, Saurin Shah, Torsten Grabs, Istvan Cseri
IPC: G06F16/20, G06F16/242, G06F3/06, G06F16/2453, G06F16/25, G06F16/23
CPC classification number: G06F16/2423, G06F3/0605, G06F3/067, G06F3/0644, G06F16/2393, G06F16/24535, G06F16/24542, G06F16/254
Abstract: Disclosed herein are embodiments of systems and methods for partition-based scanning of external tables for query processing. In an example embodiment, a database platform receives a query that includes one or more predicates, where the query is directed at least to data in an external table that is stored in an external storage platform that is external to the database platform. The database platform identifies, based on metadata that summarizes the data in the external table, one or more partitions of the external table that potentially include data that satisfies the one or more predicates. The database platform also identifies, from the one or more identified partitions, data that satisfies the one or more predicates. The database platform sends a response to the query to the client, the response comprising the data satisfying the one or more predicates.
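The two-step lookup in this abstract (first narrow to partitions that might match, then find the rows that do) can be sketched as follows. This is a minimal sketch that assumes the summarizing metadata is a per-partition minimum and maximum of one integer column; the abstract does not specify the metadata's form, and `PartitionMeta`, `prune`, and `scan` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PartitionMeta:
    """Hypothetical per-partition summary: min and max of one column."""
    name: str
    min_value: int
    max_value: int


def prune(partitions, low, high):
    """Keep partitions whose [min, max] range overlaps the predicate
    low <= value <= high; the rest cannot contain matching rows."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


def scan(partition_rows, candidates, low, high):
    """Scan only the candidate partitions and apply the predicate exactly."""
    return [v for p in candidates for v in partition_rows[p.name] if low <= v <= high]


partition_rows = {"p0": [1, 5, 9], "p1": [10, 14, 19], "p2": [20, 25, 29]}
metadata = [PartitionMeta(n, min(r), max(r)) for n, r in partition_rows.items()]
candidates = prune(metadata, 8, 12)
matches = scan(partition_rows, candidates, 8, 12)
```

Here `p2` is skipped without being read, while `p0` and `p1` are scanned because their ranges only potentially contain matches.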
-
Publication No.: US20220121673A1
Publication date: 2022-04-21
Application No.: US17086221
Filing date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Vasile Paraschiv, Saurin Shah, Marianne Shaw, Nileema Shingte
IPC: G06F16/25, G06F16/13, G06F16/182, G06F16/11
Abstract: A database export system exports data using a plurality of nodes that process the data to generate structured result files that are partitioned by an export parameter in an export request. The database export system distributes the data and merges the files to avoid small file creation and increase processing speed via parallelism. The database export system generates the result files of a specified maximum size in a final format, where the files are processed and merged in a temporary file format. The parallel processing is optimized and constrained per the number of processing nodes, available memory, requested final file sizes, and operation-based ordering to complete data exports in a scalable multi-stage approach.
-
Publication No.: US20220075776A1
Publication date: 2022-03-10
Application No.: US17455798
Filing date: 2021-11-19
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Benoit Dageville, Thierry Cruanes, Nileema Shingte, Saurin Shah, Torsten Grabs, Istvan Cseri
Abstract: Disclosed herein are systems and methods for pruning external data. In an embodiment, a database platform receives a query directed at least in part to external data in an external table on an external data storage platform. The external table includes partitions that correspond to storage locations in a source directory of the external data storage platform. The storage locations contain files that contain the external data. The database platform identifies, from external-table metadata that is stored by the database platform and that maps the partitions of the external table to the storage locations in the source directory, a subset of the partitions as including data that potentially satisfies the query. The database platform identifies data that satisfies the query by scanning the identified subset of the partitions, and responds to the query at least in part with the identified data that satisfies the query.
-
Publication No.: US11269868B2
Publication date: 2022-03-08
Application No.: US17219854
Filing date: 2021-03-31
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Benoit Dageville, Thierry Cruanes, Nileema Shingte, Saurin Shah, Torsten Grabs, Istvan Cseri
Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
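The notification-driven refresh in this abstract can be pictured with a small sketch that keeps file-level metadata in step with add, update, and delete events. The event shape (`kind`, `path`, `size`) and the function `apply_notification` are hypothetical; the abstract does not describe the notification format or what the metadata contains.

```python
def apply_notification(metadata, event):
    """Update the file-level metadata of a hypothetical external table
    in response to one change notification from external storage."""
    kind, path = event["kind"], event["path"]
    if kind in ("added", "updated"):
        # Register the file or overwrite its stale entry.
        metadata[path] = {"size": event["size"]}
    elif kind == "deleted":
        metadata.pop(path, None)
    else:
        raise ValueError(f"unknown notification kind: {kind}")
    return metadata


metadata = {}
events = [
    {"kind": "added", "path": "data/a.parquet", "size": 100},
    {"kind": "added", "path": "data/b.parquet", "size": 200},
    {"kind": "updated", "path": "data/a.parquet", "size": 150},
    {"kind": "deleted", "path": "data/b.parquet"},
]
for event in events:
    apply_notification(metadata, event)
```

Note that only the metadata is modified; consistent with the read-only access the abstract describes, nothing is written back to the external files.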
-
Publication No.: US11138232B1
Publication date: 2021-10-05
Application No.: US17086215
Filing date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Vasile Paraschiv, Saurin Shah, Marianne Shaw, Nileema Shingte
IPC: G06F16/27, G06F16/28, G06F9/30, G06F16/22, G06F16/2455
Abstract: A database export system exports data using a plurality of nodes that process the data to generate structured result files that are partitioned by an export parameter in an export request. The database export system distributes the data and merges the files to avoid small file creation and increase processing speed via parallelism. The database export system generates the result files of a specified maximum size in a final format, where the files are processed and merged in a temporary file format. The parallel processing is optimized and constrained per the number of processing nodes, available memory, requested final file sizes, and operation-based ordering to complete data exports in a scalable multi-stage approach.
-
Publication No.: US11030191B2
Publication date: 2021-06-08
Application No.: US16841831
Filing date: 2020-04-07
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Benoit Dageville, Thierry Cruanes, Nileema Shingte, Saurin Shah, Torsten Grabs, Istvan Cseri
IPC: G06F16/242, G06F3/06, G06F16/2453, G06F16/25, G06F16/23
Abstract: Systems, methods, and devices for querying over an external table are disclosed. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes receiving a query comprising a predicate, the query directed at least to data in the external table. The method includes determining, based on metadata, one or more partitions in the external table comprising data satisfying the predicate. The method includes pruning, based on the metadata, all partitions in the external table that do not comprise any data satisfying the predicate. The method includes generating a query plan comprising a plurality of discrete subtasks. The method includes assigning, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform.
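The last step of this abstract, assigning subtasks to execution nodes based on metadata, might look like the sketch below. The abstract does not say how the assignment is made; the greedy least-loaded strategy, the use of partition size as the cost, and the name `assign_subtasks` are all assumptions made for illustration.

```python
import heapq


def assign_subtasks(partition_sizes, node_names):
    """Greedily give each scan subtask (one per surviving partition) to the
    node with the least work so far, using size metadata as the cost."""
    heap = [(0, name) for name in node_names]
    heapq.heapify(heap)
    plan = {name: [] for name in node_names}
    # Largest partitions first, which tends to balance the greedy result.
    for partition, size in sorted(partition_sizes.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)
        plan[name].append(partition)
        heapq.heappush(heap, (load + size, name))
    return plan


plan = assign_subtasks({"p0": 40, "p1": 30, "p2": 20, "p3": 10}, ["node-a", "node-b"])
```

With these hypothetical sizes, both nodes end up with 50 units of work: `node-a` scans `p0` and `p3`, and `node-b` scans `p1` and `p2`.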
-
Publication No.: US11899646B2
Publication date: 2024-02-13
Application No.: US18193069
Filing date: 2023-03-30
Applicant: Snowflake Inc.
Inventor: Selcuk Aya, Thierry Cruanes, Istvan Cseri, Benoit Dageville, Marcia Feitel, Steven P. Herbert, Xinglian Liu, James Malone, Subramanian Muralidhar, Muthunagappan Muthuraman, Polita Paulus, Marianne Shaw, Nileema Shingte, Wai Sing Wong, Jiaqi Yan
CPC classification number: G06F16/2282, G06F16/2379, G06F16/258
Abstract: The subject technology receives a command to commit a table in a different table format on an external volume. The subject technology generates a first snapshot of the table on internal storage. The subject technology generates a first list of metadata files on the internal storage. The subject technology generates, based on the first list of metadata files, a first set of metadata files on the internal storage. The subject technology generates a second snapshot of the table on the external volume. The subject technology generates a second list of metadata files on the external volume. The subject technology generates, based on the second list of metadata files, a second set of metadata files on the external volume. The subject technology generates a first set of data files in a different file format on the external volume.