Materialized views over external tables in database systems

    Publication No.: US11507571B2

    Publication date: 2022-11-22

    Application No.: US16385720

    Filing date: 2019-04-16

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for generating a materialized view over an external table. A method includes connecting a database platform to an external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating, by the database platform, a materialized view over the external table. The method includes receiving a notification that a modification has been made to the external table, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes, in response to the external table being modified, refreshing the materialized view such that the materialized view comprises an accurate representation of the external table.
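
    The refresh-on-notification flow in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name, the event strings ("add", "update", "delete"), and the read_file callback are all hypothetical, and the external source is assumed to be readable but never written by the platform.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExternalTableView:
    """Toy materialized view that caches rows per source file (hypothetical)."""

    # read_file is the only access to the external source: read-only by design.
    read_file: Callable[[str], list]
    rows_by_file: dict = field(default_factory=dict)

    def on_notification(self, event: str, path: str) -> None:
        """Refresh only the part of the view affected by one file change."""
        if event == "delete":
            self.rows_by_file.pop(path, None)
        elif event in ("add", "update"):
            # Re-read the changed file so the view matches the source again.
            self.rows_by_file[path] = list(self.read_file(path))
        else:
            raise ValueError(f"unknown event: {event}")

    def rows(self) -> list:
        """Return the view contents in a stable, file-sorted order."""
        return [row for path in sorted(self.rows_by_file)
                for row in self.rows_by_file[path]]


# Usage: a dict stands in for the external source location.
source = {"a.csv": [1, 2], "b.csv": [3]}
view = ExternalTableView(read_file=lambda path: source[path])
view.on_notification("add", "a.csv")
view.on_notification("add", "b.csv")
source["a.csv"] = [1, 2, 9]
view.on_notification("update", "a.csv")
view.on_notification("delete", "b.csv")
```

    Keying the cached rows by file is a design choice of this sketch: it lets one notification touch only the rows from the file that changed.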

    Feature release and workload capture in database systems

    Publication No.: US11500838B1

    Publication date: 2022-11-15

    Application No.: US17869071

    Filing date: 2022-07-20

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for feature release and workload capture in database systems are disclosed. The method includes determining a workload based on one or more client queries to be rerun to test a feature that is unreleased to one or more database clients. The method includes repeatedly executing a test run of the workload to determine a stability factor of the test run. The method includes re-executing, in response to determining the stability factor of the test run, the test run using resources with a different concurrency to confirm the stability factor of the test run. The method includes releasing the feature to the one or more database clients in response to confirming the stability factor of the test run.

    Population of file-catalog table for file stage

    Publication No.: US11494438B2

    Publication date: 2022-11-08

    Application No.: US17645415

    Filing date: 2021-12-21

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for population of a file-catalog table for a file stage in a user account on a data platform. In an embodiment, a data platform receives, from a client associated with a user account, a request to populate a file-catalog table of the user account based on a plurality of files stored in a file stage of the user account. The data platform responsively executes a list-files table function with respect to the file stage to generate a database-table object having a row for each file stored in the file stage. The data platform populates the file-catalog table of the user account based on the database-table object generated by the list-files table function.

    Data pruning based on metadata
    Type: Granted invention patent

    Publication No.: US11494337B2

    Publication date: 2022-11-08

    Application No.: US17508705

    Filing date: 2021-10-22

    Applicant: Snowflake Inc.

    Abstract: A system and method for pruning data based on metadata. The method may include receiving a query with a plurality of predicates and identifying one or more applicable files that include database data satisfying at least one of the plurality of predicates. Identifying the one or more applicable files includes reading metadata stored in a metadata store that is separate from the database data. The method further includes pruning inapplicable files comprising database data that does not satisfy at least one of the plurality of predicates to create a reduced set of files and reading the reduced set of files to execute the query.
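
    A minimal sketch of metadata-based pruning follows. It rests on assumptions not stated in the abstract: the metadata store holds per-file minimum and maximum values for each column, predicates are inclusive ranges combined with AND, and all names (RangePredicate, prune_files) are hypothetical.

```python
from typing import NamedTuple


class RangePredicate(NamedTuple):
    """Hypothetical predicate: low <= column <= high."""
    column: str
    low: float
    high: float


def may_match(column_ranges: dict, pred: RangePredicate) -> bool:
    """Return False only when the metadata proves no row can satisfy pred."""
    if pred.column not in column_ranges:
        return True  # no metadata for this column, so the file cannot be pruned
    file_min, file_max = column_ranges[pred.column]
    return file_min <= pred.high and file_max >= pred.low


def prune_files(metadata_store: dict, predicates: list) -> list:
    """Keep files whose min/max metadata is compatible with every predicate."""
    return [name for name, ranges in metadata_store.items()
            if all(may_match(ranges, p) for p in predicates)]


# Usage: the store is consulted without opening any data file.
metadata_store = {
    "f1": {"x": (0, 10)},
    "f2": {"x": (20, 30)},
    "f3": {"y": (1, 2)},
}
reduced = prune_files(metadata_store, [RangePredicate("x", 5, 15)])
```

    Only the files in the reduced set would then be read to execute the query. A file without metadata for a predicate column is kept, because pruning it could drop matching rows.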

    Pruning cutoffs for database systems

    Publication No.: US11475011B2

    Publication date: 2022-10-18

    Application No.: US17540945

    Filing date: 2021-12-02

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives, during a query compilation process, a query directed to a set of source tables, each source table from the set of source tables being organized into at least one micro-partition and the query including at least one pruning operation. The subject technology performs, during the query compilation process, a modification of the query for adjusting the at least one pruning operation, the modification being based at least in part on a set of statistics collected for previous pruning operations on at least a portion of the set of source tables and a set of heuristics. The subject technology compiles the query including the modification of the query. The subject technology provides the compiled query to an execution node of a database system for execution.

    Resilience testing engine
    Type: Granted invention patent

    Publication No.: US11467945B1

    Publication date: 2022-10-11

    Application No.: US17654884

    Filing date: 2022-03-15

    Applicant: Snowflake Inc.

    Abstract: Provided herein are systems and methods for resilience testing. A system includes at least one hardware processor coupled to a memory and configured to decode a workflow to obtain a workload specification and a failure experiment specification. A first set of containers is configured to execute one or more workloads on a testing node. The one or more workloads are defined by the workload specification. A second set of containers is configured to execute one or more failure experiments on the testing node. The one or more failure experiments are based on the failure experiment specification. Execution of the one or more failure experiments triggers an error condition on the testing node. A notification is generated based on at least one metric associated with execution of the one or more workloads and the one or more failure experiments.

Identification of optimal cloud resources for executing workloads

    Publication No.: US20220318215A1

    Publication date: 2022-10-06

    Application No.: US17842642

    Filing date: 2022-06-16

    Applicant: Snowflake Inc.

    Abstract: A system to repeatedly execute a test run of a workload using resources of a cloud environment to determine whether there is a performance difference in the test run. The system to, in response to determining that there is no performance difference, identify one or more sets of decreased resources of the cloud environment. The system to re-execute the test run using the one or more sets of decreased resources of the cloud environment to determine whether there is a performance difference in the test run that is attributed to the one or more sets of decreased resources of the cloud environment. The system to determine minimum resources of the cloud environment to repeatedly execute the test run using the minimum resources without existence of a performance difference in response to re-executing the test run using the one or more sets of decreased resources of the cloud environment.

    Parallel fetching of query result data

    Publication No.: US11449520B1

    Publication date: 2022-09-20

    Application No.: US17501992

    Filing date: 2021-10-14

    Applicant: Snowflake Inc.

    Abstract: Provided herein are systems and methods for query result data processing, including parallel fetching and processing of query result data. A system includes at least one hardware processor coupled to memory and configured to obtain query result information associated with query result data. Multiple result batches are generated based on the query result information. Each result batch of the multiple result batches includes location information and schema information associated with a portion of the query result data. A data processing request corresponding to the result batch is detected. The portion of the query result data associated with the result batch is retrieved in response to the data processing request. The retrieving uses the location information within the result batch. The portion of the query result data is parsed using the schema information, to generate parsed result data.
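
    The result-batch idea can be sketched as below, under stated assumptions: each batch carries only a location key and a schema, the stored data is comma-separated text, and a plain dict stands in for remote storage. ResultBatch, fetch_all, and the storage layout are hypothetical names for illustration, not the patented design.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class ResultBatch:
    """Describes one portion of a query result without holding its data."""
    location: str   # where the raw portion lives (a key into storage here)
    schema: tuple   # (column name, converter) pairs, in column order

    def fetch(self, storage: dict) -> list:
        """Retrieve this batch by location and parse it with its schema."""
        rows = []
        for line in storage[self.location].splitlines():
            values = line.split(",")
            rows.append({name: convert(value)
                         for (name, convert), value in zip(self.schema, values)})
        return rows


def fetch_all(batches: list, storage: dict, max_workers: int = 4) -> list:
    """Fetch batches in parallel; results come back in batch order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda batch: batch.fetch(storage), batches))


# Usage: two batches share one schema and are fetched concurrently.
storage = {"chunk-0": "1,a\n2,b", "chunk-1": "3,c"}
schema = (("id", int), ("name", str))
batches = [ResultBatch("chunk-0", schema), ResultBatch("chunk-1", schema)]
parsed = fetch_all(batches, storage)
```

    Because each batch is self-describing, it could also be handed to a separate worker process and fetched there, which is one reason to keep location and schema inside the batch.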
