SYSTEMS AND METHODS FOR SELECTIVE SCANNING OF EXTERNAL PARTITIONS

    Publication No.: US20220114180A1

    Publication Date: 2022-04-14

    Application No.: US17561222

    Filing Date: 2021-12-23

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for selective scanning of external partitions. In an embodiment, a database platform receives a query directed at least in part to an external table stored on an external data storage platform. The external table is partitioned into partitions corresponding to storage locations in the external data storage platform. The database platform prunes, using external-table metadata that is stored by the database platform and that maps the partitions of the external table to the storage locations in the external data storage platform, those partitions that do not potentially contain data that satisfies the query. The database platform identifies data that satisfies the query by scanning any one or more of the partitions of the external table that were not pruned, and responds to the query at least in part with the identified data that satisfies the query.
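
The pruning step described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the external-table metadata records a minimum and maximum value per partition (the abstract only says the metadata maps partitions to storage locations), and the names `PartitionMeta`, `prune`, `scan`, and `run_query`, as well as the dictionary standing in for the external storage platform, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionMeta:
    """Metadata the database platform keeps for one external partition."""
    location: str   # storage location in the external platform (hypothetical path)
    min_value: int  # assumed per-partition statistic, not stated in the abstract
    max_value: int  # assumed per-partition statistic, not stated in the abstract


def prune(partitions, lo, hi):
    """Keep only partitions whose value range overlaps the query range [lo, hi]."""
    return [p for p in partitions if p.max_value >= lo and p.min_value <= hi]


def scan(partition, store, lo, hi):
    """Read one partition from the (simulated) external store and filter its rows."""
    return [v for v in store[partition.location] if lo <= v <= hi]


def run_query(partitions, store, lo, hi):
    """Prune first, then scan only the partitions that survived pruning."""
    results = []
    for p in prune(partitions, lo, hi):
        results.extend(scan(p, store, lo, hi))
    return results
```

Pruning uses only the locally stored metadata, so partitions that cannot contain matching rows are never fetched from external storage.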

    DETECTING DATA SKEW IN A JOIN OPERATION

    Publication No.: US20220035814A1

    Publication Date: 2022-02-03

    Application No.: US17502685

    Filing Date: 2021-10-15

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
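
One way to picture the skew-detection step is with a Misra-Gries frequency summary. The abstract does not name the sketch it uses, so Misra-Gries is a stand-in chosen here for illustration; the function names and the `(join_key, payload)` row layout are likewise assumptions. The asynchronous distribution to remote servers is left out of this sketch.

```python
def misra_gries(keys, k):
    """Summarize a key stream with at most k - 1 counters.

    Any key occurring more than len(keys) / k times is guaranteed to
    remain in the result; infrequent keys may also appear.
    """
    counters = {}
    for key in keys:
        if key in counters:
            counters[key] += 1
        elif len(counters) < k - 1:
            counters[key] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for tracked in list(counters):
                counters[tracked] -= 1
                if counters[tracked] == 0:
                    del counters[tracked]
    return counters


def frequent_build_rows(build_rows, probe_keys, k):
    """Return build-side rows whose join key is a frequent probe-side candidate.

    build_rows is a list of (join_key, payload) tuples (assumed layout).
    """
    candidates = misra_gries(probe_keys, k)
    return [row for row in build_rows if row[0] in candidates]
```

The rows returned by `frequent_build_rows` are the ones a real system would replicate to other servers so that the heavy probe-side key no longer lands on a single node.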

    PROCESSING OF QUERIES OVER EXTERNAL TABLES

    Publication No.: US20220027368A1

    Publication Date: 2022-01-27

    Application No.: US17498382

    Filing Date: 2021-10-11

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for processing queries over external tables. In an embodiment, a database platform receives a query directed at least to data in an external table stored in a storage platform that is external to the database platform. The database platform uses metadata that summarizes the data in the external table to identify one or more partitions of the external table as potentially including data satisfying the query, and generates a query plan that includes a plurality of discrete subtasks that collectively include instructions to scan the identified one or more partitions of the external table for data satisfying the query. The database platform assigns, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform, and refreshes the metadata in response to a threshold number of modifications being made to the external table.
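
The subtask assignment and threshold-based refresh in this abstract can be illustrated roughly as below. The sketch assumes the metadata includes a size per partition and uses a greedy least-loaded heuristic to place scan subtasks; neither detail comes from the abstract, and `assign_subtasks` and `RefreshCounter` are hypothetical names.

```python
import heapq


def assign_subtasks(partition_sizes, nodes):
    """Assign one scan subtask per partition to the least-loaded node.

    partition_sizes maps partition name -> size taken from metadata
    (an assumed statistic). Largest partitions are placed first.
    """
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    assignment = {node: [] for node in nodes}
    for part, size in sorted(partition_sizes.items(), key=lambda kv: -kv[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(part)
        heapq.heappush(heap, (load + size, node))
    return assignment


class RefreshCounter:
    """Trigger a metadata refresh once a threshold of modifications is reached."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pending = 0
        self.refreshes = 0

    def record_modification(self):
        """Count one modification; return True if this one triggers a refresh."""
        self.pending += 1
        if self.pending >= self.threshold:
            self.refreshes += 1  # a real system would rebuild the metadata here
            self.pending = 0
            return True
        return False
```

Batching refreshes behind a threshold trades metadata freshness for fewer refresh passes over the external table.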

    TRANSACTIONAL PROCESSING OF CHANGE TRACKING DATA

    Publication No.: US20220019570A1

    Publication Date: 2022-01-20

    Application No.: US17491106

    Filing Date: 2021-09-30

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for transactional processing of change tracking data for a database are discussed. A method includes generating a micro-partition based on execution of a transaction on a table of a database, the micro-partition reflecting changes made to the table by the transaction. A change tracking entry is generated in response to the execution of the transaction. The change tracking entry includes an indication of one or more modifications made to the table by the transaction and an indication of the micro-partition generated based on the execution of the transaction. The change tracking entry is stored in the micro-partition as metadata. At least one existing micro-partition is removed from the table, responsive to storing the change tracking entry.
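
The sequence in this abstract (write a new micro-partition, record a change tracking entry in it, then retire the old micro-partition) might look like the following in miniature. All class names, field names, and the ID scheme are hypothetical, and the table is modeled as a plain dictionary of partition ID to micro-partition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChangeEntry:
    """Change tracking entry: what the transaction modified, and where."""
    txn_id: int
    modifications: tuple  # e.g. (("update", key, new_value),)
    partition_id: str     # micro-partition generated by the transaction


@dataclass
class MicroPartition:
    partition_id: str
    rows: tuple  # tuple of (key, value) pairs; row data is never edited in place
    change_tracking: list = field(default_factory=list)  # stored as metadata


def apply_update(table, old_id, txn_id, key, new_value):
    """Run an update transaction against the micro-partition old_id."""
    old = table[old_id]
    new_rows = tuple((k, new_value if k == key else v) for k, v in old.rows)
    new_id = f"{old_id}-t{txn_id}"  # hypothetical naming scheme
    new_part = MicroPartition(new_id, new_rows)
    # Store the change tracking entry inside the new micro-partition.
    entry = ChangeEntry(txn_id, (("update", key, new_value),), new_id)
    new_part.change_tracking.append(entry)
    table[new_id] = new_part
    # Only after the entry is stored is the existing micro-partition removed.
    del table[old_id]
    return new_part
```

Because the entry travels with the micro-partition it describes, a reader of the table can reconstruct which transaction produced each partition without a separate log.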

    Automated maintenance of external tables in database systems

    Publication No.: US11194795B2

    Publication Date: 2021-12-07

    Application No.: US16385837

    Filing Date: 2019-04-16

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for automated maintenance of external tables in database systems are disclosed. A method includes receiving, by a database platform, read access to content in an external data storage platform that is separate from the database platform. The method includes defining an external table based on the content in the external data storage platform. The method includes connecting the database platform to the external table such that the database platform has read access for the external table and does not have write access for the external table. The method includes generating metadata for the external table, the metadata comprising information about data stored in the external table. The method includes receiving a notification that a modification has been made to the content in the external data storage platform, the modification comprising one or more of an addition of a file, a deletion of a file, or an update to a file in a source location for the external table. The method includes refreshing the metadata for the external table in response to the modification being made to the content in the external data storage platform.
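
The notification-driven refresh in this abstract can be sketched as a small event handler. The event names, the metadata layout (file path mapped to a size), and the function name are assumptions made for illustration. The handler only updates the platform's own metadata and never writes to the external files, which matches the read-only access the abstract describes.

```python
def refresh_metadata(metadata, event, file_path, size=None):
    """Update external-table metadata in response to one storage notification.

    metadata maps file path -> {"size": ...} (assumed layout).
    event is one of "add", "update", or "delete" (assumed names).
    """
    if event in ("add", "update"):
        metadata[file_path] = {"size": size}
    elif event == "delete":
        metadata.pop(file_path, None)  # tolerate duplicate delete notifications
    else:
        raise ValueError(f"unknown event: {event}")
    return metadata
```

A short usage example with hypothetical file paths:

```python
meta = {}
refresh_metadata(meta, "add", "stage/2021/part-0.parquet", size=1024)
refresh_metadata(meta, "update", "stage/2021/part-0.parquet", size=2048)
refresh_metadata(meta, "delete", "stage/2021/part-0.parquet")
```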
