Detecting data skew in a join operation

    Publication No.: US11347738B2

    Publication date: 2022-05-31

    Application No.: US17502685

    Application date: 2021-10-15

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.
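
The detection step in this abstract can be illustrated with a small sketch. The abstract does not say which sketch data structure is used, so the code below substitutes a Misra-Gries frequent-items summary as an assumption. The names `MisraGries` and `find_skewed_build_rows`, and the `"key"` field on each row, are hypothetical. The asynchronous distribution to remote servers is not modeled.

```python
from collections import defaultdict


class MisraGries:
    """Frequent-items summary that keeps at most k - 1 counters (assumed sketch type)."""

    def __init__(self, k):
        self.k = k
        self.counters = {}

    def add(self, key):
        if key in self.counters:
            self.counters[key] += 1
        elif len(self.counters) < self.k - 1:
            self.counters[key] = 1
        else:
            # No free counter: decrement all and drop those that reach zero.
            for tracked in list(self.counters):
                self.counters[tracked] -= 1
                if self.counters[tracked] == 0:
                    del self.counters[tracked]

    def candidates(self):
        # Any key seen more than n / k times is guaranteed to be here;
        # some infrequent keys may also remain (false positives).
        return set(self.counters)


def find_skewed_build_rows(build_rows, probe_keys, k=4):
    # Build phase: hash table from join key to build-side rows.
    build_table = defaultdict(list)
    for row in build_rows:
        build_table[row["key"]].append(row)

    # Probe phase: stream every probe-side join key through the sketch.
    sketch = MisraGries(k)
    for key in probe_keys:
        sketch.add(key)

    # Build-side rows whose join key is a frequent probe-side candidate.
    # These are the rows a real system would replicate to other servers.
    return {key: build_table[key]
            for key in sketch.candidates() if key in build_table}
```

The summary uses memory proportional to `k` rather than to the number of distinct keys, which is the property that makes this kind of structure cheap enough to run during the probe phase.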

    Tracking changes in database data
    Granted invention patent

    Publication No.: US11347714B2

    Publication date: 2022-05-31

    Application No.: US16182112

    Application date: 2018-11-06

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for tracking changes to database data. A method includes determining a change to be executed on a micro-partition of a table of a database and executing the change on the table by generating a new micro-partition that embodies the change. The method includes updating a table history that includes a log of changes made to the table, wherein each change in the log of changes includes a timestamp, and wherein updating the table history includes inserting the change into the log of changes.
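
The change-tracking scheme in this abstract can be sketched as follows: a change never edits a micro-partition in place but produces a new one, and each change is appended to a timestamped log. This is a minimal in-memory illustration. The class names, field names, and the `apply_change` signature are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class MicroPartition:
    partition_id: int
    rows: tuple  # contents are never modified after creation


@dataclass
class Table:
    partitions: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # log of changes
    _next_id: int = 0

    def add_partition(self, rows):
        pid = self._next_id
        self._next_id += 1
        self.partitions[pid] = MicroPartition(pid, tuple(rows))
        return pid

    def apply_change(self, old_id, transform, description):
        # Execute the change by generating a new micro-partition
        # that embodies it, instead of mutating the old one.
        old = self.partitions.pop(old_id)
        new_id = self.add_partition(transform(old.rows))
        # Insert the change into the table history with a timestamp.
        self.history.append({
            "timestamp": datetime.now(timezone.utc),
            "description": description,
            "replaced": old_id,
            "created": new_id,
        })
        return new_id
```

For example, `table.apply_change(pid, lambda rows: [r for r in rows if r != 2], "delete 2")` replaces partition `pid` with a new partition that lacks the value `2` and records one history entry.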

    Private data exchange
    Granted invention patent

    Publication No.: US11334604B2

    Publication date: 2022-05-17

    Application No.: US16746673

    Application date: 2020-01-17

    Applicant: Snowflake Inc.

    Abstract: Providing a private data exchange is described. An example computer-implemented method can include providing a data exchange by a cloud computing service on behalf of an entity. The data exchange may comprise several data listings provided by one or more data providers. The data listings reference one or more data sets stored in a data storage platform associated with the cloud computing service. The method may also include designating a data exchange administrator account of the data exchange. The data exchange administrator account may be associated with the entity and may be capable of: granting and denying requests from data consumers to access the data exchange; and granting and denying requests from data providers to publish data listings on the data exchange.
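
The administrator role in this abstract can be modeled with a toy class in which only the designated administrator account may grant or deny consumer access requests and provider publish requests. Everything below is an assumption made for illustration: the `DataExchange` class, its method names, and the use of plain strings for accounts and data set references.

```python
class DataExchange:
    """Toy model of a private exchange gated by one administrator account."""

    def __init__(self, admin_account):
        self.admin_account = admin_account
        self.consumers = set()  # accounts granted access
        self.listings = {}      # listing name -> (provider, data set reference)

    def _require_admin(self, account):
        if account != self.admin_account:
            raise PermissionError(f"{account} is not the exchange administrator")

    def review_access_request(self, reviewer, consumer, approve):
        # Grant or deny a data consumer's request to access the exchange.
        self._require_admin(reviewer)
        if approve:
            self.consumers.add(consumer)
        return approve

    def review_publish_request(self, reviewer, provider, name, dataset_ref, approve):
        # Grant or deny a data provider's request to publish a listing.
        self._require_admin(reviewer)
        if approve:
            self.listings[name] = (provider, dataset_ref)
        return approve

    def visible_listings(self, account):
        # Only the administrator and approved consumers can see listings.
        if account != self.admin_account and account not in self.consumers:
            return []
        return sorted(self.listings)
```

Listings store a reference to the data set rather than a copy, mirroring the abstract's statement that listings reference data held in the storage platform.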

    Pruning index maintenance
    Granted invention patent

    Publication No.: US11308089B2

    Publication date: 2022-04-19

    Application No.: US17358154

    Application date: 2021-06-25

    Applicant: Snowflake Inc.

    Abstract: A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the plurality of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.
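
The pruning step in this abstract can be sketched with one filter per column per micro-partition. The abstract does not specify the filter type, so the code below assumes a small Bloom filter. `BloomFilter`, `build_pruning_index`, and `partitions_to_scan` are hypothetical names, and each micro-partition is represented as a list of row dictionaries.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a Python int

    def _positions(self, value):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_pruning_index(micro_partitions):
    # One dict per micro-partition, mapping column name to its filter.
    index = []
    for rows in micro_partitions:
        filters = {}
        for row in rows:
            for column, value in row.items():
                filters.setdefault(column, BloomFilter()).add(value)
        index.append(filters)
    return index


def partitions_to_scan(index, column, value):
    # Keep only the micro-partitions whose filter may contain the value.
    return [i for i, filters in enumerate(index)
            if column in filters and filters[column].might_contain(value)]
```

Because a Bloom filter never reports a present value as absent, pruning with it cannot skip a micro-partition that holds matching rows. It can only fail to skip one that does not.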
