Parallel scan of single file using multiple threads

    公开(公告)号:US11586621B1

    公开(公告)日:2023-02-21

    申请号:US17586493

    申请日:2022-01-27

    Applicant: Snowflake Inc.

    Abstract: Multiple execution threads process a query directed to a database organized into a plurality of files. In processing the query, a first thread downloads a file from the plurality of files. The file comprises a set of blocks. A parallel scan of the set of blocks is performed by at least the first thread and a second thread to identify data that matches the query. A response to the query is provided based in part on the parallel scan of the set of blocks.

    Detecting data skew in a join operation

    公开(公告)号:US11347738B2

    公开(公告)日:2022-05-31

    申请号:US17502685

    申请日:2021-10-15

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.

    Systems, methods, and devices for managing data skew in a join operation

    公开(公告)号:US11176136B2

    公开(公告)日:2021-11-16

    申请号:US17249794

    申请日:2021-03-12

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.

    SYSTEMS, METHODS, AND DEVICES FOR MANAGING DATA SKEW IN A JOIN OPERATION

    公开(公告)号:US20210326341A1

    公开(公告)日:2021-10-21

    申请号:US17364752

    申请日:2021-06-30

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.

    EFFICIENT DATABASE QUERY EVALUATION

    公开(公告)号:US20210240670A1

    公开(公告)日:2021-08-05

    申请号:US16779366

    申请日:2020-01-31

    Applicant: Snowflake Inc

    Abstract: Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decompressed. The responsive data is returned in response to the database query. When a query is run on a table that is compressed using dictionary compression, the uncompressed data may be returned along with the dictionary look-up values. The recipient of the data may use the dictionary look-up values for memoization, reducing the amount of computation required to process the returned data.

    Systems, methods, and devices for managing data skew in a join operation

    公开(公告)号:US10970283B2

    公开(公告)日:2021-04-06

    申请号:US16716819

    申请日:2019-12-17

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.

    Systems, methods, and devices for managing data skew in a join operation

    公开(公告)号:US10970282B2

    公开(公告)日:2021-04-06

    申请号:US16005182

    申请日:2018-06-11

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices, for managing data skew during a join operation are disclosed. A method includes computing a hash value for a join operation and detecting data skew on a probe side of the join operation at a runtime of the join operation using a lightweight sketch data structure. The method includes identifying a frequent probe-side join key on the probe side of the join operation during a probe phase of the join operation. The method includes identifying a frequent build-side row having a build-side join key corresponding with the frequent probe-side join key. The method includes asynchronously distributing the frequent build-side row to one or more remote servers.

Patent Agency Ranking