System and method for disjunctive joins using a lookup table

    Publication number: US11010378B1

    Publication date: 2021-05-18

    Application number: US16818485

    Application date: 2020-03-13

    Applicant: Snowflake Inc.

    Abstract: Joining data with a disjunctive operator by means of a lookup table is described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. For each row in a probe-side table, the method may further include looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. The method may also include returning the results set.
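The lookup-table approach in this abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes equality predicates only, it assumes the join condition is (all conjunctive predicates) AND (at least one disjunctive predicate), and the names `build_lookup` and `disjunctive_join` are hypothetical.

```python
from collections import defaultdict


def build_lookup(build_rows, column):
    """Map each value of `column` to the set of build-row indices holding it."""
    table = defaultdict(set)
    for idx, row in enumerate(build_rows):
        table[row[column]].add(idx)
    return table


def disjunctive_join(build_rows, probe_rows, conjunctive, disjunctive):
    """Join on equality predicates given as (build_column, probe_column) pairs.

    Assumed semantics: every conjunctive predicate must match and, if any
    disjunctive predicates are given, at least one of them must match.
    """
    # One lookup table per predicate, built once from the build side.
    conj_tables = [(build_lookup(build_rows, b), p) for b, p in conjunctive]
    disj_tables = [(build_lookup(build_rows, b), p) for b, p in disjunctive]

    results = []
    for probe in probe_rows:
        matches = None
        # Conjunctive predicates: intersect the candidate build rows.
        for table, probe_col in conj_tables:
            hits = table.get(probe[probe_col], set())
            matches = hits if matches is None else matches & hits
        # Disjunctive predicates: union the candidates, so a build row that
        # satisfies several disjuncts is still emitted only once.
        if disj_tables:
            any_hit = set()
            for table, probe_col in disj_tables:
                any_hit |= table.get(probe[probe_col], set())
            matches = any_hit if matches is None else matches & any_hit
        for idx in sorted(matches or ()):
            results.append((build_rows[idx], probe))
    return results
```

Collecting row indices in sets is the design choice that matters here: a build row matching on both disjuncts contributes one output row rather than two, which a naive union of two separate equi-joins would not guarantee.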

    PIPELINE LEVEL OPTIMIZATION OF AGGREGATION OPERATORS IN A QUERY PLAN DURING RUNTIME

    Publication number: US20210089535A1

    Publication date: 2021-03-25

    Application number: US16857817

    Application date: 2020-04-24

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side. The subject technology inserts an aggregation operator below the probe side of the join operation. The subject technology causes the build side of the join operation to generate a hash table. The subject technology causes the build side of the join operation to generate a bloom filter based at least in part on the hash table and provide information, corresponding to properties of the build side, to a bloom filter. Based at least in part on the information, the subject technology determines at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode.

    BUILD-SIDE SKEW HANDLING FOR JOIN OPERATIONS
    Type: Published application

    Publication number: US20240273096A1

    Publication date: 2024-08-15

    Application number: US18644323

    Application date: 2024-04-24

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/24537 G06F16/2255

    Abstract: A method includes generating, by at least one hardware processor of a first computing node, a plurality of hash values using build-side row data. A frequent hash value of the plurality of hash values is detected based on row size associated with a plurality of build-side row sets including the build-side row data. A plurality of hash partitions of the build-side row data is generated using a build-side row set of the plurality of build-side row sets that includes the frequent hash value. The plurality of hash partitions of the build-side row data is distributed to a corresponding plurality of hash-join-build (HJB) instances associated with a plurality of join operations.

    BUILD-SIDE SKEW HANDLING FOR HASH-PARTITIONING HASH JOINS

    Publication number: US20240134851A1

    Publication date: 2024-04-25

    Application number: US18047872

    Application date: 2022-10-18

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/24537 G06F16/2255

    Abstract: Provided herein are systems and methods for handling build-side skew. For example, a method includes computing a plurality of hash values for a join operation. The join operation uses a corresponding plurality of row sets. The plurality of hash values are sampled to detect a frequent hash value. A build-side row set is partitioned using the frequent hash value to generate a partitioned build-side row set. The build-side row set is selected from the plurality of row sets. The partitioned build-side row set is distributed to a plurality of hash-join-build (HJB) instances executing at a corresponding plurality of servers.
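The sampling and partitioning steps in this abstract can be illustrated with a short Python sketch. It is an assumption-laden simplification, not the patented method: the hash-join-build (HJB) instances are modeled as plain lists, Python's built-in `hash` stands in for the join hash function, frequent values are spread round-robin, and the function names and the `threshold` parameter are hypothetical.

```python
import random
from collections import Counter


def detect_frequent_hashes(hashes, sample_size, threshold, seed=0):
    """Sample the hash values and return those whose share of the sample
    is at least `threshold` (a fraction between 0 and 1)."""
    rng = random.Random(seed)
    sample = rng.sample(hashes, min(sample_size, len(hashes)))
    counts = Counter(sample)
    return {h for h, c in counts.items() if c / len(sample) >= threshold}


def distribute_build_rows(rows, key, num_instances, frequent):
    """Assign build-side rows to HJB instances (modeled as lists).

    Rows with an ordinary hash go to the instance chosen by hash
    partitioning; rows with a frequent hash are spread round-robin so
    that no single instance receives the whole skewed key.
    """
    partitions = [[] for _ in range(num_instances)]
    next_slot = 0
    for row in rows:
        h = hash(row[key])
        if h in frequent:
            partitions[next_slot].append(row)
            next_slot = (next_slot + 1) % num_instances
        else:
            partitions[h % num_instances].append(row)
    return partitions
```

One consequence of this sketch: once the build rows for a frequent key are spread over several instances, probe rows carrying that key have to be sent to every one of those instances for the join result to stay complete.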

    Systems and methods for spilling data for hash joins

    Publication number: US11550793B1

    Publication date: 2023-01-10

    Application number: US17721599

    Application date: 2022-04-15

    Applicant: Snowflake Inc.

    Abstract: Systems and methods for spilling data for hash joins are described. An example method includes determining an amount of available space in a first memory used by a set of relational queries is insufficient for a first relational join query. The first relational join query comprises a join operation. The method also includes determining a set of build memory sizes and a set of probe memory sizes for a set of partitions for the set of relational queries. The method further includes identifying a first partition of the set of partitions based on the set of probe memory sizes and the set of build memory sizes. The method further includes copying the first partition from the first memory to a second memory, wherein the first partition comprises a first build portion and a first probe portion.
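A minimal Python sketch of the spilling decision follows. The abstract does not state how the partition is chosen from the build and probe memory sizes, so the rule used here (spill the partition with the largest combined footprint) is purely an assumption. The `Partition` class, the function names, and the use of dictionaries to stand in for the first and second memories are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    build_bytes: int  # memory held by the partition's build portion
    probe_bytes: int  # memory held by the partition's probe portion


def used_bytes(in_memory):
    """Total footprint of all partitions still resident in the first memory."""
    return sum(p.build_bytes + p.probe_bytes for p in in_memory.values())


def spill_until_fits(in_memory, spilled, required_bytes, capacity_bytes):
    """Move partitions from `in_memory` to `spilled` until `required_bytes`
    are free or nothing is left to spill.

    Assumed victim rule: the partition with the largest build + probe size.
    Build and probe portions move together, as in the abstract.
    """
    while in_memory and capacity_bytes - used_bytes(in_memory) < required_bytes:
        victim = max(
            in_memory,
            key=lambda pid: in_memory[pid].build_bytes + in_memory[pid].probe_bytes,
        )
        spilled[victim] = in_memory.pop(victim)
    return spilled
```

Spilling the build and probe portions of the same partition together keeps matching rows co-located, so the spilled partition can later be joined on its own.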

    Aggregation operator optimization during query runtime

    Publication number: US11468063B2

    Publication date: 2022-10-11

    Application number: US17232821

    Application date: 2021-04-16

    Applicant: Snowflake Inc.

    Abstract: The subject technology provides information, corresponding to properties of a build side of a join operation, to a bloom filter. Based at least in part on the information from the bloom filter, the subject technology determines, during execution of a query plan, at least one property of the join operation in order to decide whether to switch an aggregation operator to a pass-through mode, the at least one property comprising at least a reduction rate. In response to the reduction rate being below a threshold value, the subject technology switches the aggregation operator to the pass-through mode during runtime of the query plan. While the aggregation operator is in the pass-through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream matches the output stream of data flowing out of the aggregation operator.
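The pass-through switch can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the patented mechanism: the reduction rate is measured directly on the rows the operator has seen rather than derived from bloom filter information, it is assumed to mean 1 minus (distinct groups / input rows), the aggregate is a sum, and the `AdaptiveAggregator` class with its `threshold` and `warmup_rows` parameters is hypothetical.

```python
class AdaptiveAggregator:
    """Sum-aggregation operator that switches to pass-through mode when
    grouping stops reducing the row count enough to be worthwhile."""

    def __init__(self, key, value, threshold=0.5, warmup_rows=1000):
        self.key = key
        self.value = value
        self.threshold = threshold      # minimum useful reduction rate
        self.warmup_rows = warmup_rows  # rows to observe before deciding
        self.groups = {}
        self.rows_seen = 0
        self.pass_through = False

    def consume(self, batch):
        """Process one batch and return the rows to emit downstream now."""
        if self.pass_through:
            # Pass-through: the output stream equals the input stream.
            return list(batch)
        for row in batch:
            k = row[self.key]
            self.groups[k] = self.groups.get(k, 0) + row[self.value]
        self.rows_seen += len(batch)
        if self.rows_seen and self.rows_seen >= self.warmup_rows:
            reduction = 1 - len(self.groups) / self.rows_seen
            if reduction < self.threshold:
                # Too little reduction: stop aggregating and hand the
                # partial results collected so far to the next operator.
                self.pass_through = True
                return self.flush()
        return []

    def flush(self):
        """Emit the accumulated partial aggregates and reset the state."""
        out = [{self.key: k, self.value: v} for k, v in self.groups.items()]
        self.groups = {}
        return out
```

The sketch assumes a final aggregation further up the plan merges the partial sums, so emitting unaggregated rows after the switch does not change the query result.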

    SYSTEM AND METHOD FOR DISJUNCTIVE JOINS USING A LOOKUP TABLE

    Publication number: US20210286817A1

    Publication date: 2021-09-16

    Application number: US17235826

    Application date: 2021-04-20

    Applicant: Snowflake Inc.

    Abstract: Joining data with a disjunctive operator by means of a lookup table is described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. For each row in a probe-side table, the method may further include looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. The method may also include returning the results set.
