IMPLEMENTING MULTIDIMENSIONAL TWO-SIDED INTERVAL JOINS ON DATA PLATFORMS

    Publication No.: US20230085410A1

    Publication date: 2023-03-16

    Application No.: US18050130

    Filing date: 2022-10-27

    Applicant: Snowflake Inc.

    Abstract: In an embodiment, a data platform receives a query that includes a two-sided N dimensional interval join of first and second input relations, where N>1. The two-sided N dimensional interval join has an interval-join predicate that, in each of N dimensions, compares intervals determined from the first and second input relations. The data platform implements the interval join at least in part by identifying an intermediate relation that includes all combinations of a row from the first input relation and a row from the second input relation where, in each of the N dimensions, the intervals determined from the first and second input relations both overlap a common N dimensional domain region of an input domain of the first and second input relations. The data platform obtains and returns results of the query.

    Multidimensional two-sided interval joins on hash-equality-join infrastructure

    Publication No.: US11494385B2

    Publication date: 2022-11-08

    Application No.: US17454894

    Filing date: 2021-11-15

    Applicant: Snowflake Inc.

    Abstract: In an embodiment, a data platform implements a two-sided N dimensional interval join using an N dimensional band join followed by a filter that applies a predicate of the interval join. The data platform generates first and second modified relations from first and second input relations. Each modified relation includes a copy of each row from the corresponding input relation for each input-domain cell that overlaps, in each of N dimensions, a bounding polygon of intervals determined from the row of the corresponding input relation. The data platform inserts, in each row in each modified relation, an input-domain-cell identifier of the corresponding overlapping input-domain cell and uses a hash-equality join that receives the first and second modified relations and that is keyed on the input-domain-cell identifiers. The data platform obtains results of a query by executing a query-execution plan that includes the query-plan section.
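    The copy-per-cell, equality-join, then filter pipeline described in this abstract can be illustrated with a small in-memory sketch. It is not the patented implementation: the uniform grid, the `cell_size` parameter, and the names `cells_for`, `overlaps`, and `interval_join` are all hypothetical, and each row is assumed to be an axis-aligned box given as one closed `(lo, hi)` interval per dimension.

```python
from collections import defaultdict
from itertools import product


def cells_for(box, cell_size):
    """Return ids of every grid cell the box overlaps in all dimensions.

    Assumption: a uniform grid with square cells of width `cell_size`.
    """
    ranges = [range(int(lo // cell_size), int(hi // cell_size) + 1)
              for lo, hi in box]
    return product(*ranges)


def overlaps(a, b):
    """Interval-join predicate: the closed intervals overlap in every dimension."""
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))


def interval_join(left, right, cell_size):
    # "Modified relation" for the right side: one copy of the row per
    # overlapped cell, hashed on the cell identifier.
    table = defaultdict(list)
    for j, box in enumerate(right):
        for cell in cells_for(box, cell_size):
            table[cell].append(j)

    # Probe with the left side's copies (the equality join on cell id),
    # then apply the real predicate as the trailing filter.
    matches = set()
    for i, box in enumerate(left):
        for cell in cells_for(box, cell_size):
            for j in table.get(cell, ()):
                if overlaps(box, right[j]):
                    matches.add((i, j))
    return sorted(matches)


left = [((0, 5), (0, 5)), ((10, 12), (10, 12))]
right = [((4, 6), (4, 6)), ((20, 21), (0, 1)), ((11, 30), (11, 30))]
result = interval_join(left, right, cell_size=4)
```

    Two boxes that overlap must share at least one grid cell, so the equality join on cell identifiers cannot miss a true match; the filter then discards pairs that merely share a cell. In this sketch a set removes pairs that meet in several cells.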

    IMPLEMENTING MULTIDIMENSIONAL TWO-SIDED INTERVAL JOINS USING SAMPLING-BASED INPUT-DOMAIN DEMARCATION

    Publication No.: US20220300512A1

    Publication date: 2022-09-22

    Application No.: US17454899

    Filing date: 2021-11-15

    Applicant: Snowflake Inc.

    Abstract: In an embodiment, a data platform receives a query that includes a two-sided N dimensional interval join of first and second input relations. The data platform samples, with respect to each of one or more of the N dimensions, one or both of the first input relation and the second input relation with respect to an interval size of an interval determined from the input relation. The data platform demarcates the N dimensional input domain into non-overlapping N dimensional input-domain cells based on the sampling. The data platform implements the interval join using a query-execution plan that includes an equality join that is keyed on input-domain-cell identifiers of input-domain cells that at least partially overlap bounding polygons of the intervals determined from the first and second input relations. The equality join is followed in the query-execution plan by a filter that applies the interval join predicate. The data platform obtains results of the query by executing the query-execution plan.
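    As a rough illustration of sampling-based demarcation, the sketch that follows samples interval sizes per dimension and derives one cell width per dimension from the sample. The abstract does not say how the sample is turned into cell boundaries; using the median sampled width is purely an assumption here, and `sample_cell_sizes`, `cell_id`, `sample_size`, and `seed` are hypothetical names.

```python
import random
import statistics


def sample_cell_sizes(rows, sample_size=100, seed=0):
    """Pick one cell width per dimension from a sample of interval sizes.

    Each row is a box: one closed (lo, hi) interval per dimension.
    Assumption: the median sampled width is used as the cell width.
    """
    rng = random.Random(seed)
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)
    n_dims = len(sample[0])
    sizes = []
    for d in range(n_dims):
        widths = [box[d][1] - box[d][0] for box in sample]
        # Guard against zero-width cells when all sampled intervals are points.
        sizes.append(max(statistics.median(widths), 1e-9))
    return sizes


def cell_id(point, cell_sizes):
    """Identifier of the non-overlapping grid cell that contains `point`."""
    return tuple(int(x // s) for x, s in zip(point, cell_sizes))


rows = [((0, 2), (0, 10)), ((1, 5), (3, 23)), ((2, 8), (0, 30))]
sizes = sample_cell_sizes(rows)
```

    Tying the cell width to typical interval sizes is one way to keep the number of per-row copies small: cells much narrower than the intervals multiply copies, while cells much wider push more work onto the trailing filter.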

    Implementing multidimensional two-sided interval joins using sampling-based input-domain demarcation

    Publication No.: US11537614B2

    Publication date: 2022-12-27

    Application No.: US17454899

    Filing date: 2021-11-15

    Applicant: Snowflake Inc.

    Abstract: In an embodiment, a data platform receives a query that includes a two-sided N dimensional interval join of first and second input relations. The data platform samples, with respect to each of one or more of the N dimensions, one or both of the first input relation and the second input relation with respect to an interval size of an interval determined from the input relation. The data platform demarcates the N dimensional input domain into non-overlapping N dimensional input-domain cells based on the sampling. The data platform implements the interval join using a query-execution plan that includes an equality join that is keyed on input-domain-cell identifiers of input-domain cells that at least partially overlap bounding polygons of the intervals determined from the first and second input relations. The equality join is followed in the query-execution plan by a filter that applies the interval-join predicate. The data platform obtains results of the query by executing the query-execution plan.

    PRE-FILTER DEDUPLICATION FOR MULTIDIMENSIONAL TWO-SIDED INTERVAL JOINS

    Publication No.: US20220300511A1

    Publication date: 2022-09-22

    Application No.: US17239529

    Filing date: 2021-04-23

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for pre-filter deduplication for multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform implements the two-sided N dimensional interval join as a query-plan section that includes an N dimensional band join that is followed by a deduplication operator that is followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals determined from the input relations in each of the N dimensions. The deduplication operator removes duplicate rows from a potential-duplicates subset of the output of the N dimensional band join.
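    The band join, deduplication, filter ordering can be sketched as follows. The abstract does not define the potential-duplicates subset; this sketch assumes it consists of candidate pairs in which both rows span more than one cell, since only those pairs can meet in several cells. The uniform grid and all names (`band_join`, `dedup_then_filter`, `cell_size`) are hypothetical.

```python
from collections import defaultdict
from itertools import product


def cells_for(box, cell_size):
    """Ids of all uniform-grid cells overlapped by the box (assumed grid)."""
    ranges = [range(int(lo // cell_size), int(hi // cell_size) + 1)
              for lo, hi in box]
    return list(product(*ranges))


def overlaps(a, b):
    """Interval-join predicate: closed intervals overlap in every dimension."""
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))


def band_join(left, right, cell_size):
    """Yield (i, j, may_repeat) candidates from a hash join keyed on cell id."""
    table = defaultdict(list)
    for j, box in enumerate(right):
        cells = cells_for(box, cell_size)
        for cell in cells:
            table[cell].append((j, len(cells) > 1))
    for i, box in enumerate(left):
        cells = cells_for(box, cell_size)
        for cell in cells:
            for j, right_multi in table.get(cell, ()):
                # A pair can repeat only if both rows span several cells.
                yield i, j, (len(cells) > 1 and right_multi)


def dedup_then_filter(candidates, left, right, predicate):
    seen = set()
    out = []
    for i, j, may_repeat in candidates:
        if may_repeat:  # only the potential-duplicates subset is tracked
            if (i, j) in seen:
                continue
            seen.add((i, j))
        if predicate(left[i], right[j]):  # filter runs once per distinct pair
            out.append((i, j))
    return out


left = [((0, 9), (0, 9))]
right = [((3, 8), (3, 8)), ((1, 2), (1, 2))]
pairs = dedup_then_filter(band_join(left, right, 5), left, right, overlaps)
```

    Placing the deduplication before the filter means the predicate is evaluated once per distinct candidate pair, and pairs that cannot repeat bypass the `seen` set entirely.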

    Multidimensional and multi-relation sampling for implementing multidimensional two-sided interval joins

    Publication No.: US11194808B1

    Publication date: 2021-12-07

    Application No.: US17239521

    Filing date: 2021-04-23

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for multidimensional and multi-relation sampling for implementing multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform samples interval sizes in one or more input relations, and demarcates an N dimensional input domain based on the sampling. The data platform implements the two-sided N dimensional interval join using an N dimensional band join followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals in the input relations in each of the N dimensions.

    Pre-filter deduplication for multidimensional two-sided interval joins

    Publication No.: US11494379B2

    Publication date: 2022-11-08

    Application No.: US17239529

    Filing date: 2021-04-23

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for pre-filter deduplication for multidimensional two-sided interval joins. In an embodiment, a data platform receives query instructions for a two-sided N dimensional interval join, where N is an integer greater than 1. The two-sided N dimensional interval join has an interval-join predicate that compares intervals determined from the input relations in each of N dimensions. The data platform implements the two-sided N dimensional interval join as a query-plan section that includes an N dimensional band join that is followed by a deduplication operator that is followed by a filter that applies the interval-join predicate. The N dimensional band join includes a hash join keyed to N dimensional domain cells overlapped at least in part by intervals determined from the input relations in each of the N dimensions. The deduplication operator removes duplicate rows from a potential-duplicates subset of the output of the N dimensional band join.

    Multidimensional two-sided interval joins on distributed hash-based-equality-join infrastructure

    Publication No.: US11216464B1

    Publication date: 2022-01-04

    Application No.: US17239515

    Filing date: 2021-04-23

    Applicant: Snowflake Inc.

    Abstract: Disclosed herein are systems and methods for implementing multidimensional two-sided interval joins on a distributed hash-based-equality-join infrastructure. In an embodiment, a data platform receives, for a query on a database, query instructions that include a two-sided N-dimensional interval join of a first input relation and a second input relation, where N is an integer greater than 1. The two-sided N-dimensional interval join has an interval-join predicate that, in each of N dimensions, compares an interval determined from the first input relation with an interval determined from the second input relation. The data platform generates a query-execution plan that implements the two-sided N-dimensional interval join as a query-plan section that includes an N-dimensional band join followed by a filter that applies the interval-join predicate to a band-join output of the N-dimensional band join. The data platform obtains results of the query at least in part by executing the query-execution plan.
