JOIN QUERY PROCESSING USING PRUNING INDEX

    Publication No.: US20220292098A1

    Publication date: 2022-09-15

    Application No.: US17804630

    Filing date: 2022-05-31

    Applicant: Snowflake Inc.

    Abstract: A query directed at a table organized into a set of batch units is received. The query comprises a predicate for which values are unknown prior to runtime. A set of values for the predicate are determined based on the query. An index access plan is created based on the set of values. Based on the index access plan, the set of batch units are pruned using a pruning index associated with the table. The pruning index comprises a set of filters that index distinct values in each column of the table. The pruning of the set of batch units comprises identifying a subset of batch units to scan for data that satisfies the query. The subset of batch units of the table are scanned to identify data that satisfies the query.
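As a rough illustration of the flow this abstract describes, the sketch below prunes the probe side of a join using values that only become known once the build side has been read. Everything in it is an assumption made for illustration: the `BatchUnit` class, the function names, and the use of an exact Python set per batch unit as a stand-in for the filters of the pruning index. The patent's actual index access plan is not specified in this listing.

```python
from dataclasses import dataclass


@dataclass
class BatchUnit:
    """Hypothetical batch unit: a name plus (join_key, payload) rows."""
    name: str
    rows: list


def build_pruning_index(units):
    # Stand-in for per-unit filters: the exact set of distinct keys per unit.
    return {u.name: {key for key, _ in u.rows} for u in units}


def join_with_pruning(build_rows, units, index):
    # The predicate values are unknown until the build side is read at runtime.
    runtime_values = {key for key, _ in build_rows}
    # "Index access plan" in this sketch: probe each unit's filter with the values.
    scan_set = [u for u in units if index[u.name] & runtime_values]
    build = {}
    for key, payload in build_rows:
        build.setdefault(key, []).append(payload)
    # Scan only the surviving units for rows that satisfy the join.
    results = [
        (key, b, payload)
        for u in scan_set
        for key, payload in u.rows
        if key in build
        for b in build[key]
    ]
    return [u.name for u in scan_set], results


units = [
    BatchUnit("u1", [(1, "a"), (2, "b")]),
    BatchUnit("u2", [(3, "c"), (4, "d")]),
    BatchUnit("u3", [(5, "e"), (6, "f")]),
]
index = build_pruning_index(units)
scanned, rows = join_with_pruning([(3, "x"), (6, "y")], units, index)
```

Here only `u2` and `u3` are scanned, because no key from the build side appears in `u1`.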

    Prefix N-gram indexing
    Granted invention patent

    Publication No.: US11275738B2

    Publication date: 2022-03-15

    Application No.: US17484817

    Filing date: 2021-09-24

    Applicant: Snowflake Inc.

    Abstract: A table organized into a set of batch units is accessed. A set of N-grams are generated for a data value in the source table. The set of N-grams include a first N-gram of a first length and a second N-gram of a second length where the first N-gram corresponds to a prefix of the second N-gram. A set of fingerprints are generated for the data value based on the set of N-grams. The set of fingerprints include a first fingerprint generated based on the first N-gram and a second fingerprint generated based on the second N-gram and the first fingerprint. A pruning index that indexes distinct values in each column of the source table is generated based on the set of fingerprints and stored in a database with an association with the source table.

    PRUNING INDEX GENERATION FOR PATTERN MATCHING QUERIES

    Publication No.: US20210357411A1

    Publication date: 2021-11-18

    Application No.: US17388160

    Filing date: 2021-07-29

    Applicant: Snowflake Inc.

    Abstract: A query directed at a source table organized into a set of batch units is received. The query includes a pattern matching predicate that specifies a search pattern. A set of N-grams are generated based on the search pattern. A pruning index associated with the source table is accessed. The pruning index comprises a set of filters that index distinct N-grams in each column of the source table. The pruning index is used to identify a subset of batch units to scan for matching data based on the set of N-grams generated for the search pattern. The query is processed by scanning the subset of batch units.
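The pruning step described here can be sketched for a SQL `LIKE`-style pattern as follows. This is a simplified illustration under several assumptions: N-grams have a fixed length of 3, each batch unit's filter is modeled as an exact set of N-grams, escape sequences in the pattern are ignored, and all names are hypothetical.

```python
import re

N = 3  # assumed N-gram length


def ngrams(text, n=N):
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def pattern_ngrams(like_pattern, n=N):
    # Split the pattern on the LIKE wildcards and collect N-grams from
    # each literal piece; pieces shorter than n contribute nothing.
    grams = set()
    for literal in re.split(r"[%_]", like_pattern):
        grams |= ngrams(literal, n)
    return grams


def build_index(units):
    # One set of distinct N-grams per batch unit (stand-in for a filter).
    return {
        name: set().union(*(ngrams(value) for value in values))
        for name, values in units.items()
    }


def units_to_scan(index, like_pattern):
    needed = pattern_ngrams(like_pattern)
    if not needed:
        return sorted(index)  # nothing to probe with, so nothing is pruned
    # A unit can contain a match only if it contains every required N-gram.
    return sorted(name for name, grams in index.items() if needed <= grams)


units = {
    "u1": ["error: disk full", "ok"],
    "u2": ["warning: low memory"],
    "u3": ["disk error"],
}
index = build_index(units)
```

With this data, `units_to_scan(index, "%disk%full%")` keeps only `u1`, while a pattern whose literals are all shorter than three characters, such as `%ab%`, cannot prune anything.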

    Database query processing using a pruning index

    Publication No.: US11086875B2

    Publication date: 2021-08-10

    Application No.: US17161115

    Filing date: 2021-01-28

    Applicant: Snowflake Inc.

    Abstract: A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the plurality of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.
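A minimal sketch of the idea, assuming each filter is a small Bloom filter kept per micro-partition and per column. The abstract says only "a set of filters", so the Bloom filter, its sizing, and every name below are illustrative assumptions rather than the patented design.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; the bit array is stored in a Python int."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        data = repr(value).encode()
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, value):
        for position in self._positions(value):
            self.bits |= 1 << position

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits >> p & 1 for p in self._positions(value))


def build_pruning_index(partitions):
    # One filter per (micro-partition, column), indexing distinct values.
    index = {}
    for name, columns in partitions.items():
        for column, values in columns.items():
            bloom = BloomFilter()
            for value in set(values):
                bloom.add(value)
            index[(name, column)] = bloom
    return index


def partitions_to_scan(index, partitions, column, value):
    return [
        name for name in partitions
        if index[(name, column)].might_contain(value)
    ]


partitions = {
    "mp1": {"city": ["Oslo", "Lima"], "id": [1, 2]},
    "mp2": {"city": ["Rome", "Cairo"], "id": [3, 4]},
}
index = build_pruning_index(partitions)
to_scan = partitions_to_scan(index, partitions, "city", "Rome")
```

Because a Bloom filter can report false positives but never false negatives, `to_scan` always includes `mp2` and may occasionally include a partition that does not hold the value; the subsequent scan resolves that.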

    Pruning indexes to enhance database query processing

    Publication No.: US10769150B1

    Publication date: 2020-09-08

    Application No.: US16727315

    Filing date: 2019-12-26

    Applicant: Snowflake Inc.

    Abstract: A source table organized into a set of micro-partitions is accessed by a network-based data warehouse. A pruning index is generated based on the source table. The pruning index comprises a set of filters that indicate locations of distinct values in each column of the source table. A query directed at the source table is received at the network-based data warehouse. The query is processed using the pruning index. The processing of the query comprises pruning the set of micro-partitions of the source table to scan for data matching the query, the pruning of the plurality of micro-partitions comprising identifying, using the pruning index, a sub-set of micro-partitions to scan for the data matching the query.

    PRUNING TECHNIQUES FOR PROCESSING TOP K QUERIES

    Publication No.: US20240168953A1

    Publication date: 2024-05-23

    Application No.: US18534382

    Filing date: 2023-12-08

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/24557 G06F16/24578

    Abstract: A top K query directed at a table is received. The table is organized into multiple storage units. The top K query comprises a first clause to sort a result set in order and a second clause that specifies a limit on a number of results provided in response to the query. A table scan operator identifies a first set of rows from the table based on a scan set determined for the table and provides the first set of rows to a top K operator. The top K operator determines a current boundary based on the first set of rows and provides the current boundary to the table scan operator. The table scan operator prunes the scan set based on the current boundary and identifies a second set of rows from the table based on the pruning.
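The feedback loop between the two operators can be sketched for a query of the form `ORDER BY x DESC LIMIT k`. The sketch assumes that each storage unit is a non-empty list of numbers, that its maximum is available as metadata, and that units are visited in descending order of that maximum. The visiting order and all names are assumptions, not details taken from the patent.

```python
import heapq


def top_k_desc(units, k):
    """Return (top k values in descending order, number of units scanned)."""
    # Scan set, ordered so that promising units come first (an assumption).
    scan_set = sorted(units, key=max, reverse=True)
    heap = []  # min-heap holding the current top k; heap[0] is the boundary
    scanned = 0
    for unit in scan_set:
        # The top K operator reports its boundary once it holds k rows.
        boundary = heap[0] if len(heap) == k else None
        # The scan operator prunes units that cannot beat the boundary.
        if boundary is not None and max(unit) <= boundary:
            continue
        scanned += 1
        for row in unit:
            if len(heap) < k:
                heapq.heappush(heap, row)
            elif row > heap[0]:
                heapq.heapreplace(heap, row)
    return sorted(heap, reverse=True), scanned


units = [[1, 5, 3], [10, 12, 11], [2, 4, 6], [9, 8, 7]]
result, scanned = top_k_desc(units, k=3)
```

In this example the first unit scanned already yields the boundary 10, so the remaining three units are pruned without being read.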

    Indexed regular expression search with N-grams

    Publication No.: US11681708B2

    Publication date: 2023-06-20

    Application No.: US17934977

    Filing date: 2022-09-23

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/24557 G06F16/2272 G06F16/283 G06F16/9035

    Abstract: A query directed at a source table organized into a set of batch units is received. The query comprises a regular expression search pattern. The regular expression search pattern is converted to a pruning index predicate comprising a set of substring literals extracted from the regular expression search pattern. A set of N-grams is generated based on the set of substring literals extracted from the regular expression search pattern. A pruning index associated with the source table is accessed. The pruning index indexes distinct N-grams in each column of the source table. A subset of batch units to scan for data matching the query are identified based on the pruning index and the set of N-grams. The query is processed by scanning the subset of batch units.
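The conversion from a regular expression to substring literals can be illustrated with a deliberately conservative sketch. It only understands patterns built from plain characters and the wildcards `.`, `.*`, and `.+`; for any other metacharacter it extracts nothing, which simply disables pruning instead of risking a wrong result. The supported subset, the N-gram length of 3, the exact sets used in place of filters, and all names are assumptions made for illustration.

```python
import re

META = set(".^$*+?{}[]\\|()")


def required_literals(pattern):
    """Literals that every match must contain, or [] if unsure."""
    literals, current = [], ""
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == ".":
            if current:
                literals.append(current)
                current = ""
            # Treat '.*' and '.+' as a single wildcard token.
            if i + 1 < len(pattern) and pattern[i + 1] in "*+":
                i += 1
        elif ch in META:
            return []  # unsupported construct: give up on pruning
        else:
            current += ch
        i += 1
    if current:
        literals.append(current)
    return literals


def ngrams(text, n=3):
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(units, n=3):
    index = {}
    for name, values in units.items():
        grams = set()
        for value in values:
            grams |= ngrams(value, n)
        index[name] = grams
    return index


def units_to_scan(index, pattern, n=3):
    needed = set()
    for literal in required_literals(pattern):
        needed |= ngrams(literal, n)
    # With no required N-grams, every unit passes and nothing is pruned.
    return [name for name, grams in index.items() if needed <= grams]


units = {
    "u1": ["error on disk 3"],
    "u2": ["all good"],
    "u3": ["disk ok", "error code 7"],
}
index = build_index(units)
scan = units_to_scan(index, "error.*disk")
```

Here `u2` is pruned. `u3` is kept even though none of its values match, because its N-grams come from different rows; the scan that follows filters it out, so pruning stays safe but is not exact.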

    INDEXED REGULAR EXPRESSION SEARCH WITH N-GRAMS

    Publication No.: US20230084069A1

    Publication date: 2023-03-16

    Application No.: US17934977

    Filing date: 2022-09-23

    Applicant: Snowflake Inc.

    Abstract: A query directed at a source table organized into a set of batch units is received. The query comprises a regular expression search pattern. The regular expression search pattern is converted to a pruning index predicate comprising a set of substring literals extracted from the regular expression search pattern. A set of N-grams is generated based on the set of substring literals extracted from the regular expression search pattern. A pruning index associated with the source table is accessed. The pruning index indexes distinct N-grams in each column of the source table. A subset of batch units to scan for data matching the query are identified based on the pruning index and the set of N-grams. The query is processed by scanning the subset of batch units.
