-
Publication number: US20240330300A1
Publication date: 2024-10-03
Application number: US18738252
Application date: 2024-06-10
Applicant: Snowflake Inc.
Inventor: Matthias Carl Adams, Mahmud Allahverdiyev, Ismail Oukid, Peter Popov, Alejandro Salinger
IPC: G06F16/2455, G06F16/22, G06F16/28, G06F16/9035, G06F17/18
CPC classification number: G06F16/24557, G06F16/2272, G06F16/283, G06F16/9035, G06F17/18
Abstract: A method to perform an indexed geospatial search includes retrieving, by at least one hardware processor, a query specifying a geography data column and a constant geography object. A first plurality of hash functions of a first set of cells covering a surface associated with the geography data column is determined. A search index of a database including the geography data column is updated based on the first plurality of hash functions to obtain an updated search index. The query is executed on a reduced scan set of the database. The reduced scan set is based on the updated search index.
-
Publication number: US11989184B2
Publication date: 2024-05-21
Application number: US18305993
Application date: 2023-04-24
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Ismail Oukid, Stefan Richter, Alejandro Salinger
IPC: G06F16/24, G06F16/22, G06F16/2455, G06F16/28, G06F16/9035
CPC classification number: G06F16/24557, G06F16/2272, G06F16/283, G06F16/9035
Abstract: A query directed at a source table organized into a set of batch units is received. The query comprises a regular expression search pattern. The regular expression search pattern is converted to a pruning index predicate comprising a set of substring literals extracted from the regular expression search pattern. A set of N-grams is generated based on the set of substring literals extracted from the regular expression search pattern. A pruning index associated with the source table is accessed. The pruning index indexes distinct N-grams in each column of the source table. A subset of batch units to scan for data matching the query is identified based on the pruning index and the set of N-grams. The query is processed by scanning the subset of batch units.
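The flow in the abstract above (regular expression, then required substring literals, then N-grams, then a reduced set of batch units) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the N-gram length of 3, the use of plain Python sets in place of the pruning index's filters, the very small regex subset handled by `required_literals`, and all function names. It is not the method claimed in the patent.

```python
import re
from typing import Dict, List, Set, Tuple

N = 3  # assumed N-gram length; the abstract does not state one


def required_literals(pattern: str) -> List[str]:
    """Substrings every match must contain, for a tiny regex subset.

    Handles literal characters plus '.', '*', '+' and '?'. Any other
    construct yields no literals, which simply disables pruning.
    """
    if any(ch in pattern for ch in "|()[]{}\\^$"):
        return []
    literals: List[str] = []
    run = ""
    for i, ch in enumerate(pattern):
        nxt = pattern[i + 1] if i + 1 < len(pattern) else ""
        if ch in ".*+?":
            keep, flush = False, True   # metacharacter ends the current run
        elif nxt in ("*", "?"):
            keep, flush = False, True   # optional character: not required
        elif nxt == "+":
            keep, flush = True, True    # required once, then the run ends
        else:
            keep, flush = True, False   # ordinary required character
        if keep:
            run += ch
        if flush and run:
            literals.append(run)
            run = ""
    if run:
        literals.append(run)
    return literals


def ngrams(text: str, n: int = N) -> Set[str]:
    """Distinct n-grams of text (empty when text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(batch_units: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """One set of distinct n-grams per batch unit (stand-in for a filter)."""
    return {unit: set().union(*(ngrams(v) for v in values))
            for unit, values in batch_units.items()}


def regex_search(batch_units: Dict[str, List[str]],
                 index: Dict[str, Set[str]],
                 pattern: str) -> Tuple[List[str], List[str]]:
    """Return (batch units scanned, matching values)."""
    needed: Set[str] = set()
    for literal in required_literals(pattern):
        needed |= ngrams(literal)
    # A unit can only match if it contains every required n-gram.
    candidates = [u for u, grams in index.items() if needed <= grams]
    rx = re.compile(pattern)
    hits = [v for u in candidates for v in batch_units[u] if rx.search(v)]
    return candidates, hits
```

Literals shorter than the N-gram length contribute no N-grams, so in this sketch they never cause a batch unit to be skipped; the final `rx.search` pass removes any false positives the index lets through.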
-
Publication number: US11816107B2
Publication date: 2023-11-14
Application number: US18146912
Application date: 2022-12-27
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev, Selcuk Aya, Bowei Chen, Ismail Oukid
IPC: G06F16/24, G06F16/2455, G06F16/9035, G06F16/28, G06F17/18, G06F16/22
CPC classification number: G06F16/24557, G06F16/2272, G06F16/283, G06F16/9035, G06F17/18
Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.
-
Publication number: US20230342362A1
Publication date: 2023-10-26
Application number: US18305993
Application date: 2023-04-24
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Ismail Oukid, Stefan Richter, Alejandro Salinger
IPC: G06F16/2455, G06F16/9035, G06F16/28, G06F16/22
CPC classification number: G06F16/24557, G06F16/9035, G06F16/283, G06F16/2272
Abstract: A query directed at a source table organized into a set of batch units is received. The query comprises a regular expression search pattern. The regular expression search pattern is converted to a pruning index predicate comprising a set of substring literals extracted from the regular expression search pattern. A set of N-grams is generated based on the set of substring literals extracted from the regular expression search pattern. A pruning index associated with the source table is accessed. The pruning index indexes distinct N-grams in each column of the source table. A subset of batch units to scan for data matching the query is identified based on the pruning index and the set of N-grams. The query is processed by scanning the subset of batch units.
-
Publication number: US20230064151A1
Publication date: 2023-03-02
Application number: US18047595
Application date: 2022-10-18
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev, Thierry Cruanes, Ismail Oukid, Stefan Richter
IPC: G06F16/2455, G06F16/9035, G06F16/28, G06F17/18, G06F16/22
Abstract: A source table organized into a set of batch units is accessed. The source table comprises a column of data corresponding to a semi-structured data type. One or more indexing transformations for an object in the column are generated. The generating of the one or more indexing transformations includes converting the object to one or more stored data types. A pruning index is generated for the source table based in part on the one or more indexing transformations. The pruning index comprises a set of filters that index distinct values in each column of the source table, and each filter corresponds to a batch unit in the set of batch units. The pruning index is stored in a database with an association with the source table.
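One way to picture the "indexing transformations" in the abstract above is a small sketch in which each leaf of a JSON document is indexed under every stored type it can be converted to. All of it is assumed for illustration: the JSON input, the path syntax, the type tags (`"string"`, `"number"`, `"boolean"`, `"null"`), the use of Python sets as per-batch-unit filters, and every function name. The patent's actual transformations and filter structure may differ.

```python
import json
from typing import Any, Dict, Iterator, List, Set, Tuple

Key = Tuple[str, str, Any]  # (path, stored type tag, converted value)


def leaves(obj: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, scalar) pairs for every leaf of a parsed JSON value."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            yield from leaves(v, f"{path}.{k}" if path else k)
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            yield from leaves(v, f"{path}[{i}]")
    else:
        yield path, obj


def transformations(value: Any) -> List[Tuple[str, Any]]:
    """Hypothetical conversions of one scalar to stored data types."""
    if value is None:
        return [("null", None)]
    if isinstance(value, bool):          # check bool before int
        return [("boolean", value)]
    if isinstance(value, (int, float)):
        return [("number", float(value))]
    forms: List[Tuple[str, Any]] = [("string", value)]
    try:
        forms.append(("number", float(value)))  # numeric-looking strings
    except ValueError:
        pass
    return forms


def build_index(batch_units: Dict[str, List[str]]) -> Dict[str, Set[Key]]:
    """One filter (here: a set of keys) per batch unit."""
    index: Dict[str, Set[Key]] = {}
    for unit, documents in batch_units.items():
        keys: Set[Key] = set()
        for doc in documents:
            for path, value in leaves(json.loads(doc)):
                keys.update((path, t, v) for t, v in transformations(value))
        index[unit] = keys
    return index


def units_to_scan(index: Dict[str, Set[Key]], path: str, value: Any) -> List[str]:
    """Batch units whose filter may contain path == value."""
    wanted = {(path, t, v) for t, v in transformations(value)}
    return [unit for unit, keys in index.items() if wanted & keys]
```

In this sketch the string `"42"` and the number `42` share a `"number"` key, so an equality lookup on either form keeps the same batch units; that choice trades a few extra scanned units for not missing matches that differ only in stored type.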
-
Publication number: US11372860B2
Publication date: 2022-06-28
Application number: US17462796
Application date: 2021-08-31
Applicant: Snowflake Inc.
Inventor: Max Heimel, Ismail Oukid, Linnea Passing, Stefan Richter, Juliane K. Waack
IPC: G06F16/24, G06F16/2455, G06F16/9035, G06F16/28, G06F17/18, G06F16/22
Abstract: A query directed at a table organized into a set of batch units is received. The query comprises a predicate for which values are unknown prior to runtime. A set of values for the predicate is determined based on the query. An index access plan is created based on the set of values. Based on the index access plan, the set of batch units is pruned using a pruning index associated with the table. The pruning index comprises a set of filters that index distinct values in each column of the table. The pruning of the set of batch units comprises identifying a subset of batch units to scan for data that satisfies the query. The subset of batch units of the table is scanned to identify data that satisfies the query.
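The sequence in the abstract above (collect the predicate's values at runtime, build an index access plan, prune batch units with per-unit filters) might look like the following sketch. The `IndexAccessPlan` class, the function names, and the use of exact Python sets as filters are all assumptions for illustration; a real pruning index would more likely use a probabilistic filter.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Set

# index[unit][column] -> distinct values of that column in the batch unit
PruningIndex = Dict[str, Dict[str, Set[int]]]


@dataclass(frozen=True)
class IndexAccessPlan:
    """Hypothetical plan: which column to probe and with which values."""
    column: str
    values: FrozenSet[int]


def make_plan(column: str, runtime_values: Iterable[int]) -> IndexAccessPlan:
    """Build the plan once the predicate's values are known at runtime."""
    return IndexAccessPlan(column, frozenset(runtime_values))


def prune(index: PruningIndex, plan: IndexAccessPlan) -> List[str]:
    """Return the batch units that still have to be scanned."""
    to_scan = []
    for unit, filters in index.items():
        if plan.column not in filters:
            to_scan.append(unit)      # no filter available: cannot prune
        elif not filters[plan.column].isdisjoint(plan.values):
            to_scan.append(unit)      # at least one value may be present
    return to_scan
```

For example, the values could come from the build side of a join or from a subquery result; a batch unit without a filter for the probed column is kept, since skipping it could drop matching rows.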
-
Publication number: US11016975B1
Publication date: 2021-05-25
Application number: US17086239
Application date: 2020-10-30
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes, Benoit Dageville, Ismail Oukid, Stefan Richter
IPC: G06F16/24, G06F16/2455, G06F16/9035, G06F16/22, G06F17/18, G06F16/28
Abstract: A query directed at a source table organized into a set of batch units is received. The query includes a pattern matching predicate that specifies a search pattern. A set of N-grams is generated based on the search pattern. A pruning index is used to identify a subset of batch units to scan for matching data based on the set of N-grams generated for the search pattern. The pruning index indexes distinct N-grams in the source table. The query is processed by scanning the subset of batch units.
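As a rough illustration of the abstract above, the sketch below indexes the distinct N-grams of each batch unit and scans only the units that contain every N-gram of the search pattern. The N-gram length of 3, the treatment of the pattern as a plain substring, the Python sets standing in for the index's filters, and all names are assumptions, not details taken from the patent.

```python
from typing import Dict, List, Set

N = 3  # assumed N-gram length; the abstract does not fix a value


def ngrams(text: str, n: int = N) -> Set[str]:
    """Distinct n-grams of text (empty when text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_pruning_index(batch_units: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each batch unit id to the distinct n-grams of its values."""
    index: Dict[str, Set[str]] = {}
    for unit, values in batch_units.items():
        grams: Set[str] = set()
        for value in values:
            grams |= ngrams(value)
        index[unit] = grams
    return index


def units_to_scan(index: Dict[str, Set[str]], pattern: str) -> List[str]:
    """Batch units that contain every n-gram of the search pattern."""
    needed = ngrams(pattern)
    # A pattern shorter than N yields no n-grams, so nothing is pruned.
    return [unit for unit, grams in index.items() if needed <= grams]


def search(batch_units: Dict[str, List[str]],
           index: Dict[str, Set[str]],
           pattern: str) -> List[str]:
    """Scan only the surviving batch units for values containing pattern."""
    return [value
            for unit in units_to_scan(index, pattern)
            for value in batch_units[unit]
            if pattern in value]
```

The index can report a unit that does not actually match (its N-grams may come from different values), which is why `search` still checks each value; it never skips a unit that does match.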
-
Publication number: US12050605B2
Publication date: 2024-07-30
Application number: US17804248
Application date: 2022-05-26
Applicant: Snowflake Inc.
Inventor: Matthias Carl Adams, Mahmud Allahverdiyev, Ismail Oukid, Peter Popov, Alejandro Salinger
IPC: G06F16/22, G06F16/2455, G06F16/28, G06F16/9035, G06F17/18
CPC classification number: G06F16/24557, G06F16/2272, G06F16/283, G06F16/9035, G06F17/18
Abstract: Provided herein are systems and methods for indexed geospatial predicate search. An example method performed by at least one hardware processor includes decoding a query with a geospatial predicate. The geospatial predicate is configured between a geography data column and a constant geography object. The method further includes computing a first covering for a data value of a plurality of data values in the geography data column. The first covering includes a first set of cells in a hierarchical grid representation of a geography. The first set of cells represents a surface of the geography associated with the data value. A second covering is computed for the constant geography object. A determination is made on whether to prune at least one partition of a database organized into a set of partitions and including the geography data column based on a comparison between the first covering and the second covering.
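The covering comparison described in the abstract above can be sketched with a toy hierarchical grid. This is an illustration under stated assumptions only: geographies are axis-aligned rectangles in the unit square, the grid is a quadtree whose cells are named by digit strings (a cell contains another when its id is a prefix of the other's), and `covering`, `coverings_intersect`, and `can_prune` are hypothetical names. The patent does not say which grid system or predicate logic is used.

```python
from typing import Iterable, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
ROOT: Rect = (0.0, 0.0, 1.0, 1.0)         # assumed domain: the unit square


def covering(rect: Rect, max_level: int = 4,
             cell: Rect = ROOT, cell_id: str = "") -> Set[str]:
    """Quadtree cells (as digit-string ids) that together cover rect."""
    x0, y0, x1, y1 = cell
    rx0, ry0, rx1, ry1 = rect
    if rx1 < x0 or rx0 > x1 or ry1 < y0 or ry0 > y1:
        return set()                      # cell and rect are disjoint
    inside = rx0 <= x0 and ry0 <= y0 and rx1 >= x1 and ry1 >= y1
    if inside or len(cell_id) == max_level:
        return {cell_id}                  # keep the cell, stop subdividing
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    children = {
        "0": (x0, y0, mx, my), "1": (mx, y0, x1, my),
        "2": (x0, my, mx, y1), "3": (mx, my, x1, y1),
    }
    cells: Set[str] = set()
    for digit, child in children.items():
        cells |= covering(rect, max_level, child, cell_id + digit)
    return cells


def coverings_intersect(a: Set[str], b: Set[str]) -> bool:
    """True if some cell of a contains, or is contained in, a cell of b."""
    return any(x.startswith(y) or y.startswith(x) for x in a for y in b)


def can_prune(partition_rects: Iterable[Rect], constant_rect: Rect,
              max_level: int = 4) -> bool:
    """A partition can be skipped if no data value's covering meets the
    covering of the constant geography object."""
    query_cells = covering(constant_rect, max_level)
    return not any(coverings_intersect(covering(r, max_level), query_cells)
                   for r in partition_rects)
```

Coverings over-approximate the shapes, so this test is conservative: a partition that is kept may still contain no matching rows, but a partition that is pruned cannot contain a value intersecting the constant object.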
-
Publication number: US20230139194A1
Publication date: 2023-05-04
Application number: US18146912
Application date: 2022-12-27
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev, Selcuk Aya, Bowei Chen, Ismail Oukid
IPC: G06F16/2455, G06F16/9035, G06F16/28, G06F17/18, G06F16/22
Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.
-
Publication number: US11567939B2
Publication date: 2023-01-31
Application number: US17814110
Application date: 2022-07-21
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev, Selcuk Aya, Bowei Chen, Ismail Oukid
IPC: G06F16/24, G06F16/2455, G06F16/9035, G06F16/28, G06F17/18, G06F16/22
Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.