LAZY REASSEMBLING OF SEMI-STRUCTURED DATA

    Publication number: US20220358128A1

    Publication date: 2022-11-10

    Application number: US17814110

    Filing date: 2022-07-21

    Applicant: Snowflake Inc.

    Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.
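
The indexing flow in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `BloomFilter` class stands in for whatever filter type the index uses, each batch unit is modeled as a dict with a `hook` entry (path to already-columnar values) and a `residual` entry (nested leftovers), and `candidate_units` shows one plausible way a query could consult the index.

```python
import hashlib
from typing import Any, Dict, Iterator, List, Tuple


class BloomFilter:
    """Tiny Bloom filter; a hypothetical stand-in for the index's filter type."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str) -> Iterator[int]:
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def walk_residual(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for each leaf of the non-subcolumnarized tree."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk_residual(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for child in node:
            yield from walk_residual(child, path)
    else:
        yield path, node


def build_pruning_index(batch_units: List[Dict[str, Any]]) -> List[BloomFilter]:
    """Build one filter per batch unit without reassembling full documents."""
    index = []
    for unit in batch_units:
        bf = BloomFilter()
        # Subcolumnarized portion: values are read straight from the columns.
        for path, values in unit["hook"].items():
            for value in values:
                bf.add(f"{path}={value!r}")
        # Residual portion: only the leftover subtree is walked.
        for residual in unit["residual"]:
            for path, value in walk_residual(residual):
                bf.add(f"{path}={value!r}")
        index.append(bf)
    return index


def candidate_units(index: List[BloomFilter], path: str, value: Any) -> List[int]:
    """Return batch units that may hold the value; the rest can be skipped."""
    key = f"{path}={value!r}"
    return [i for i, bf in enumerate(index) if bf.might_contain(key)]


units = [
    {"hook": {"user.id": [1, 2]}, "residual": [{"tags": ["a"]}, {"meta": {"k": "x"}}]},
    {"hook": {"user.id": [3]}, "residual": [{"tags": ["b"]}]},
]
index = build_pruning_index(units)
```

A Bloom filter never reports a false negative, so `candidate_units(index, "user.id", 1)` always includes unit 0; it may occasionally include other units as false positives, which costs a scan but never loses rows.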

    INDEX GENERATION USING LAZY REASSEMBLING OF SEMI-STRUCTURED DATA

    Publication number: US20230139194A1

    Publication date: 2023-05-04

    Application number: US18146912

    Filing date: 2022-12-27

    Applicant: Snowflake Inc.

    Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.

    Lazy reassembling of semi-structured data

    Publication number: US11567939B2

    Publication date: 2023-01-31

    Application number: US17814110

    Filing date: 2022-07-21

    Applicant: Snowflake Inc.

    Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.

    EFFICIENT DATABASE QUERY EVALUATION
    Document type: Published application

    Publication number: US20240220456A1

    Publication date: 2024-07-04

    Application number: US18607857

    Filing date: 2024-03-18

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/1744 G06F16/221 G06F16/27

    Abstract: Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decompressed. The responsive data is returned in response to the database query. When a query is run on a table that is compressed using dictionary compression, the uncompressed data may be returned along with the dictionary look-up values. The recipient of the data may use the dictionary look-up values for memoization, reducing the amount of computation required to process the returned data.
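
The two ideas in this abstract, filter-first decompression and memoization over dictionary look-up values, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented method: the micro-partition is modeled as a dict of independently `zlib`-compressed JSON columns, decompression happens at whole-column granularity rather than per responsive portion, and the function names (`compress_column`, `query`, `process_with_memo`) are hypothetical.

```python
import json
import zlib
from typing import Any, Callable, Dict, List


def compress_column(values: List[Any]) -> bytes:
    """Compress one column independently so it can be decompressed alone."""
    return zlib.compress(json.dumps(values).encode())


def decompress_column(blob: bytes) -> List[Any]:
    return json.loads(zlib.decompress(blob).decode())


def query(
    partition: Dict[str, bytes],
    filter_col: str,
    predicate: Callable[[Any], bool],
    select_cols: List[str],
) -> Dict[str, List[Any]]:
    """Decompress the filter column first; touch other columns only if needed."""
    filter_vals = decompress_column(partition[filter_col])
    rows = [i for i, v in enumerate(filter_vals) if predicate(v)]
    if not rows:
        # Nothing is responsive, so no other column is decompressed.
        return {col: [] for col in select_cols}
    result = {}
    for col in select_cols:
        vals = filter_vals if col == filter_col else decompress_column(partition[col])
        result[col] = [vals[i] for i in rows]
    return result


def process_with_memo(
    codes: List[int], dictionary: List[Any], fn: Callable[[Any], Any]
) -> List[Any]:
    """Apply fn once per distinct dictionary code and reuse the result."""
    cache: Dict[int, Any] = {}
    out = []
    for code in codes:
        if code not in cache:
            cache[code] = fn(dictionary[code])
        out.append(cache[code])
    return out


partition = {
    "city": compress_column(["NYC", "SF", "NYC"]),
    "amount": compress_column([10, 20, 30]),
}
nyc_amounts = query(partition, "city", lambda v: v == "NYC", ["amount"])
```

On the recipient side, `process_with_memo` calls `fn` once per distinct code instead of once per row, which is where the saving comes from when a column has few distinct values.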

    Index generation using lazy reassembling of semi-structured data

    Publication number: US11816107B2

    Publication date: 2023-11-14

    Application number: US18146912

    Filing date: 2022-12-27

    Applicant: Snowflake Inc.

    Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.

    Efficient database query evaluation

    Publication number: US11971856B2

    Publication date: 2024-04-30

    Application number: US16779366

    Filing date: 2020-01-31

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/1744 G06F16/221 G06F16/27

    Abstract: Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decompressed. The responsive data is returned in response to the database query. When a query is run on a table that is compressed using dictionary compression, the uncompressed data may be returned along with the dictionary look-up values. The recipient of the data may use the dictionary look-up values for memoization, reducing the amount of computation required to process the returned data.
