Index generation using lazy reassembling of semi-structured data

    Publication No.: US11816107B2

    Publication Date: 2023-11-14

    Application No.: US18146912

    Filing Date: 2022-12-27

    Applicant: Snowflake Inc.

    Abstract: A pruning index is generated for a source table organized into a set of batch units. The source table comprises a column of semi-structured data. The pruning index comprises a set of filters that index distinct values in each column of the source table. Rather than reassembling an entire tree structure of the semi-structured data prior to indexing, the generating of the pruning index comprises traversing a reassembly hook object that represents a first portion of the semi-structured data that is subcolumnarized and traversing a residual object that represents a second portion of the semi-structured data that is not subcolumnarized. The reassembly hook object is traversed to identify values corresponding to the first portion of the semi-structured data and the residual object is traversed to identify values corresponding to the second portion. The pruning index is stored with an association with the source table.
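The lazy traversal described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name here (`BloomFilter`, `walk_hook`, `walk_residual`, `build_pruning_index`, `candidate_units`) and the dictionary layout of a batch unit are hypothetical, chosen only for illustration. The abstract speaks only of "filters", so the use of a Bloom filter, along with its size and hash count, is an assumption. The sketch assumes the reassembly hook object can be modeled as a mapping from a path to the values of its subcolumn, and the residual object as a nested dict/list per row.

```python
import hashlib
from typing import Any, Dict, Iterator, List, Tuple


class BloomFilter:
    """Tiny Bloom filter; bit count and hash count are arbitrary illustration values."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions from seeded SHA-256 digests of the key.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def walk_hook(hook: Dict[str, List[Any]]) -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs directly from subcolumns, without rebuilding a tree."""
    for path, values in hook.items():
        for value in values:
            if value is not None:  # None marks a row where the path is absent
                yield path, value


def walk_residual(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, scalar) pairs from the non-subcolumnarized remainder of a row."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk_residual(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for child in node:
            yield from walk_residual(child, f"{path}[]")
    else:
        yield path, node


def build_pruning_index(batch_units: List[Dict[str, Any]]) -> List[BloomFilter]:
    """Build one filter per batch unit from both the hook and the residual objects."""
    index = []
    for unit in batch_units:
        bloom = BloomFilter()
        for path, value in walk_hook(unit["hook"]):
            bloom.add(f"{path}={value!r}")
        for residual in unit["residuals"]:
            for path, value in walk_residual(residual):
                bloom.add(f"{path}={value!r}")
        index.append(bloom)
    return index


def candidate_units(index: List[BloomFilter], path: str, value: Any) -> List[int]:
    """Return the batch units that may contain path == value; the rest can be pruned."""
    key = f"{path}={value!r}"
    return [i for i, bloom in enumerate(index) if bloom.might_contain(key)]


# Hypothetical source table with two batch units.
batch_units = [
    {
        "hook": {"user.id": [1, 2], "user.country": ["DE", "FR"]},
        "residuals": [{"tags": ["a"]}, {"extra": {"beta": True}}],
    },
    {
        "hook": {"user.id": [3], "user.country": ["US"]},
        "residuals": [{}],
    },
]
index = build_pruning_index(batch_units)
print(candidate_units(index, "user.country", "DE"))
```

A Bloom filter can report false positives but never false negatives, so a lookup may return extra batch units but will not omit a unit that actually holds the value. The point of the sketch is that neither traversal reconstructs the full per-row tree before the values are added to the filter.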
