SCHEMA EVOLUTION SUPPORT IN HYBRID TRANSACTIONAL/ANALYTICAL PROCESSING (HTAP) WORKLOADS

    Publication number: US20250068605A1

    Publication date: 2025-02-27

    Application number: US18499762

    Application date: 2023-11-01

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a request to perform a table scan operation of a table. The subject technology determines that the table is being accessed for an initial time. The subject technology populates a columnar cache with data of the table provided by the table scan operation. The subject technology determines a set of schema versions of a set of rows from the data of the table. The subject technology determines schema information of each schema from the set of schema versions. The subject technology generates a result rowset and a second rowset comprising a union of columns that have appeared at least once in each row. The subject technology performs deserialization of rows from the result rowset and the second rowset. The subject technology provides the rows from the result rowset and the second rowset to write to a file in a particular format.
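
    The union-of-columns step in this abstract can be illustrated with a minimal sketch. It assumes each row carries a schema version number and that schema information is a simple ordered list of column names per version; `union_columns`, `build_rowset`, and the sample data are hypothetical names for illustration, not taken from the patent.

```python
from typing import Any, Dict, List, Sequence, Tuple

Row = Tuple[int, Sequence[Any]]  # (schema_version, values in that schema's column order)


def union_columns(schemas: Dict[int, List[str]], versions: set) -> List[str]:
    """Collect every column that appears in at least one of the given schema versions."""
    seen: List[str] = []
    for version in sorted(versions):
        for column in schemas[version]:
            if column not in seen:
                seen.append(column)
    return seen


def build_rowset(rows: List[Row], schemas: Dict[int, List[str]]):
    """Project rows written under different schema versions onto one shared column list."""
    versions = {version for version, _ in rows}
    columns = union_columns(schemas, versions)
    rowset = []
    for version, values in rows:
        record = dict(zip(schemas[version], values))
        # Columns absent from this row's schema version are filled with None.
        rowset.append([record.get(column) for column in columns])
    return columns, rowset


# Hypothetical table whose schema gained a column "c" in version 2.
schemas = {1: ["a", "b"], 2: ["a", "b", "c"]}
rows = [(1, (1, 2)), (2, (3, 4, 5))]
columns, rowset = build_rowset(rows, schemas)
```

    Filling missing columns with None is one possible choice; the abstract does not say how absent values are represented.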

    Schema evolution for key columnar data into row-organized sequences

    Publication number: US12135697B2

    Publication date: 2024-11-05

    Application number: US18326929

    Application date: 2023-05-31

    Applicant: Snowflake Inc.

    Abstract: The subject technology generates, by a compute service manager, a schema hash value for a new schema version associated with a new schema version value, the schema hash value based on determining a sum of hash values of a set of attributes of value columns, the set of attributes comprises a column identifier, and a logical type of a column. The subject technology stores a mapping of the schema hash value to the new schema version value for a table in a metadata database. The subject technology stores a new schema entry based on the schema hash value, the new schema version value, and a new column for the table in the metadata database, the metadata database storing multiple entries for different schema versions, each entry including a particular schema hash value for mapping to a corresponding schema version from the different schema versions.
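
    As a rough illustration of the schema hash described in this abstract, the sketch below hashes each value column's (column identifier, logical type) pair and sums the results, then maps the sum to a schema version. The choice of SHA-256, the 64-bit wraparound, and the `SchemaRegistry` class standing in for the metadata database are all assumptions made for the example, not details from the patent.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Column = Tuple[int, str]  # (column identifier, logical type)
MASK64 = (1 << 64) - 1


def attribute_hash(column_id: int, logical_type: str) -> int:
    """Hash one column's attributes to a 64-bit integer (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(f"{column_id}:{logical_type}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def schema_hash(columns: Iterable[Column]) -> int:
    """Sum the per-column hashes; a sum does not depend on column order."""
    total = 0
    for column_id, logical_type in columns:
        total = (total + attribute_hash(column_id, logical_type)) & MASK64
    return total


class SchemaRegistry:
    """Hypothetical in-memory stand-in for the metadata database."""

    def __init__(self) -> None:
        self.hash_to_version: Dict[int, int] = {}
        self.entries: Dict[int, List[Column]] = {}

    def register(self, columns: List[Column]) -> int:
        """Return the version for this schema, creating a new entry if it is unseen."""
        value = schema_hash(columns)
        if value in self.hash_to_version:
            return self.hash_to_version[value]
        version = len(self.hash_to_version) + 1
        self.hash_to_version[value] = version
        self.entries[version] = list(columns)
        return version


registry = SchemaRegistry()
v1 = registry.register([(1, "NUMBER"), (2, "TEXT")])
v2 = registry.register([(1, "NUMBER"), (2, "TEXT"), (3, "BOOLEAN")])  # new column added
v1_again = registry.register([(2, "TEXT"), (1, "NUMBER")])  # same columns, different order
```

    Because addition is commutative, the same set of columns maps to the same hash regardless of order, which is why the last call resolves to the first version.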

    SERIALIZATION OF DATA IN A CONCURRENT TRANSACTION PROCESSING DISTRIBUTED DATABASE

    Publication number: US20240020298A1

    Publication date: 2024-01-18

    Application number: US18477834

    Application date: 2023-09-29

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/2379 G06F16/283 G06F11/1458 G06F16/221

    Abstract: The subject technology serializes, by at least one hardware processor, non-primary key data of column-organized data into compressed serialized value data that is in a row-organized sequence, the compressed serialized value data compressed using at least one bitmap, the non-primary key data comprising a schema identifier, the column-organized data being stored in a columnar database system, the column-organized data comprising primary key data and the non-primary key data. The subject technology stores the compressed serialized value data in a key-value data store of a key-value database system, the key-value database system processing key-value data in a key-value format. The subject technology receives a query by the columnar database system. The subject technology deserializes a portion of the compressed serialized value data that corresponds to the query. The subject technology processes the query using the columnar database system.
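
    One way to picture the serialization described in this abstract is a row-organized byte string that starts with a schema identifier and uses a bitmap to mark which non-primary-key values are present, so absent values take no space. The byte layout, the restriction to 64-bit integers, and the function names below are assumptions for the sketch; the patent's actual encoding may differ.

```python
import struct
from typing import List, Optional, Tuple

HEADER = "<IH"  # assumed layout: 4-byte schema identifier, 2-byte column count


def serialize_row(schema_id: int, values: List[Optional[int]]) -> bytes:
    """Pack non-key values into one byte string; a bitmap records which are present."""
    count = len(values)
    bitmap = bytearray((count + 7) // 8)
    payload = bytearray()
    for index, value in enumerate(values):
        if value is not None:
            bitmap[index // 8] |= 1 << (index % 8)
            payload += struct.pack("<q", value)  # only present values are stored
    return struct.pack(HEADER, schema_id, count) + bytes(bitmap) + bytes(payload)


def deserialize_row(blob: bytes) -> Tuple[int, List[Optional[int]]]:
    """Reverse serialize_row, restoring None for values the bitmap marks as absent."""
    schema_id, count = struct.unpack_from(HEADER, blob, 0)
    offset = struct.calcsize(HEADER)
    bitmap = blob[offset:offset + (count + 7) // 8]
    offset += len(bitmap)
    values: List[Optional[int]] = []
    for index in range(count):
        if bitmap[index // 8] & (1 << (index % 8)):
            (value,) = struct.unpack_from("<q", blob, offset)
            offset += 8
            values.append(value)
        else:
            values.append(None)
    return schema_id, values


# A plain dict stands in for the key-value data store, keyed by primary key.
kv_store = {}
kv_store[("order", 42)] = serialize_row(7, [100, None, 300])
schema_id, restored = deserialize_row(kv_store[("order", 42)])
```

    Storing the schema identifier in the value lets a reader pick the right column layout before decoding the remaining bytes.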

    COLUMNAR CACHE IN HYBRID TRANSACTIONAL/ANALYTICAL PROCESSING (HTAP) WORKLOADS

    Publication number: US20250068640A1

    Publication date: 2025-02-27

    Application number: US18787807

    Application date: 2024-07-29

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives, by an execution node, blob metadata from a key-value store, the blob metadata including information related to a set of blob files. The subject technology determines, by the execution node using the blob metadata, whether a copy of each of the set of blob files is stored in a local cache of the execution node. The subject technology transforms at least one blob file, retrieved from a blob store, to a second file in a column file format, the at least one blob file being in a first format that is different than the column file format, the transforming comprising at least converting a particular snapshot file from the at least one blob file to a particular set of rowsets and writing the set of rowsets into the second file in the column file format. The subject technology stores the second file in the local cache.
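
    The cache check and format conversion in this abstract might look roughly like the following sketch. It models the blob store and the local cache as dictionaries, treats a snapshot as a list of row records with uniform keys, and uses a dict of column lists as a stand-in for the column file format; all names here are hypothetical.

```python
from typing import Any, Dict, List


def to_column_format(snapshot_rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert row-organized records into one list of values per column."""
    columns: Dict[str, List[Any]] = {}
    for row in snapshot_rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def ensure_cached(blob_metadata: List[str],
                  local_cache: Dict[str, Dict[str, List[Any]]],
                  blob_store: Dict[str, List[Dict[str, Any]]]) -> List[str]:
    """Fetch and convert only the blob files that the local cache is missing."""
    fetched = []
    for blob_name in blob_metadata:
        if blob_name not in local_cache:
            local_cache[blob_name] = to_column_format(blob_store[blob_name])
            fetched.append(blob_name)
    return fetched


# Hypothetical blob store with two files; one is already cached locally.
blob_store = {
    "b1": [{"id": 1, "x": 10}, {"id": 2, "x": 20}],
    "b2": [{"id": 3, "x": 30}],
}
local_cache = {"b2": {"id": [3], "x": [30]}}
fetched = ensure_cached(["b1", "b2"], local_cache, blob_store)
```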

    Serialization of data in a concurrent transaction processing distributed database

    Publication number: US12189614B2

    Publication date: 2025-01-07

    Application number: US18477834

    Application date: 2023-09-29

    Applicant: Snowflake Inc.

    Abstract: The subject technology serializes, by at least one hardware processor, non-primary key data of column-organized data into compressed serialized value data that is in a row-organized sequence, the compressed serialized value data compressed using at least one bitmap, the non-primary key data comprising a schema identifier, the column-organized data being stored in a columnar database system, the column-organized data comprising primary key data and the non-primary key data. The subject technology stores the compressed serialized value data in a key-value data store of a key-value database system, the key-value database system processing key-value data in a key-value format. The subject technology receives a query by the columnar database system. The subject technology deserializes a portion of the compressed serialized value data that corresponds to the query. The subject technology processes the query using the columnar database system.

    SCHEMA EVOLUTION FOR KEY COLUMNAR DATA INTO ROW-ORGANIZED SEQUENCES

    Publication number: US20240028567A1

    Publication date: 2024-01-25

    Application number: US18326929

    Application date: 2023-05-31

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/213 G06F16/221

    Abstract: The subject technology generates, by a compute service manager, a schema hash value for a new schema version associated with a new schema version value, the schema hash value based on determining a sum of hash values of a set of attributes of value columns, the set of attributes comprises a column identifier, and a logical type of a column. The subject technology stores a mapping of the schema hash value to the new schema version value for a table in a metadata database. The subject technology stores a new schema entry based on the schema hash value, the new schema version value, and a new column for the table in the metadata database, the metadata database storing multiple entries for different schema versions, each entry including a particular schema hash value for mapping to a corresponding schema version from the different schema versions.
