SCHEMA EVOLUTION SUPPORT IN HYBRID TRANSACTIONAL/ANALYTICAL PROCESSING (HTAP) WORKLOADS

    Publication number: US20250068605A1

    Publication date: 2025-02-27

    Application number: US18499762

    Application date: 2023-11-01

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a request to perform a table scan operation of a table. The subject technology determines that the table is being accessed for an initial time. The subject technology populates a columnar cache with data of the table provided by the table scan operation. The subject technology determines a set of schema versions of a set of rows from the data of the table. The subject technology determines schema information of each schema from the set of schema versions. The subject technology generates a result rowset and a second rowset comprising a union of columns that have appeared at least once in each row. The subject technology performs deserialization of rows from the result rowset and the second rowset. The subject technology provides the rows from the result rowset and the second rowset to write to a file in a particular format.
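The "union of columns" step in this abstract can be illustrated with a minimal sketch. Everything named here (the `SCHEMAS` registry, `union_columns`, `build_rowset`, and the `(schema_version, values)` row encoding) is a hypothetical stand-in chosen for illustration, not the patented implementation; columns missing from a row's schema version are simply filled with `None`.

```python
from typing import Any

# Hypothetical schema registry: schema version -> ordered column names.
SCHEMAS: dict[int, list[str]] = {
    1: ["id", "name"],
    2: ["id", "name", "email"],
    3: ["id", "email", "age"],
}


def union_columns(versions: set[int]) -> list[str]:
    """Return every column that appears in at least one schema, in first-seen order."""
    seen: dict[str, None] = {}
    for version in sorted(versions):
        for column in SCHEMAS[version]:
            seen.setdefault(column, None)
    return list(seen)


def build_rowset(rows: list[tuple[int, tuple[Any, ...]]]) -> tuple[list[str], list[list[Any]]]:
    """Deserialize (schema_version, values) rows into one rowset over the union of columns."""
    versions = {version for version, _ in rows}
    columns = union_columns(versions)
    rowset = []
    for version, values in rows:
        by_name = dict(zip(SCHEMAS[version], values))
        # Columns absent from this row's schema version become None.
        rowset.append([by_name.get(column) for column in columns])
    return columns, rowset


# Two rows written under different schema versions (assumed example data).
rows = [(1, (1, "ann")), (3, (2, "bob@example.com", 41))]
columns, rowset = build_rowset(rows)
```

In this sketch the rowset is what would then be handed to a columnar file writer; the writer itself is out of scope here.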

    Deadlock detection in distributed databases

    Publication number: US11809916B2

    Publication date: 2023-11-07

    Application number: US17647752

    Application date: 2022-01-12

    Applicant: Snowflake Inc.

    CPC classification number: G06F9/524 G06F16/2343 G06F16/2379 G06F16/256

    Abstract: The subject technology performs a locking operation on a first set of keys by a first statement of a first transaction. The subject technology determines that a conflict occurred between the first statement and a second transaction. The subject technology determines that the second transaction has yet to complete after a predetermined period of time. The subject technology performs a deadlock detection process where the subject technology stores a key and value in a table indicating the first transaction and the second transaction, detects, based at least in part on a graph traversal of the table starting from the first transaction, a cycle between the first transaction and the second transaction, and determines that the first transaction is a youngest transaction in the detected cycle. The subject technology ceases execution of the first transaction in response to the first transaction being a youngest transaction in a detected cycle.
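The detection step described above (a table of waiter/holder pairs, a graph traversal from the first transaction, and aborting the youngest transaction in the cycle) might be sketched as follows. This is a simplified illustration under stated assumptions: each transaction waits on at most one other transaction, "youngest" means the largest start timestamp, and the names `wait_for`, `start_ts`, `find_cycle`, and `should_abort` are hypothetical.

```python
from typing import Optional


def find_cycle(wait_for: dict[str, str], start: str) -> Optional[list[str]]:
    """Follow waits-on edges from `start`; return the cycle if it leads back to `start`."""
    path: list[str] = []
    seen: set[str] = set()
    current = start
    while current in wait_for and current not in seen:
        seen.add(current)
        path.append(current)
        current = wait_for[current]
    # A cycle that does not include `start` is left for its own members to detect.
    return path if path and current == start else None


def should_abort(wait_for: dict[str, str], start_ts: dict[str, int], txn: str) -> bool:
    """Abort `txn` only if it is the youngest transaction in a cycle it belongs to."""
    cycle = find_cycle(wait_for, txn)
    if cycle is None:
        return False
    youngest = max(cycle, key=lambda t: start_ts[t])
    return youngest == txn


# Assumed example: T1 waits on T2 and T2 waits on T1; T1 started later.
wait_for = {"T1": "T2", "T2": "T1"}
start_ts = {"T1": 200, "T2": 100}
```

Because every participant applies the same "youngest loses" rule, only one transaction in the cycle stops, so the others can make progress.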

    CONCURRENCY CONTROL FOR TRANSACTIONS IN DATABASE SYSTEMS

    Publication number: US20220405266A1

    Publication date: 2022-12-22

    Application number: US17821581

    Application date: 2022-08-23

    Applicant: Snowflake Inc.

    Abstract: The subject technology inserts, by a first transaction, a new version of an object, the first transaction including a first statement to perform an update operation to a row in a first table, the object corresponding to data in the row to be updated, the first statement including information comprising an object key associated with the object. The subject technology performs, by a second transaction, a range read, the range read including information indicating the object key. The subject technology receives a set of conflicting transactions from the range read. The subject technology determines that a conflict occurred between the first transaction and a third transaction from the set of conflicting transactions. The subject technology performs a restart of the first transaction in response to determining that the conflict occurred.

    COLUMNAR CACHE IN HYBRID TRANSACTIONAL/ANALYTICAL PROCESSING (HTAP) WORKLOADS

    Publication number: US20250068640A1

    Publication date: 2025-02-27

    Application number: US18787807

    Application date: 2024-07-29

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives, by an execution node, blob metadata from a key-value store, the blob metadata including information related to a set of blob files. The subject technology determines, by the execution node using the blob metadata, whether a copy of each of the set of blob files is stored in a local cache of the execution node. The subject technology transforms at least one blob file, retrieved from a blob store, to a second file in a column file format, the at least one blob file being in a first format that is different than the column file format, the transforming comprising at least converting a particular snapshot file from the at least one blob file to a particular set of rowsets and writing the set of rowsets into the second file in the column file format. The subject technology stores the second file in the local cache.
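A rough sketch of the cache-miss path in this abstract: check the local cache, fetch the row-oriented blob on a miss, transpose it into a column layout, and store the result. The in-memory `blob_store` dictionary, the `LocalColumnarCache` class, and `rows_to_columns` are hypothetical simplifications; a real system would read blob metadata from a key-value store and write an actual column file format.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Transpose row dicts into a column-name -> values mapping (assumes uniform keys)."""
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


class LocalColumnarCache:
    """Per-node cache holding a columnar copy of each blob file it has seen."""

    def __init__(self, blob_store: dict[str, list[dict[str, Any]]]) -> None:
        self._blob_store = blob_store  # hypothetical remote blob store
        self._files: dict[str, dict[str, list[Any]]] = {}
        self.fetches = 0  # counts remote reads, for illustration only

    def get(self, blob_name: str) -> dict[str, list[Any]]:
        if blob_name not in self._files:
            rows = self._blob_store[blob_name]  # cache miss: fetch the row-format blob
            self.fetches += 1
            self._files[blob_name] = rows_to_columns(rows)
        return self._files[blob_name]


blob_store = {"blob-1": [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]}
cache = LocalColumnarCache(blob_store)
first = cache.get("blob-1")
second = cache.get("blob-1")  # served from the local cache, no second fetch
```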

    Multi database queries
    Patent type: Granted invention patent

    Publication number: US12235843B2

    Publication date: 2025-02-25

    Application number: US18656008

    Application date: 2024-05-06

    Applicant: Snowflake Inc.

    Abstract: Techniques for multi database query processing are described. Objects located in a plurality of databases referenced in a query can be compiled. A connection string based on the compiled objects can be generated. The connection string can include mapping information related to the plurality of databases and cluster information of where the plurality of databases are stored in the network-based data system. The connection string can then be included in a query plan to allow for execution of the query plan using the connection string to access the objects in the plurality of databases.
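To make the idea of a connection string carrying database mapping and cluster information more concrete, here is a small sketch. The JSON encoding, the `DB_CLUSTERS` catalog, and the function names are all assumptions made for illustration; the abstract does not specify the actual format of the connection string or the query plan.

```python
import json

# Hypothetical catalog: database name -> cluster where that database is stored.
DB_CLUSTERS = {"sales": "cluster-a", "crm": "cluster-b"}


def build_connection_string(object_names: list[str], db_clusters: dict[str, str]) -> str:
    """Encode which databases a query touches and where each one lives."""
    # Fully qualified names are assumed to look like "database.schema.object".
    databases = sorted({name.split(".", 1)[0] for name in object_names})
    payload = {
        "mapping": {db: index for index, db in enumerate(databases)},
        "clusters": {db: db_clusters[db] for db in databases},
    }
    return json.dumps(payload, sort_keys=True)


def build_query_plan(sql: str, object_names: list[str], db_clusters: dict[str, str]) -> dict:
    """Attach the connection string to a (greatly simplified) query plan."""
    return {
        "sql": sql,
        "connection_string": build_connection_string(object_names, db_clusters),
    }


plan = build_query_plan(
    "SELECT * FROM sales.public.orders o JOIN crm.public.customers c ON o.cid = c.id",
    ["sales.public.orders", "crm.public.customers"],
    DB_CLUSTERS,
)
```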

    AUTOMATED TRACKING OF OLDEST RUNNING STATEMENT IN DISTRIBUTED MVCC DATABASES

    Publication number: US20240394263A1

    Publication date: 2024-11-28

    Application number: US18324669

    Application date: 2023-05-26

    Applicant: Snowflake Inc.

    Abstract: The subject technology initializes a statement for execution. The subject technology determines that the statement has been executing for longer than a minimum statement timeout. The subject technology periodically updates a read timestamp table with a new update timestamp for an entry corresponding to the statement. The subject technology determines whether the entry corresponding to the statement has been removed from the read timestamp table. The subject technology, in response to determining that the entry has not been removed from the read timestamp table, removes the entry from the read timestamp table. The subject technology provides a set of results from completing execution of the statement.
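The lifecycle in this abstract (register only after a minimum timeout, refresh the entry periodically, remove it on completion if it is still present) can be sketched with a simulated clock. The `ReadTimestampTable` class, the `run_statement` helper, and the use of a minimum over read timestamps to find the oldest running statement are assumptions made for illustration.

```python
from typing import Optional


class ReadTimestampTable:
    """Maps statement id -> (read timestamp, last update timestamp)."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[int, int]] = {}

    def upsert(self, statement_id: str, read_ts: int, update_ts: int) -> None:
        self._entries[statement_id] = (read_ts, update_ts)

    def remove_if_present(self, statement_id: str) -> bool:
        """Remove the entry; return False if it was never added or already removed."""
        return self._entries.pop(statement_id, None) is not None

    def oldest_read_ts(self) -> Optional[int]:
        return min((read_ts for read_ts, _ in self._entries.values()), default=None)


def run_statement(table: ReadTimestampTable, statement_id: str, read_ts: int,
                  clock_ticks: list[int], min_timeout: int) -> bool:
    """Simulate a statement that checks in at each tick of a fake clock."""
    start = clock_ticks[0]
    for now in clock_ticks:
        # Short statements never touch the table; long ones refresh their entry.
        if now - start > min_timeout:
            table.upsert(statement_id, read_ts, now)
    # On completion, remove the entry only if it is still there.
    return table.remove_if_present(statement_id)


table = ReadTimestampTable()
table.upsert("s1", 90, 5)
table.upsert("s2", 70, 6)
long_running_removed = run_statement(table, "s3", 50, [0, 5, 12, 20], min_timeout=10)
short_removed = run_statement(table, "s4", 60, [0, 3], min_timeout=10)
```

Skipping registration for short statements keeps the table small, which is the presumable motivation for the minimum statement timeout.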

    Scan-based merge for analytical query processing in HTAP systems using delete vectors

    Publication number: US12135700B1

    Publication date: 2024-11-05

    Application number: US18460206

    Application date: 2023-09-01

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a query, the query including a query range for processing the query and a set of requested columns. Based on the query range, the subject technology determines a set of blob files and a set of delete vectors. For each blob file, the subject technology stores each row, including the set of requested columns, into an array of rowsets. For each rowset, the subject technology generates a delete bitset to at least indicate whether each row has been deleted. For each delta file, the subject technology indicates a previous row of a visible row of the delta file as being deleted based on a delete pointer of the visible row. The subject technology provides a set of rowsets, including a corresponding selection column set, as a result of the query.
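The merge described above might look roughly like the following sketch: build one rowset per blob file, derive a delete bitset from the delete vector, let visible delta rows mark the rows they supersede, and return each rowset with a selection column. The data shapes (`blob_files`, `delete_vectors`, `delta_rows` with a `delete_pointer` of `(blob_name, row_index)`) are hypothetical, and filtering by query range is assumed to have happened already.

```python
from typing import Any, Optional

Pointer = Optional[tuple[str, int]]


def scan_merge(blob_files: dict[str, list[dict[str, Any]]],
               delete_vectors: dict[str, set[int]],
               delta_rows: list[dict[str, Any]],
               columns: list[str]) -> list[tuple[list[list[Any]], list[bool]]]:
    """Return (rowset, selection) pairs; selection[i] is True when row i is live."""
    rowsets: dict[str, list[list[Any]]] = {}
    deleted: dict[str, list[bool]] = {}
    for name, rows in blob_files.items():
        # Keep only the requested columns of each row.
        rowsets[name] = [[row[c] for c in columns] for row in rows]
        vector = delete_vectors.get(name, set())
        deleted[name] = [i in vector for i in range(len(rows))]

    for delta in delta_rows:
        pointer: Pointer = delta["delete_pointer"]
        if pointer is not None:
            blob_name, index = pointer
            deleted[blob_name][index] = True  # the delta row supersedes this row

    result = [(rowsets[n], [not d for d in deleted[n]]) for n in blob_files]
    # Delta rows are assumed to be already filtered down to visible ones.
    delta_rowset = [[d["row"][c] for c in columns] for d in delta_rows]
    result.append((delta_rowset, [True] * len(delta_rowset)))
    return result


merged = scan_merge(
    blob_files={"b1": [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}, {"k": 3, "v": "c"}]},
    delete_vectors={"b1": {0}},
    delta_rows=[{"row": {"k": 2, "v": "b2"}, "delete_pointer": ("b1", 1)}],
    columns=["k", "v"],
)
```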

    Multi database queries
    Patent type: Granted invention patent

    Publication number: US12007993B1

    Publication date: 2024-06-11

    Application number: US18345900

    Application date: 2023-06-30

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/24542 G06F16/2329 G06F16/24535 G06F16/285

    Abstract: Techniques for multi database query processing are described. Objects located in a plurality of databases referenced in a query can be compiled. A connection string based on the compiled objects can be generated. The connection string can include mapping information related to the plurality of databases and cluster information of where the plurality of databases are stored in the network-based data system. The connection string can then be included in a query plan to allow for execution of the query plan using the connection string to access the objects in the plurality of databases.

    OPTIMIZATIONS FOR LONG-LIVED STATEMENTS IN A DATABASE SYSTEM

    Publication number: US20230244655A1

    Publication date: 2023-08-03

    Application number: US17649737

    Application date: 2022-02-02

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/2379 G06F16/2365 G06F9/524

    Abstract: The subject technology performs a search for a key in a regular space to locate a first visible version of the key. The subject technology determines that the first visible version of the key is not one of a N number of newest versions of the key. The subject technology performs a search of an undo space to locate a second visible version of the key. The subject technology determines whether the first visible version or the second visible version of the key is newer. The subject technology provides a newer version of the key between the first visible version and the second visible version of the key.
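The two-space lookup in this abstract can be sketched as follows. The representation is an assumption: each space holds `(commit_ts, value)` versions sorted newest first, a version is visible when its commit timestamp is at or below the reader's timestamp, and the names `first_visible` and `read_key` are hypothetical.

```python
from typing import Optional

Version = tuple[int, str]  # (commit timestamp, value)


def first_visible(versions: list[Version], read_ts: int) -> tuple[int, Optional[Version]]:
    """Scan newest-first; return (position, version) of the first visible version."""
    for position, version in enumerate(versions):
        if version[0] <= read_ts:
            return position, version
    return len(versions), None


def read_key(regular: list[Version], undo: list[Version],
             read_ts: int, n_newest: int) -> Optional[Version]:
    """Return the newest version visible at read_ts across both spaces."""
    position, first = first_visible(regular, read_ts)
    if first is not None and position < n_newest:
        return first  # among the N newest versions: no undo-space search needed
    _, second = first_visible(undo, read_ts)
    candidates = [v for v in (first, second) if v is not None]
    return max(candidates, key=lambda v: v[0]) if candidates else None


# Assumed example: the version committed at ts=20 lives only in the undo space.
regular = [(50, "e"), (40, "d"), (30, "c"), (10, "a")]
undo = [(20, "b")]
old_reader = read_key(regular, undo, read_ts=25, n_newest=2)
recent_reader = read_key(regular, undo, read_ts=45, n_newest=2)
```

Here a long-lived reader at timestamp 25 needs the undo space to find the right version, while a recent reader at timestamp 45 is answered from the regular space alone.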
