-
Publication number: US12032995B1
Publication date: 2024-07-09
Application number: US18361549
Filing date: 2023-07-28
Applicant: Snowflake Inc.
Inventor: Gabriel Kliot , Ruji Xie , Subramanian Muralidhar , William Waddington
IPC: G06F9/48
CPC classification number: G06F9/4881
Abstract: A method includes decoding, by at least one hardware processor, an enqueue request received from a data process of a database system. The enqueue request includes a task. The task is enqueued in an in-memory task queue. An enqueue acknowledgment is encoded for transmission to the data process responsive to the enqueue request. The task is persisted in a storage location associated with the in-memory task queue. A lease of the task to a worker node is initiated in response to a lease request received from the worker node. A dequeue request is received from the worker node, the dequeue request indicating completion of the task by the worker node. The task is dequeued from the in-memory task queue based on the dequeue request.
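The enqueue, lease, and dequeue flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name `LeasedTaskQueue`, the lease timeout and its expiry behavior, and the dictionary standing in for the storage location are all assumptions, and the sketch persists the task before returning the acknowledgment purely for simplicity.

```python
import json
import time
from collections import OrderedDict


class LeasedTaskQueue:
    """Hypothetical in-memory task queue with persistence and leases."""

    def __init__(self, lease_seconds=30.0):
        self.tasks = OrderedDict()  # task_id -> payload (the in-memory queue)
        self.leases = {}            # task_id -> (worker_id, lease expiry time)
        self.persisted = {}         # assumed stand-in for the storage location
        self.lease_seconds = lease_seconds

    def enqueue(self, task_id, payload):
        # Enqueue in memory, persist a serialized copy, and acknowledge.
        self.tasks[task_id] = payload
        self.persisted[task_id] = json.dumps(payload)
        return {"ack": task_id}

    def lease(self, worker_id, now=None):
        # Hand the oldest task that is unleased (or whose lease expired)
        # to the requesting worker. The task stays in the queue.
        now = time.monotonic() if now is None else now
        for task_id, payload in self.tasks.items():
            holder = self.leases.get(task_id)
            if holder is None or holder[1] <= now:
                self.leases[task_id] = (worker_id, now + self.lease_seconds)
                return task_id, payload
        return None

    def dequeue(self, worker_id, task_id):
        # Only the current lease holder may report completion.
        holder = self.leases.get(task_id)
        if holder is None or holder[0] != worker_id:
            return False
        del self.tasks[task_id]
        del self.leases[task_id]
        self.persisted.pop(task_id, None)
        return True
```

Keeping the task in the queue until the dequeue request arrives means a task leased to a worker that never reports completion can be leased again later; the timeout that triggers this is an assumption of the sketch.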
-
Publication number: US20240028567A1
Publication date: 2024-01-25
Application number: US18326929
Filing date: 2023-05-31
Applicant: Snowflake Inc.
Inventor: Benoit Dageville , Adrian Hamza , Lishi Jiang , William Waddington , Khaled Yagoub , Wumengjian Zhu
CPC classification number: G06F16/213 , G06F16/221
Abstract: The subject technology generates, by a compute service manager, a schema hash value for a new schema version associated with a new schema version value, the schema hash value based on determining a sum of hash values of a set of attributes of value columns, the set of attributes comprising a column identifier and a logical type of a column. The subject technology stores a mapping of the schema hash value to the new schema version value for a table in a metadata database. The subject technology stores a new schema entry based on the schema hash value, the new schema version value, and a new column for the table in the metadata database, the metadata database storing multiple entries for different schema versions, each entry including a particular schema hash value for mapping to a corresponding schema version from the different schema versions.
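One way to picture the summed schema hash and the hash-to-version mapping is the sketch below. It is illustrative only: the use of SHA-256, the 64-bit wraparound, and the names `attribute_hash`, `schema_hash`, and `SchemaRegistry` are assumptions that do not come from the abstract, and the dictionary merely stands in for the metadata database.

```python
import hashlib

MASK_64 = (1 << 64) - 1  # assumed: keep the running sum within 64 bits


def attribute_hash(column_id, logical_type):
    """Hash one column's attributes (column identifier and logical type)."""
    digest = hashlib.sha256(f"{column_id}:{logical_type}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def schema_hash(columns):
    """Sum the per-column hashes; a sum does not depend on column order."""
    total = 0
    for column_id, logical_type in columns:
        total = (total + attribute_hash(column_id, logical_type)) & MASK_64
    return total


class SchemaRegistry:
    """Hypothetical stand-in for the metadata database of schema versions."""

    def __init__(self):
        self.versions_by_hash = {}  # schema hash value -> schema version value
        self.next_version = 1

    def register(self, columns):
        # Reuse the version if this schema hash is already known,
        # otherwise store a new entry with the next version value.
        value = schema_hash(columns)
        if value not in self.versions_by_hash:
            self.versions_by_hash[value] = self.next_version
            self.next_version += 1
        return self.versions_by_hash[value]
```

Because the hash is a sum, adding a column only requires adding that column's hash to the previous total, which is one plausible reason for choosing a sum over a hash of the concatenated schema.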
-
Publication number: US11809414B2
Publication date: 2023-11-07
Application number: US17538818
Filing date: 2021-11-30
Applicant: Snowflake Inc.
Inventor: Khaled Yagoub , Wumengjian Zhu , Benoit Dageville , William Waddington
CPC classification number: G06F16/2379 , G06F11/1458 , G06F16/221 , G06F16/283
Abstract: A distributed database system can implement a column-based database system and a row-based database system for processing data. The row-based database system can store data organized into key value pairs, and data to be processed by the row-based database system is converted to a key-value format compressing keys that correspond to values. The distributed database system can perform serialization and compression in converting the data to the key-value format for efficient data storage performance. The distributed database system can unpack portions of the converted serialized compressed data in response to queries that process a portion of serialized compressed data without unpacking the entire converted dataset.
-
Publication number: US11709808B1
Publication date: 2023-07-25
Application number: US17656558
Filing date: 2022-03-25
Applicant: Snowflake Inc.
Inventor: Benoit Dageville , Adrian Hamza , William Waddington , Khaled Yagoub , Wumengjian Zhu , Lishi Jiang
CPC classification number: G06F16/213 , G06F16/221
Abstract: The subject technology receives a statement to perform an operation to add a new column into a table. The subject technology generates a schema hash value for a new schema version associated with a new schema version value. The subject technology stores a mapping of the schema hash value to the new schema version value for the table in a metadata database. The subject technology stores a new schema entry based on the schema hash value, the new schema version value, and the new column for the table in the metadata database. The subject technology performs an operation to add the new column to the table.
-
Publication number: US11675806B2
Publication date: 2023-06-13
Application number: US17249598
Filing date: 2021-03-05
Applicant: Snowflake Inc.
Inventor: Leonidas Galanis , Alexander Miller , William Waddington , Khaled Yagoub
IPC: G06F16/30 , G06F16/25 , G06F16/2452 , G06F16/28 , G06F16/2455 , G06F16/27
CPC classification number: G06F16/258 , G06F16/24524 , G06F16/24564 , G06F16/256 , G06F16/27 , G06F16/283
Abstract: A hybrid network-based database system handles OLTP and OLAP queries using decoupled compute and storage devices. A set of decoupled compute instances performs transactions on an OLTP database, and the data is replicated to an OLAP database, which is managed by another set of decoupled compute instances. Further, in response to queries, the database system can retrieve data from the OLTP and OLAP databases for merging and processing according to the query.
-
Publication number: US20230169068A1
Publication date: 2023-06-01
Application number: US17538818
Filing date: 2021-11-30
Applicant: Snowflake Inc.
Inventor: Khaled Yagoub , Wumengjian Zhu , Benoit Dageville , William Waddington
CPC classification number: G06F16/2379 , G06F16/221 , G06F11/1458 , G06F16/283
Abstract: A distributed database system can implement a column-based database system and a row-based database system for processing data. The row-based database system can store data organized into key value pairs, and data to be processed by the row-based database system is converted to a key-value format compressing keys that correspond to values. The distributed database system can perform serialization and compression in converting the data to the key-value format for efficient data storage performance. The distributed database system can unpack portions of the converted serialized compressed data in response to queries that process a portion of serialized compressed data without unpacking the entire converted dataset.
-
Publication number: US20220197886A1
Publication date: 2022-06-23
Application number: US17654296
Filing date: 2022-03-10
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: Disclosed herein are embodiments of systems and methods for selecting partitions for reclustering based on distribution of overlapping partitions. In an example, a database platform makes a determination to at least partially recluster a database table that includes data stored across a plurality of partitions. The database platform responsively selects a subset of the partitions. The selecting of the subset includes identifying a point on a domain of a clustering key that corresponds to a local maximum of overlapping partitions, and also includes selecting the subset from among a group of overlapping partitions. The group includes at least one partition that overlaps the identified point on the domain of the clustering key. Each partition in the selected subset is above a reduction goal of overlapping partitions. The database platform at least partially reclusters the selected subset based on the clustering key.
-
Publication number: US20220092051A1
Publication date: 2022-03-24
Application number: US17163034
Filing date: 2021-01-29
Applicant: Snowflake Inc.
Inventor: Alexander Miller , William Waddington
IPC: G06F16/23 , G06F16/22 , G06F16/2455 , G06F16/248
Abstract: The subject technology receives a first transaction. The subject technology assigns a first read version to the first transaction, the first read version indicating a first version of a linearizable storage. The subject technology performs a read operation from the first transaction on a table in a database. The subject technology determines a first commit version identifier corresponding to first data resulting from the read operation. The subject technology, in response to determining that a particular write operation is absent from the first transaction, proceeds to execute a different transaction and forgoes performing a commit process in connection with the first transaction.
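The idea of skipping the commit process for a transaction that contains no writes can be sketched as below. This is a simplified model under stated assumptions: `VersionedStore` and `Transaction` are hypothetical names, the store is a single-process dictionary rather than a linearizable storage service, and conflict detection between concurrent writers is left out entirely.

```python
class VersionedStore:
    """Hypothetical multi-version store; each commit gets a new version."""

    def __init__(self):
        self.version = 0
        self.history = {}  # key -> list of (commit_version, value), oldest first

    def read(self, key, read_version):
        # Return the newest (commit_version, value) visible at read_version.
        latest = None
        for commit_version, value in self.history.get(key, []):
            if commit_version <= read_version:
                latest = (commit_version, value)
        return latest

    def commit(self, writes):
        self.version += 1
        for key, value in writes.items():
            self.history.setdefault(key, []).append((self.version, value))
        return self.version


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_version = store.version  # read version assigned at start
        self.writes = {}

    def read(self, key):
        return self.store.read(key, self.read_version)

    def write(self, key, value):
        self.writes[key] = value

    def finish(self):
        # No write operation present: skip the commit process entirely.
        if not self.writes:
            return None
        return self.store.commit(self.writes)
```

A read-only transaction here never advances the store version, so the system can move on to the next transaction without any commit work for it.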
-
Publication number: US20220004552A1
Publication date: 2022-01-06
Application number: US17477663
Filing date: 2021-09-17
Applicant: Snowflake Inc.
Inventor: Jeffrey Rosen , Abdul Munir , Jiaqi Yan , William Waddington , Prasanna Rajaperumal , Thierry Cruanes
IPC: G06F16/2458 , G06F16/2453 , G06F9/50 , G06F16/2455
Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
-
Publication number: US11163746B2
Publication date: 2021-11-02
Application number: US17249796
Filing date: 2021-03-12
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: The subject technology determines whether a table is sufficiently clustered. In response to determining the table is not sufficiently clustered, the subject technology selects one or more micro-partitions of the table to be reclustered. The subject technology constructs a data structure for the table. The subject technology extracts minimum and maximum endpoints for each micro-partition in the data structure. The subject technology sorts each of one or more peaks in the data structure based on height. The subject technology sorts overlapping micro-partitions based on width. The subject technology selects micro-partitions based on which micro-partitions are within the tallest peaks of the one or more peaks and further based on which of the overlapping micro-partitions have the widest widths.
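A rough sketch of the selection step described in this abstract is given below. It treats each micro-partition as a closed `(min, max)` interval on the clustering key, finds the point where the most intervals overlap with a sweep over the endpoints, and prefers the widest overlapping partitions. The function names, the `max_count` limit, and the choice to consider only the single tallest peak are assumptions made for illustration.

```python
def peak_overlap(partitions):
    """Return (point, depth) where the most partitions overlap."""
    events = []
    for low, high in partitions:
        events.append((low, 0, 1))    # opening endpoint; sorts before a close
        events.append((high, 1, -1))  # closing endpoint at the same key
    events.sort()
    depth = best_depth = 0
    best_point = None
    for key, _, delta in events:
        depth += delta
        if depth > best_depth:
            best_depth, best_point = depth, key
    return best_point, best_depth


def select_for_reclustering(partitions, max_count):
    """Pick up to max_count of the widest partitions covering the tallest peak."""
    point, _ = peak_overlap(partitions)
    if point is None:
        return []
    overlapping = [p for p in partitions if p[0] <= point <= p[1]]
    # Widest first; equal widths keep their original relative order.
    overlapping.sort(key=lambda p: p[1] - p[0], reverse=True)
    return overlapping[:max_count]
```

For example, with partitions `(0, 10)`, `(5, 15)`, `(8, 9)`, and `(20, 30)`, the deepest overlap is at key 8, and the two widest partitions covering that point are `(0, 10)` and `(5, 15)`.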
-