Low latency ingestion into a data system

    Publication No.: US11487788B1

    Publication date: 2022-11-01

    Application No.: US17648228

    Filing date: 2022-01-18

    Applicant: Snowflake Inc.

    Abstract: Described herein are techniques for improving transfer of metadata from a metadata database to a database stored in a data system, such as a data warehouse. The metadata may be written into the metadata database with a version stamp, which is a monotonically increasing register value, and a partition identifier, which can be generated using attribute values of the metadata. A plurality of readers can scan the metadata database based on version stamp and partition identifier values to export the metadata to a cloud storage location. From the cloud storage location, the exported data can be auto-ingested into the database, which includes a journal and snapshot table.
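
The write-and-scan scheme in this abstract can be illustrated with a minimal Python sketch. Everything named here (`MetadataStore`, `Reader`, `NUM_PARTITIONS`, the SHA-256 hash of sorted attribute values) is an assumption made for illustration; the abstract does not say how the register or the partition identifier is actually implemented.

```python
import hashlib
import itertools
from dataclasses import dataclass

NUM_PARTITIONS = 4  # assumed partition count, not taken from the patent


@dataclass
class MetadataRecord:
    version_stamp: int   # monotonically increasing register value
    partition_id: int    # derived from the record's attribute values
    attributes: dict


class MetadataStore:
    """Hypothetical in-memory stand-in for the metadata database."""

    def __init__(self):
        self._register = itertools.count(1)  # source of version stamps
        self._records = []

    def write(self, attributes):
        # Assumed scheme: hash the sorted attribute values into a partition.
        key = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        partition_id = int.from_bytes(digest[:8], "big") % NUM_PARTITIONS
        record = MetadataRecord(next(self._register), partition_id, dict(attributes))
        self._records.append(record)
        return record

    def scan(self, partition_id, after_version):
        # Return records of one partition that are newer than a version stamp.
        return [r for r in self._records
                if r.partition_id == partition_id and r.version_stamp > after_version]


class Reader:
    """One reader per partition; remembers the last version stamp it exported."""

    def __init__(self, store, partition_id):
        self.store = store
        self.partition_id = partition_id
        self.last_version = 0

    def export(self):
        batch = self.store.scan(self.partition_id, self.last_version)
        if batch:
            self.last_version = batch[-1].version_stamp
        return batch  # a real system would write this batch to cloud storage
```

Because each reader tracks its own high-water mark, repeated calls to `export()` return only records written since the previous call, and the partition identifier lets several readers work in parallel without overlapping.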

    Managed event tables in a database system

    Publication No.: US11487778B1

    Publication date: 2022-11-01

    Application No.: US17649571

    Filing date: 2022-02-01

    Applicant: Snowflake Inc.

    Abstract: Provided herein are systems and methods for configuring managed event tables. A system includes at least one hardware processor coupled to a memory and configured to collect, using an event table instance within a sandbox process, log data generated by a UDF during execution of the UDF code. The log data is provided from the sandbox process to an execution node process of the execution node. The log data is formatted using the execution node process to generate formatted log data. The formatting is based on a configuration of a managed event table that is external to the execution node. The at least one hardware processor further causes ingestion of the formatted log data from the execution node process into the managed event table.

    DATA OVERLAP COUNT ADJUSTMENT IN A MULTIPLE TENANT DATABASE SYSTEM

    Publication No.: US20220327232A1

    Publication date: 2022-10-13

    Application No.: US17847681

    Filing date: 2022-06-23

    Applicant: SNOWFLAKE INC.

    Abstract: Systems, methods, and devices for generating a secure join of database data are disclosed. A method creates a secure view of datapoints of a consumer account and processes, using a secure user defined function (UDF), the datapoints of the consumer account and datapoints of a provider account to generate a secure join key. The datapoints of the consumer account are provided to the secure UDF using the secure view. The method further performs, by a processor, an analysis of the datapoints of the consumer account and the datapoints of the provider account of the secure join key. The analysis returns a count value of overlapping datapoints between the consumer account and the provider account. The method further adjusts the count value of overlapping datapoints based on a number of distinct rows associated with the provider account, and provides the adjusted count value of overlapping datapoints to the consumer account.

    TABLE DATA PROCESSING USING A CHANGE TRACKING COLUMN

    Publication No.: US20220327107A1

    Publication date: 2022-10-13

    Application No.: US17809203

    Filing date: 2022-06-27

    Applicant: Snowflake Inc.

    Abstract: A system includes one or more processors and data storage containing instructions executable by the one or more processors to perform operations. The operations include detecting a first executed transaction causing a first modification to table data stored in a table. The table data is associated with a corresponding metadata file with metadata information of the table. A new metadata file is generated responsive to the first executed transaction. The new metadata file includes the metadata information and additional metadata associated with the first modification. A second executed transaction causing a second modification to the table data is detected. The table data is updated with a change tracking column. The change tracking column includes lineage of executed transactions on the table data. The lineage indicates at least the first transaction and the second transaction.
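
As a rough illustration of the change tracking idea in this abstract, the Python sketch below appends each transaction to a per-row lineage column and writes a new metadata file for every transaction instead of editing the old one. The `Table` class, the `_change_tracking` column name, and the metadata layout are all hypothetical; the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical table whose rows carry a change tracking column."""
    rows: dict = field(default_factory=dict)            # row_id -> row dict
    metadata_files: list = field(default_factory=list)  # one file per transaction

    def apply(self, txn_id, changes):
        # Update the table data and record lineage in the tracking column.
        for row_id, value in changes.items():
            row = self.rows.setdefault(row_id, {"value": None, "_change_tracking": []})
            row["value"] = value
            row["_change_tracking"].append(txn_id)

        # Generate a NEW metadata file: the prior metadata information plus
        # additional metadata describing this modification. Older files are
        # left untouched.
        previous = self.metadata_files[-1] if self.metadata_files else {"history": []}
        new_file = {"history": previous["history"] + [{"txn": txn_id, "rows": sorted(changes)}]}
        self.metadata_files.append(new_file)
```

After two transactions that touch the same row, that row's `_change_tracking` column lists both transaction identifiers in order, which is the lineage the abstract refers to.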

    Partitioning to support invocation of external table functions on multiple batches of input rows

    Publication No.: US11468079B1

    Publication date: 2022-10-11

    Application No.: US17646200

    Filing date: 2021-12-28

    Applicant: Snowflake Inc.

    Abstract: A query referencing an external table function provided by a remote software component is received. Requests to execute the external table function on input data are sent to a proxy service. A first request includes a batch of input rows from the input data. A first response to the first request received from the proxy service includes a first portion of result data and a pagination token. The pagination token indicates that at least a second portion of the result data corresponding to the first batch of input rows is to be obtained from the remote software component. Based on the pagination token, a second request is sent to obtain the second portion of the result data. One or more responses are received from the proxy service that comprise at least the second portion of the result data. The result data is processed according to the query.
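
The request and response loop described in this abstract can be sketched in Python as follows. The function names, the `(rows, next_token)` return shape, and the fake proxy are assumptions for illustration only; the abstract does not define the wire format or the token encoding.

```python
def call_external_table_function(batches, send_request):
    """Yield all result rows for every input batch.

    `send_request(batch, token)` is a hypothetical proxy call that returns
    `(rows, next_token)`. A non-None token means more result data for the
    same batch remains to be fetched.
    """
    for batch in batches:
        token = None  # the first request for a batch carries no token
        while True:
            rows, token = send_request(batch, token)
            yield from rows
            if token is None:  # no pagination token: this batch is complete
                break


def make_fake_proxy(page_size=2):
    """Stand-in for the proxy service, used only to exercise the loop."""
    def send_request(batch, token):
        # Pretend the remote function emits three output rows per input row.
        results = [(x, i) for x in batch for i in range(3)]
        start = int(token) if token is not None else 0
        next_start = start + page_size
        next_token = str(next_start) if next_start < len(results) else None
        return results[start:next_start], next_token
    return send_request
```

For example, `list(call_external_table_function([[1, 2], [3]], make_fake_proxy()))` drains every page for the first batch before moving on to the second, so the caller sees one continuous stream of result rows.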

    Aggregation operator optimization during query runtime

    Publication No.: US11468063B2

    Publication date: 2022-10-11

    Application No.: US17232821

    Filing date: 2021-04-16

    Applicant: Snowflake Inc.

    Abstract: The subject technology provides information, corresponding to properties of a build side of a join operation, to a bloom filter. The subject technology, based at least in part on the information from the bloom filter, determines, during execution of a query plan, at least one property of the join operation to determine whether to switch an aggregation operator to a pass-through mode, the at least one property comprising at least a reduction rate. The subject technology switches, in response to the reduction rate being below a threshold value, the aggregation operator to the pass-through mode during runtime of the query plan. While the aggregation operator is in the pass-through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream of data matches an output stream of data flowing out of the aggregation operator.
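
A simplified Python sketch of the runtime switch follows. It departs from the abstract in one labeled way: instead of deriving the reduction rate from bloom filter information about the join build side, it measures the rate directly on a sample of the input. The class name, the `sample_size` and `threshold` parameters, and the definition of reduction rate as `1 - groups / rows` are all assumptions.

```python
from collections import defaultdict


class AdaptiveSumAggregator:
    """Hypothetical SUM-by-key operator that can switch to pass-through mode."""

    def __init__(self, sample_size=1000, threshold=0.5):
        self.sample_size = sample_size
        self.threshold = threshold
        self.pass_through = False
        self._rows_seen = 0
        self._sums = defaultdict(int)

    def push(self, key, value):
        """Consume one row and return the rows to emit downstream now."""
        if self.pass_through:
            return [(key, value)]  # output matches input, no analysis

        self._sums[key] += value
        self._rows_seen += 1

        if self._rows_seen == self.sample_size:
            # Assumed definition: fraction of rows eliminated by grouping.
            reduction_rate = 1.0 - len(self._sums) / self._rows_seen
            if reduction_rate < self.threshold:
                self.pass_through = True
                return self.finish()  # flush partial aggregates, then stop grouping
        return []

    def finish(self):
        """Emit whatever aggregates are still buffered."""
        out = list(self._sums.items())
        self._sums.clear()
        return out
```

In this sketch the operator emits partial results once it switches, so a later operator in the plan would still need to perform the final aggregation. The benefit is that a nearly useless grouping step stops consuming memory and CPU partway through the query.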

    System and method for global data sharing

    Publication No.: US11463508B1

    Publication date: 2022-10-04

    Application No.: US17858645

    Filing date: 2022-07-06

    Applicant: SNOWFLAKE INC.

    Abstract: Sharing data in a data exchange across multiple cloud computing platforms and/or cloud computing platform regions is described. An example computer-implemented method can include receiving data sharing information from a data provider for sharing a data set in a data exchange from a first cloud computing entity to a set of second cloud computing entities. In response to receiving the data sharing information, the method may also include creating an account with each of the set of second cloud computing entities. The method may further include sharing the data set from the first cloud computing entity with each of the set of second cloud computing entities using at least the corresponding account of that second cloud computing entity.
