SCHEMA EVOLUTION
    14.
    Published invention application
    SCHEMA EVOLUTION (under examination, published)

    Publication number: US20230401180A1

    Publication date: 2023-12-14

    Application number: US18345987

    Filing date: 2023-06-30

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/211

    Abstract: Techniques for schema mismatch detection and evolution are described. When data is being uploaded into a source table, schema of the data to be uploaded can be compared with the schema for the source table. If a schema mismatch is detected, the schema of the source table can be modified, and the upload can be continued without data loss.
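
    Illustrative sketch (not taken from the patent): the detect-then-evolve flow in the abstract above might look roughly like the following. The dictionary-based schema, the function names `detect_mismatch` and `evolve_and_load`, and the choice to handle only newly added columns are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical schema representation: column name -> type name.
Schema = Dict[str, str]
Row = Dict[str, Optional[object]]


def detect_mismatch(table_schema: Schema, incoming_schema: Schema) -> Schema:
    """Return columns present in the incoming data but missing from the table."""
    return {
        col: typ for col, typ in incoming_schema.items() if col not in table_schema
    }


def evolve_and_load(
    table_schema: Schema,
    table_rows: List[Row],
    incoming_schema: Schema,
    incoming_rows: List[Row],
) -> Schema:
    """Add missing columns to the table, then continue the upload.

    Assumption: only added columns are handled; type conflicts are out of scope.
    """
    new_columns = detect_mismatch(table_schema, incoming_schema)
    if new_columns:
        # Modify the table schema instead of rejecting the upload.
        table_schema.update(new_columns)
        # Existing rows get NULL (None) for the new columns.
        for row in table_rows:
            for col in new_columns:
                row.setdefault(col, None)
    # Continue the upload; every incoming value is kept, so no data is lost.
    for row in incoming_rows:
        table_rows.append({col: row.get(col) for col in table_schema})
    return new_columns
```

    For example, loading rows with an extra `name` column into a table that only has `id` would add `name` to the table schema and backfill earlier rows with `None`.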

    CONFIGURING MANAGED EVENT TABLES USING EXECUTION NODE PROCESSES

    Publication number: US20230147989A1

    Publication date: 2023-05-11

    Application number: US17934857

    Filing date: 2022-09-23

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/254 G06F16/258 G06F16/2282

    Abstract: Techniques for configuring managed event tables include generating, at a first process of an execution node, log data associated with execution of user-defined function (UDF) code. The log data is provided from the first process to a second process of the execution node. The first process is configured as a sub-process of the second process. The log data is formatted using the second process of the execution node to generate formatted log data. The formatting is based on a configuration of a managed event table that is external to the execution node. The formatted log data is communicated from the second process of the execution node into a managed event table maintained at a computing node that is external to the execution node.

    Efficient deduplication of randomized file paths

    Publication number: US11494352B1

    Publication date: 2022-11-08

    Application number: US17709234

    Filing date: 2022-03-30

    Applicant: Snowflake Inc.

    Abstract: Embodiments of the present disclosure provide techniques for deduplicating files to be ingested by a database. A bloom filter may be built for each of a first set of files that are ingested into the database. The set of bloom filters may be stored in a metadata storage associated with the database along with file loading metadata of the first set of files. In response to receiving a set of candidate files to be ingested into the database, one or more candidate files that are duplicative of a file in the first set of files are removed from the set of candidate files, based on file loading metadata of each of the first set of files and the set of candidate files, to generate a reduced set of candidate files. From the reduced set of candidate files, candidate files that are not duplicative are identified and set for ingestion, while candidate files that are potentially duplicative are also identified and set for further scanning.
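
    Illustrative sketch (not taken from the patent): the two-stage filtering in the abstract above could be approximated as follows. The `BloomFilter` class, the use of file paths as filter keys, the `loaded_metadata` set, and the function `classify_candidates` are all assumptions made for illustration; the abstract does not say what each filter stores.

```python
import hashlib
from typing import Iterable, Iterator, List, Set, Tuple


class BloomFilter:
    """Minimal Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single Python int

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def classify_candidates(
    candidates: Iterable[str],
    loaded_metadata: Set[str],
    filters: List[BloomFilter],
) -> Tuple[List[str], List[str]]:
    """Split candidate paths into (ingest now, needs further scanning)."""
    # Stage 1: drop exact duplicates known from file loading metadata.
    reduced = [path for path in candidates if path not in loaded_metadata]
    to_ingest: List[str] = []
    to_scan: List[str] = []
    # Stage 2: a filter miss proves the path is new; a hit is only "maybe".
    for path in reduced:
        if any(f.might_contain(path) for f in filters):
            to_scan.append(path)
        else:
            to_ingest.append(path)
    return to_ingest, to_scan
```

    Because a Bloom filter never reports a false negative, a candidate that misses every filter can be ingested without a full scan; only the hits need the more expensive check.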

    CONFIGURING AN EVENT TABLE USING COMPUTING NODE PROCESSES

    Publication number: US20250086193A1

    Publication date: 2025-03-13

    Application number: US18954797

    Filing date: 2024-11-21

    Applicant: Snowflake Inc.

    Abstract: Techniques for configuring a managed event table (MET) include detecting, by at least one hardware processor, a query for the MET. The query is received at a first computing node of a network-based database system. The method includes retrieving, via an ingestion function configured at the first computing node, reformatted data from a dedicated storage location of a first process into the MET. The reformatted data is based on log data associated with a second process. The first process and the second process are executing at a second computing node of the network-based database system. The method includes processing the query using the reformatted data in the MET.

    ALLOCATING TASKS BASED ON LAG OF AN EXECUTION NODE

    Publication number: US20250045112A1

    Publication date: 2025-02-06

    Application number: US18923211

    Filing date: 2024-10-22

    Applicant: Snowflake Inc.

    Abstract: A system and method of allocating tasks based on the lag of one or more execution nodes. The method includes monitoring a plurality of execution nodes of a datastore to determine a plurality of central processing unit (CPU) utilizations, each CPU utilization of the plurality of CPU utilizations is associated with a respective execution node of the plurality of execution nodes. The method includes identifying, by a processing device based on the plurality of CPU utilizations, a particular execution node associated with a maximum CPU utilization to process a task. The method includes determining a lag amount associated with the maximum CPU utilization. The method includes preventing an allocation of the task to the particular execution node for a time period that is equal to or greater than the lag amount.
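
    Illustrative sketch (not taken from the patent): one way to read the abstract above is that the busiest node is temporarily excluded from allocation for at least its lag amount. The class `LagAwareAllocator`, the per-node lag table, and the fallback of picking the least-utilized eligible node are assumptions made for illustration; the abstract does not specify how the lag is determined or how the replacement node is chosen.

```python
from typing import Dict, Optional


class LagAwareAllocator:
    """Withholds tasks from the busiest node for at least its lag amount."""

    def __init__(self, lag_seconds: Dict[str, float]) -> None:
        # Hypothetical input: a known lag amount (in seconds) per node.
        self.lag_seconds = dict(lag_seconds)
        # Node name -> time before which no task may be allocated to it.
        self.blocked_until: Dict[str, float] = {}

    def observe(self, cpu_utilization: Dict[str, float], now: float) -> str:
        """Record utilizations and block the node with the maximum utilization."""
        busiest = max(cpu_utilization, key=lambda node: cpu_utilization[node])
        self.blocked_until[busiest] = now + self.lag_seconds[busiest]
        return busiest

    def allocate(self, cpu_utilization: Dict[str, float], now: float) -> Optional[str]:
        """Pick the least-utilized node that is not currently blocked."""
        eligible = [
            node
            for node in cpu_utilization
            if self.blocked_until.get(node, 0.0) <= now
        ]
        if not eligible:
            return None  # every node is still inside its block window
        return min(eligible, key=lambda node: cpu_utilization[node])
```

    In this sketch time is passed in explicitly as `now`, which keeps the behavior deterministic; a real scheduler would presumably read a monotonic clock.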

    Configuring an event table using computing node processes

    Publication number: US12182155B2

    Publication date: 2024-12-31

    Application number: US18302515

    Filing date: 2023-04-18

    Applicant: Snowflake Inc.

    Abstract: Techniques for configuring event tables include retrieving, by at least one hardware processor of a computing node, log data at a first process of the computing node. The log data is associated with a function executing at a second process of the computing node. The log data is revised using a table stage to generate revised log data. The table stage is configured as a dedicated storage location of the first process. The revising includes a data enrichment process based on metadata associated with execution of the function at the second process. The revised log data is ingested into an event table.
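
    Illustrative sketch (not taken from the patent): the retrieve, enrich, stage, and ingest steps in the abstract above can be mimicked with in-memory lists. The functions `enrich` and `stage_and_ingest`, the dictionary record shape, and the use of plain lists as stand-ins for the table stage and the event table are assumptions made for illustration.

```python
from typing import Dict, List

LogRecord = Dict[str, object]


def enrich(records: List[LogRecord], execution_metadata: LogRecord) -> List[LogRecord]:
    """Attach function-execution metadata to every raw log record.

    Assumption: on a key collision the metadata value wins.
    """
    return [{**record, **execution_metadata} for record in records]


def stage_and_ingest(
    raw_logs: List[LogRecord],
    execution_metadata: LogRecord,
    table_stage: List[LogRecord],
    event_table: List[LogRecord],
) -> int:
    """Revise logs in the stage, then move them into the event table."""
    # The stage stands in for the first process's dedicated storage location.
    table_stage.extend(enrich(raw_logs, execution_metadata))
    ingested = len(table_stage)
    # Ingest everything currently staged, then empty the stage.
    event_table.extend(table_stage)
    table_stage.clear()
    return ingested
```

    A call such as `stage_and_ingest([{"message": "started"}], {"function": "my_udf"}, stage, table)` would leave one enriched record in `table` and an empty `stage`; the name `my_udf` is a placeholder.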
