STAGE REPLICATION IN A CLOUD DATA LAKE
    Published patent application

    Publication No.: US20230214405A1

    Publication date: 2023-07-06

    Application No.: US18119775

    Filing date: 2023-03-09

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/27 G06F16/9566 G06F16/254

    Abstract: The embodiments described herein provide means for replicating external stages between deployments of, e.g., a cloud data lake, using a modified storage integration. The modified storage integration may be defined with a set of storage locations, wherein the storage integration comprises a base URL for each of the set of storage locations and wherein each storage location identifies a remote deployment where a cloud platform is hosted and a geographic region of the remote deployment. An external stage object may be bound to the storage integration, wherein the external stage object facilitates a data loading operation that is currently in progress on a first storage location of the set. In response to detecting an outage at the first storage location, the data loading operation that was in progress on the first storage location may be continued at a second storage location of the set using the storage integration.
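
The failover behavior in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the `load_files` helper, the `LocationOutage` exception, and the bucket URLs used with it are all hypothetical, and outage detection is assumed to surface as an exception from the reader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocation:
    deployment: str  # remote deployment where the cloud platform is hosted
    region: str      # geographic region of that deployment
    base_url: str    # base URL for this storage location


@dataclass
class StorageIntegration:
    locations: list  # ordered StorageLocation entries: first, second, ...


@dataclass
class ExternalStage:
    integration: StorageIntegration  # the stage is bound to the integration
    relative_path: str               # path below each location's base URL

    def url_at(self, index):
        loc = self.integration.locations[index]
        return loc.base_url.rstrip("/") + "/" + self.relative_path.strip("/")


class LocationOutage(Exception):
    """Hypothetical signal that a storage location is unavailable."""


def load_files(stage, files, read_file):
    """Load files in order; on an outage, continue at the next location."""
    loaded = {}
    pending = list(files)
    index = 0
    while pending:
        name = pending[0]
        try:
            loaded[name] = read_file(stage.url_at(index) + "/" + name)
            pending.pop(0)  # only files not yet loaded are retried
        except LocationOutage:
            index += 1
            if index >= len(stage.integration.locations):
                raise  # no remaining location to continue at
    return loaded
```

Because the stage stores only a relative path, the same stage definition resolves against whichever location in the integration is currently in use, which is what lets an in-progress load resume without redefining the stage.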

Estimated execution time for query execution

    Publication No.: US11687531B2

    Publication date: 2023-06-27

    Application No.: US18051185

    Filing date: 2022-10-31

    Applicant: Snowflake Inc.

    Abstract: The subject technology tracks a plurality of queries corresponding to a plurality of query plans based on join operations contained in each of the plurality of queries and a previous time of executing each query. The subject technology selects a first query plan among the plurality of query plans. The subject technology determines a value indicating an estimated improvement in execution time of the first query plan in comparison to a previous execution time of a previous query plan. The subject technology attempts to execute a first query using the first query plan. The subject technology determines that a second query plan selected among the plurality of query plans has a second estimated execution time that is less than an estimated execution time of the first query plan. The subject technology executes the first query corresponding to the first query plan at a subsequent time using the second query plan.
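
As a rough illustration of the plan-switching logic in this abstract, the Python sketch below compares estimated execution times and picks the plan for a subsequent run. The `QueryPlan` type, both function names, and the timing values used with them are assumptions for illustration; the abstract does not say how the estimates are produced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    name: str
    estimated_seconds: float  # estimated execution time for this plan


def estimated_improvement(plan, previous_seconds):
    """Positive when the plan is expected to beat the previous execution time."""
    return previous_seconds - plan.estimated_seconds


def plan_for_next_run(current, candidates):
    """Keep the current plan unless a tracked candidate has a lower estimate."""
    best = min(candidates, key=lambda p: p.estimated_seconds, default=current)
    return best if best.estimated_seconds < current.estimated_seconds else current
```

For example, if the first plan is estimated at 8 seconds against a previous execution time of 12 seconds, `estimated_improvement` returns 4; if a second tracked plan is later estimated at 5 seconds, `plan_for_next_run` selects it for the subsequent execution.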

    MACHINE LEARNING USING SECURED SHARED DATA
    Published patent application

    Publication No.: US20230186160A1

    Publication date: 2023-06-15

    Application No.: US18055248

    Filing date: 2022-11-14

    Applicant: Snowflake Inc.

    Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for sharing, on a distributed database, a database application to a first user of the distributed database, the database application generated by a second user of the distributed database. The training dataset includes a first database training dataset from the first user of the distributed database and a second database training dataset from the second user of the distributed database, the first database training dataset and the second database training dataset including non-overlapping dataset features. The database application further identifies a query from the second user to train the machine learning model on the training dataset and generates a trained machine learning model by training the machine learning model on a joined dataset according to the query. The database application generates outputs from the trained machine learning model by applying the trained machine learning model on new data.
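
The idea of training on a joined dataset whose two halves carry non-overlapping features can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the abstract names neither the join key nor the model type, so the shared record key, the separately supplied labels, and the nearest-centroid classifier are all stand-ins.

```python
def join_features(first, second):
    """Inner-join two {key: {feature: value}} datasets with disjoint features."""
    joined = {}
    for key in first.keys() & second.keys():
        overlap = first[key].keys() & second[key].keys()
        if overlap:
            raise ValueError(f"overlapping features: {sorted(overlap)}")
        joined[key] = {**first[key], **second[key]}
    return joined


def train_centroids(rows, labels, features):
    """Train a nearest-centroid model: one mean feature vector per label."""
    sums, counts = {}, {}
    for key, row in rows.items():
        label = labels[key]  # assumption: labels are supplied per joined key
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, name in enumerate(features):
            acc[i] += row[name]
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row, features):
    """Apply the trained model to new data: pick the closest centroid."""
    point = [row[name] for name in features]
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(point, centroids[label])),
    )
```

Only keys present in both datasets survive the join, so each training row combines one party's features with the other's.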

DATA CLEAN ROOMS USING DEFINED ACCESS WITH HOMOMORPHIC ENCRYPTION

    Publication No.: US20230177210A1

    Publication date: 2023-06-08

    Application No.: US18162506

    Filing date: 2023-01-31

    Applicant: Snowflake Inc.

    CPC classification number: G06F21/6245 G06F21/53 G06F2221/032

    Abstract: A data platform creates an application in a data-provider account, where the application includes one or more application programming interfaces (APIs) corresponding to one or more underlying code blocks. The data platform shares homomorphically encrypted provider data with the application in the data-provider account. The data platform installs, in a data-consumer account, an application instance of the application. The data platform shares homomorphically encrypted consumer data with the application instance in the data-consumer account. The data platform invokes one or more of the APIs of the application instance to execute respective associated underlying code blocks, which are not visible to the data-consumer account, and which operate on the shared homomorphically encrypted provider data and the shared homomorphically encrypted consumer data. The data platform saves homomorphically encrypted output of the one or more respective associated underlying code blocks locally within the data-consumer account.
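
To show what it means for a code block to operate on encrypted provider and consumer data without seeing plaintext, here is a toy Python sketch. The abstract does not name an encryption scheme; this example swaps in a textbook Paillier construction (additively homomorphic) with tiny fixed primes, which is insecure and for illustration only. All function names are hypothetical. It requires Python 3.9 or later for `math.lcm`.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1
    mu = pow(lam, -1, n)  # modular inverse; this shortcut is valid when g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, message):
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, ciphertext):
    n, _ = pub
    lam, mu = priv
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """The 'hidden code block': adds two plaintexts by multiplying ciphertexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

In this sketch, `add_encrypted` plays the role of an underlying code block: it receives one encrypted provider value and one encrypted consumer value, and its output stays encrypted until someone holding the private key decrypts it.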
