-
Publication number: US11755576B1
Publication date: 2023-09-12
Application number: US18104256
Filing date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Qiming Jiang, Orestis Kostakis, John Reumann
IPC: G06F16/00, G06F16/2453, G06F16/27
CPC classification number: G06F16/24542, G06F16/27
Abstract: A system for improving task scheduling on a cloud data platform is provided. A task is received, from a user of a cloud data platform, for execution on a dataset of the cloud data platform using a plurality of resources. A task graph is generated, and metadata related to the dataset is accessed for use in execution of the task. A predicted resource profile is generated by applying a first machine learning scheme to the task graph and the metadata of the dataset. Assignment data is generated to execute processes of the task on the plurality of resources. The assignment data is generated by applying a second machine learning scheme to current state data of a current computational state of the plurality of resources and the predicted resource profile generated by the first machine learning scheme.
-
Publication number: US11755291B1
Publication date: 2023-09-12
Application number: US18104020
Filing date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Jianzhun Du, Orestis Kostakis, Kristopher Wagner, Yijun Xie
Abstract: The subject technology identifies a set of functions in a set of files corresponding to a library. The subject technology, for each function, registers the function as a user defined function (UDF) based on a set of input parameters utilized by the function and a type of parameter of each of the input parameters. The subject technology provides access to each registered function in a different application.
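The registration step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `UdfSpec`, `register_library_functions`, and the stand-in `sample_library` module are hypothetical names introduced here, and the "registry" is only an in-memory dictionary rather than a data platform's UDF catalog. The sketch assumes each function's input parameters and parameter types can be read from its Python signature.

```python
import inspect
from dataclasses import dataclass
from types import ModuleType
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class UdfSpec:
    """Hypothetical record describing one registered function."""
    name: str
    func: Callable
    params: Tuple[Tuple[str, str], ...]  # (parameter name, type name) pairs


def register_library_functions(library: ModuleType) -> Dict[str, UdfSpec]:
    """Identify public functions in a module and register each by signature."""
    registry: Dict[str, UdfSpec] = {}
    for name, func in inspect.getmembers(library, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        params = []
        for p in inspect.signature(func).parameters.values():
            if p.annotation is inspect.Parameter.empty:
                type_name = "any"  # assumption: untyped parameters accept anything
            else:
                type_name = getattr(p.annotation, "__name__", str(p.annotation))
            params.append((p.name, type_name))
        registry[name] = UdfSpec(name, func, tuple(params))
    return registry


# Stand-in "library" built in memory so the sketch is self-contained.
sample_library = ModuleType("sample_library")


def add_tax(amount: float, rate: float) -> float:
    return amount * (1.0 + rate)


def shout(text: str) -> str:
    return text.upper()


sample_library.add_tax = add_tax
sample_library.shout = shout

udfs = register_library_functions(sample_library)
for spec in udfs.values():
    print(spec.name, spec.params)
```

A different application could then look up `udfs["shout"].func` and call it, which loosely mirrors the abstract's final step of providing access to each registered function.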
-
Publication number: US11651287B1
Publication date: 2023-05-16
Application number: US17816421
Filing date: 2022-07-31
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Justin Langseth
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Embodiments of the present disclosure may provide a data sharing system implemented as a local application in a consumer database of a distributed database. The local application can include a training function and a scoring function to train a machine learning model on provider and consumer data, and generate output data by applying the trained machine learning model on input data. The input data can include data portions from a consumer database and a provider database that are joined to create a joined dataset for scoring.
-
Publication number: US11188528B1
Publication date: 2021-11-30
Application number: US17241745
Filing date: 2021-04-27
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis
IPC: G06F16/24, G06F16/245, G06F11/36
Abstract: Disclosed herein are systems and methods for rapid detection of software bugs in data platforms. One embodiment takes the form of a method that includes a comment-analysis system of a data platform receiving query comments associated with a query that was submitted to the data platform. The data platform determines that the query comments include a reference to a software bug of the data platform, and responsively causes one or more software-bug alerts pertaining to the software bug to be transmitted to one or more endpoints.
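As a rough illustration of the comment-analysis flow in this abstract, the Python sketch below extracts SQL comments from a query, looks for bug references, and sends one alert per reference to each endpoint. The `BUG-<number>` reference format, the regular expressions, and the function names are all assumptions made for this example; the abstract does not specify how references are recognized or how alerts are delivered. The comment regex is deliberately naive and would misread `--` or `/*` appearing inside string literals.

```python
import re
from typing import Callable, Iterable, List

# Naive SQL comment matcher: "-- to end of line" or "/* ... */".
COMMENT_RE = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)
# Assumed reference format: "bug" followed by optional separators and digits.
BUG_REF_RE = re.compile(r"\bbug[\s#-]*(\d+)\b", re.IGNORECASE)


def extract_comments(query: str) -> List[str]:
    """Return the raw text of every comment found in the query."""
    return [m.group(0) for m in COMMENT_RE.finditer(query)]


def find_bug_references(query: str) -> List[str]:
    """Return normalized bug references found in query comments, in order."""
    refs: List[str] = []
    for comment in extract_comments(query):
        for m in BUG_REF_RE.finditer(comment):
            ref = f"BUG-{m.group(1)}"
            if ref not in refs:
                refs.append(ref)
    return refs


def alert_on_bug_references(
    query: str, endpoints: Iterable[Callable[[str], None]]
) -> List[str]:
    """Send one alert per referenced bug to every endpoint callable."""
    refs = find_bug_references(query)
    targets = list(endpoints)
    for ref in refs:
        for send in targets:
            send(f"Query comment references {ref}")
    return refs


example_query = (
    "SELECT 1 -- workaround for BUG-1234\n"
    "/* see bug #77 */ FROM t WHERE note = 'bug 5'"
)
alert_on_bug_references(example_query, [print])
```

Endpoints are modeled as plain callables so that a pager, chat webhook, or ticketing client could be substituted without changing the detection logic.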
-
Publication number: US12222950B2
Publication date: 2025-02-11
Application number: US18085452
Filing date: 2022-12-20
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Timur Misirpashaev
IPC: G06F16/24, G06F16/2453, G06F16/2455, G06F16/2457
Abstract: A search engine of a data exchange may receive, from a user, a query comprising a set of search terms, and retrieve a set of data listings based on the search terms of the query. A data ranking module of the search engine may analyze each of the set of retrieved data listings to determine, for each of the set of retrieved data listings, a set of listing-specific signals and a set of external signals. Listing-specific signals may correspond to attributes or characteristics of data/content within a data listing, while external signals may correspond to a measure of activity in the data exchange that involves a data listing. Based on the listing-specific signals and the external signals analyzed for each retrieved data listing, the set of retrieved data listings may be ordered and presented to the user.
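The ordering step in this abstract can be sketched in Python as a weighted combination of one listing-specific signal and one external signal. Everything concrete here is an assumption for illustration: the `Listing` fields, the term-match fraction used as the listing-specific signal, the log-scaled `recent_requests` count used as the external signal, and the weights. The abstract does not state which signals are used or how they are combined.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Listing:
    title: str
    description: str
    recent_requests: int  # hypothetical measure of activity in the exchange


def listing_signal(listing: Listing, terms: Sequence[str]) -> float:
    """Fraction of search terms that appear in the listing's own text."""
    if not terms:
        return 0.0
    words = set(f"{listing.title} {listing.description}".lower().split())
    return sum(1 for t in terms if t.lower() in words) / len(terms)


def external_signal(listing: Listing) -> float:
    """Log-scaled activity, so very popular listings do not dominate."""
    return math.log1p(listing.recent_requests)


def rank_listings(
    listings: Sequence[Listing],
    terms: Sequence[str],
    w_listing: float = 1.0,   # assumed weight
    w_external: float = 0.1,  # assumed weight
) -> List[Listing]:
    """Order listings by the weighted sum of both signals, best first."""
    def score(listing: Listing) -> float:
        return (w_listing * listing_signal(listing, terms)
                + w_external * external_signal(listing))
    return sorted(listings, key=score, reverse=True)


catalog = [
    Listing("Weather history", "daily weather by city", 10),
    Listing("Retail sales", "weekly sales", 1000),
    Listing("Weather forecasts", "hourly weather forecasts", 500),
]
ranked = rank_listings(catalog, ["weather", "forecasts"])
print([item.title for item in ranked])
```

With these assumed weights, a listing matching both terms outranks one that is merely popular, while activity still breaks ties among weaker text matches.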
-
Publication number: US20240411738A1
Publication date: 2024-12-12
Application number: US18810750
Filing date: 2024-08-21
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Prasanna V. Krishnan, Subramanian Muralidhar Muralidhar, Shakhina Pulatova, Megan Marie Schoendorf
IPC: G06F16/215, G06F16/176, G06F16/2457, G06F16/25
Abstract: A set of affinity characteristics may be determined for a set of listings, a listing comprising data to be shared through a data exchange, wherein the set of affinity characteristics comprises at least one of: operations performed against a listing; account details and characteristics specific to the listing; static characteristics; or dynamic characteristics. For each pair of listings of the set of listings, an affinity score can be calculated using the set of affinity characteristics, the affinity score indicating a similarity between the pair of listings. One or more listings of the set of listings based on the affinity score between the first listing and the similar listing can be presented.
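A pairwise affinity score of the kind this abstract describes can be sketched in Python as follows. The sketch assumes each listing's affinity characteristics are reduced to a set of string tags and uses Jaccard similarity as the score; both choices, along with the names `affinity_scores` and `most_similar`, are illustrative assumptions and are not taken from the patent.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared characteristics divided by all characteristics of the pair."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def affinity_scores(
    listings: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of listings, keyed by sorted name pair."""
    return {
        (x, y): jaccard(listings[x], listings[y])
        for x, y in combinations(sorted(listings), 2)
    }


def most_similar(
    name: str, listings: Dict[str, Set[str]], top_k: int = 1
) -> List[str]:
    """Return the top_k listings with the highest affinity to `name`."""
    candidates = [
        (jaccard(listings[name], chars), other)
        for other, chars in listings.items()
        if other != name
    ]
    # Highest score first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in candidates[:top_k]]


example_listings = {
    "weather": {"geo", "daily", "free"},
    "climate": {"geo", "daily", "paid"},
    "stocks": {"finance", "paid"},
}
scores = affinity_scores(example_listings)
print(scores)
print(most_similar("weather", example_listings))
```

Scoring all pairs is quadratic in the number of listings, so a real catalog would likely need candidate pruning or precomputation; that concern is outside this sketch.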
-
Publication number: US12020128B2
Publication date: 2024-06-25
Application number: US18162695
Filing date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Justin Langseth
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A method includes installing, in a consumer database account, a shared-instance database that includes a shared instance of a provider-account database that resides in a provider database account. The shared-instance database includes a first schema that includes provider-account training data, provider-account scoring data, a training function, and a scoring function. The method also includes invoking the training function from the consumer database account, which results in creation in the consumer database account of a second schema that includes a machine-learning-model instance of a machine learning model, and which also results in training the machine-learning model instance with at least the provider-account training data. Additionally, the method includes generating consumer-account scoring data by inputting, into the trained machine-learning-model instance, consumer-account input data that is stored in the consumer database account. The method also includes storing the consumer-account scoring data in the consumer database account.
-
Publication number: US20240202203A1
Publication date: 2024-06-20
Application number: US18085452
Filing date: 2022-12-20
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Timur Misirpashaev
IPC: G06F16/2457, G06F16/2453, G06F16/2455
CPC classification number: G06F16/24578, G06F16/24542, G06F16/24564
Abstract: A search engine of a data exchange may receive, from a user, a query comprising a set of search terms, and retrieve a set of data listings based on the search terms of the query. A data ranking module of the search engine may analyze each of the set of retrieved data listings to determine, for each of the set of retrieved data listings, a set of listing-specific signals and a set of external signals. Listing-specific signals may correspond to attributes or characteristics of data/content within a data listing, while external signals may correspond to a measure of activity in the data exchange that involves a data listing. Based on the listing-specific signals and the external signals analyzed for each retrieved data listing, the set of retrieved data listings may be ordered and presented to the user.
-
Publication number: US20230409968A1
Publication date: 2023-12-21
Application number: US18162695
Filing date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Justin Langseth
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A method includes installing, in a consumer database account, a shared-instance database that includes a shared instance of a provider-account database that resides in a provider database account. The shared-instance database includes a first schema that includes provider-account training data, provider-account scoring data, a training function, and a scoring function. The method also includes invoking the training function from the consumer database account, which results in creation in the consumer database account of a second schema that includes a machine-learning-model instance of a machine learning model, and which also results in training the machine-learning model instance with at least the provider-account training data. Additionally, the method includes generating consumer-account scoring data by inputting, into the trained machine-learning-model instance, consumer-account input data that is stored in the consumer database account. The method also includes storing the consumer-account scoring data in the consumer database account.
-
Publication number: US20230385286A1
Publication date: 2023-11-30
Application number: US18162688
Filing date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Matthew J. Glickman, Orestis Kostakis, Justin Langseth
IPC: G06F16/2455, G06F16/242
CPC classification number: G06F16/24568, G06F16/24564, G06F16/244, G06F16/2456
Abstract: A system for generating similarity data for different datasets in a cloud data platform is provided. A first dataset of a plurality of datasets on the cloud data platform is identified, where the first dataset is associated with a first user of the cloud data platform. A semantic type for each feature of the first dataset is identified, and each semantic type for the first dataset is compared with existing data of the first user. Semantic types for each feature of each dataset are identified, and each semantic type for the first dataset is compared to each semantic type of each dataset. Overlap requests are generated to output overlap datasets between the first dataset and each of the plurality of datasets. A results dataset is generated by applying the overlap requests to a joined dataset comprising data from the first dataset and data from each of the plurality of datasets.