Patent search ap:("Snowflake Inc.") AND inv:"Orestis Kostakis" Page 2

11.

Granted invention patent
Overlap queries on a distributed database (In force)

Publication number: US12008001B2

Publication date: 2024-06-11

Application number: US17804434

Application date: 2022-05-27

Applicant: Snowflake Inc.

Inventor: Matthew J. Glickman, Orestis Kostakis, Justin Langseth

IPC: G06F16/245, G06F16/24, G06F16/242, G06F16/2455

CPC classification number: G06F16/24568, G06F16/244, G06F16/2456, G06F16/24564

Abstract: Systems, methods, and machine-readable storage devices provide for identifying a user dataset on a distributed database. The system includes generating a similarity score dataset that indicates a similarity between the user dataset and a plurality of datasets of other users of the distributed database. The system generates a plurality of overlap queries that are configured to output overlap datasets between the user dataset and one or more of the plurality of datasets. The system further generates a results dataset by applying one or more of the plurality of overlap queries to a joined dataset comprising data from the user dataset and one of the plurality of datasets of other users on the distributed database.

12.

Granted invention patent
Handling system-characteristics drift in machine learning applications (In force)

Publication number: US11934927B2

Publication date: 2024-03-19

Application number: US18087518

Application date: 2022-12-22

Applicant: SNOWFLAKE INC.

Inventor: Orestis Kostakis, Qiming Jiang, Boxin Jiang

IPC: G06N20/00, G06F16/24

CPC classification number: G06N20/00, G06F16/24

Abstract: Systems and methods for managing input and output error of a machine learning (ML) model in a database system are presented herein. A set of test queries is executed on a first version of a database system to generate first test data, wherein the first version of the system comprises a ML model to generate an output corresponding to a function of the database system. An error model is trained based on the first test data and second test data generated based on a previous version of the system. The error model determines an error associated with the ML model between the first and previous versions of the system. The first version of the system is deployed with the error model, which corrects an output or an input of the ML model until sufficient data has been produced by the error model to retrain the ML model.

13.

Granted invention patent
Predictive resource allocation for distributed query execution (In force)

Publication number: US11880364B2

Publication date: 2024-01-23

Application number: US17157233

Application date: 2021-01-25

Applicant: Snowflake Inc.

Inventor: Qiming Jiang, Orestis Kostakis

IPC: G06F16/2453, G06F16/2455, G06N20/00

CPC classification number: G06F16/24542, G06F16/2455, G06N20/00

Abstract: The subject technology receives a query directed to a set of source tables, each source table organized into a set of micro-partitions. The subject technology determines a set of metadata, the set of metadata comprising table metadata, query metadata, and historical data related to the query. The subject technology predicts, using a machine learning model, an indicator of an amount of computing resources for executing the query based at least in part on the set of metadata. The subject technology generates a query plan for executing the query based at least in part on the predicted indicator of the amount of computing resources. The subject technology executes the query based at least in part on the query plan.

14.

Published invention application
SIMILARITY-BASED LISTING RECOMMENDATIONS IN A DATA EXCHANGE (Under examination, published)

Publication number: US20230401185A1

Publication date: 2023-12-14

Application number: US18112934

Application date: 2023-02-22

Applicant: Snowflake Inc.

Inventor: Orestis Kostakis, Prasanna V. Krishnan, Subramanian Muralidhar, Shakhina Pulatova, Megan Marie Schoendorf

IPC: G06F16/215, G06F16/2457, G06F16/25, G06F16/176

CPC classification number: G06F16/215, G06F16/24578, G06F16/256, G06F16/176

Abstract: A set of affinity metrics may be determined for a set of listings, each listing of the set of listings comprising data to be shared through a data exchange, wherein the set of affinity metrics includes a set of characteristics allowing identification of a listing having one or more characteristics in the set of characteristics. For each pair of listings of the set of listings, an affinity score can be calculated, using the set of affinity metrics, and stored as part of the record in an affinity store. One or more listings of the set of listings using the affinity score between the first listing of the set of listings and the one or more listings of the set of listings can be presented.

15.

Published invention application
MULTIPLE USER DEFINED FUNCTIONS REGISTRATION (Under examination, published)

Publication number: US20230393816A1

Publication date: 2023-12-07

Application number: US18362114

Application date: 2023-07-31

Applicant: Snowflake Inc.

Inventor: Jianzhun Du, Orestis Kostakis, Kristopher Wagner, Yijun Xie

IPC: G06F8/30, G06F9/54

CPC classification number: G06F8/315, G06F9/543

Abstract: The subject technology identifies a set of functions included in a set of files corresponding to a library. The subject technology, for each function in the set of functions, registers the function as a user defined function (UDF). The subject technology generates a name for the function based at least in part on a predetermined prefix, wherein the predetermined prefix comprises an alphanumeric string. The subject technology generates, using at least a particular set of input parameters utilized by the function and a particular type of parameter of each input parameter of the particular set of input parameters, a particular set of source code. The subject technology stores information corresponding to the function in a metadata database. The subject technology provides access to the function in a different application.

16.

Published invention application
OVERLAP QUERIES ON A DISTRIBUTED DATABASE (Under examination, published)

Publication number: US20230385284A1

Publication date: 2023-11-30

Application number: US17804434

Application date: 2022-05-27

Applicant: Snowflake Inc.

Inventor: Matthew J. Glickman, Orestis Kostakis, Justin Langseth

IPC: G06F16/2455, G06F16/242

CPC classification number: G06F16/24568, G06F16/2456, G06F16/244, G06F16/24564

Abstract: Systems, methods, and machine-readable storage devices provide for identifying a user dataset on a distributed database. The system includes generating a similarity score dataset that indicates a similarity between the user dataset and a plurality of datasets of other users of the distributed database. The system generates a plurality of overlap queries that are configured to output overlap datasets between the user dataset and one or more of the plurality of datasets. The system further generates a results dataset by applying one or more of the plurality of overlap queries to a joined dataset comprising data from the user dataset and one of the plurality of datasets of other users on the distributed database.

17.

Invention application
BROWSER PLUG-IN FOR MARKETPLACE RECOMMENDATIONS (In force)

Publication number: US20250077591A1

Publication date: 2025-03-06

Application number: US18238986

Application date: 2023-08-28

Applicant: Snowflake Inc.

Inventor: Shuodong Dang, Orestis Kostakis

IPC: G06F16/9532, G06F9/445, G06F16/9538

Abstract: A data access event may be recognized, using a browser plug-in, wherein the data access event constitutes a reference to previously obtained data. As a result of recognizing the event, the plug-in may send, to a search engine of a data exchange, a set of extracted terms. The plug-in may receive a set of related data listings related to the set of extracted terms. Upon a selection of a data listing from the set of related data listings, the plug-in may install the data listing to an account.

18.

Invention application
QUERY-EXECUTION PLANNING USING REINFORCEMENT LEARNING (In force)

Publication number: US20250021558A1

Publication date: 2025-01-16

Application number: US18902195

Application date: 2024-09-30

Applicant: Snowflake Inc.

Inventor: Qiming Jiang, Orestis Kostakis, John Reumann

IPC: G06F16/2453, G06F16/27

Abstract: A method for improving query scheduling on a computing cluster using reinforcement learning is provided. A series of queries to be executed using resources of the computing cluster is received. For each query, a query execution plan is generated and a resource profile for executing the query is predicted. Current state data of the cluster resources is received and assignment data to execute the query on the cluster resources is generated by applying the reinforcement learning technique. The query is executed on the computing cluster based on the generated assignment data, and query results are stored.

19.

Granted invention patent
Software bugs detection using query analysis (In force)

Publication number: US11947533B2

Publication date: 2024-04-02

Application number: US18318293

Application date: 2023-05-16

Applicant: Snowflake Inc.

Inventor: Orestis Kostakis

IPC: G06F16/24, G06F16/245, G06F11/36

CPC classification number: G06F16/245, G06F11/362

Abstract: A method includes parsing, by at least one hardware processor, a query to determine query comments and query code associated with the query. A query execution plan is generated based on the query code. Query execution using the query code is performed at a first computing node associated with a query processing pipeline. A detection is made that the query comments are indicative of a software bug in the query code based on analysis of the query comments. The detection is performed at a second computing node associated with a query analysis pipeline. A notification of the software bug and a result of the query execution is output.

20.

Published invention application
AUTOMATED MACHINE LEARNING FOR NETWORK-BASED DATABASE SYSTEMS (Under examination, published)

Publication number: US20240062098A1

Publication date: 2024-02-22

Application number: US17821587

Application date: 2022-08-23

Applicant: Snowflake Inc.

Inventor: Rachel Frances Blum, Nancy Dou, Matthew J. Glickman, Boxin Jiang, Orestis Kostakis, Justin Langseth, Michael Earle Rainey, Haoran Yu

IPC: G06N20/00

CPC classification number: G06N20/00

Abstract: The subject technology receives first party training data provided by an end-user of a baseline machine learning model. The subject technology determines a first set of common features based on the first party training data. The subject technology receives, from at least one data source, a set of datasets. The subject technology determines a second set of common features based on the set of datasets. The subject technology trains, using the first set of common features and the second set of common features, a second machine learning model, the second machine learning model incorporating additional training data from the external data supplier during training compared to the baseline machine learning model. The subject technology generates a boosted machine learning model based at least in part on the training, the boosted machine learning model comprising the trained second machine learning model.
