-
Publication No.: US20240403373A1
Publication Date: 2024-12-05
Application No.: US18224443
Application Date: 2023-07-20
Applicant: Snowflake Inc.
Inventor: Robert K. Chao, Christophe Gaboury, Theodore Kent Hamilton, Neeraj Khanna, Orestis Kostakis, Adil Lalani, Justin Langseth, Haoyue Liu, Arun Muniyandi, Andriy Stasyuk, Xin Wen
IPC: G06F16/9532, G06F16/9538, G06F40/166, G06F40/242, G06F40/40
Abstract: A search engine of a data exchange may receive a query comprising a set of search terms, retrieve a plurality of data listings based on the search terms of the query, compare a first embedding generated by a large language model (LLM) from the search query to second embeddings generated by the LLM for each of the plurality of data listings to determine a respective relevance for each of the plurality of data listings to the search query, and rank the plurality of data listings based on the respective relevance for each of the plurality of data listings to the search query.
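The ranking step in this abstract can be illustrated with a minimal sketch. It assumes the LLM embeddings are already available as plain float vectors and that relevance is measured by cosine similarity; the abstract names neither a similarity measure nor a data layout, so the function names, listing names, and toy vectors below are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_listings(query_embedding, listing_embeddings):
    """Return (listing, relevance) pairs sorted from most to least relevant."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in listing_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for LLM output (assumption: real embeddings are much longer).
query_embedding = [1.0, 0.0, 1.0]
listing_embeddings = {
    "weather_daily": [0.9, 0.1, 0.8],
    "retail_sales": [0.0, 1.0, 0.1],
    "zero_vector": [0.0, 0.0, 0.0],
}

ranked = rank_listings(query_embedding, listing_embeddings)
for name, relevance in ranked:
    print(f"{name}: {relevance:.3f}")
```

In this sketch the retrieval by search terms is taken as already done, so `listing_embeddings` holds only the candidate listings that the term-based lookup returned.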
-
Publication No.: US12093229B2
Publication Date: 2024-09-17
Application No.: US18112934
Application Date: 2023-02-22
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis, Prasanna V. Krishnan, Subramanian Muralidhar, Shakhina Pulatova, Megan Marie Schoendorf
IPC: G06F16/00, G06F16/176, G06F16/215, G06F16/2457, G06F16/25
CPC classification number: G06F16/215, G06F16/176, G06F16/24578, G06F16/256
Abstract: A set of affinity metrics may be determined for a set of listings, each listing of the set of listings comprising data to be shared through a data exchange, wherein the set of affinity metrics includes a set of characteristics allowing identification of a listing having one or more characteristics in the set of characteristics. For each pair of listings of the set of listings, an affinity score can be calculated, using the set of affinity metrics, and stored as part of the record in an affinity store. One or more listings of the set of listings using the affinity score between the first listing of the set of listings and the one or more listings of the set of listings can be presented.
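As a rough illustration of the pairwise scoring described here, the sketch below treats each listing as a set of characteristics and uses Jaccard similarity as the affinity score. Both choices are assumptions: the abstract does not say how affinity metrics are combined, and the listing names, the `affinity_store` dictionary, and the `related` helper are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of characteristics two listings have in common (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical listings, each described by a set of characteristics.
listings = {
    "weather_daily": {"weather", "daily", "geo"},
    "weather_hourly": {"weather", "hourly", "geo"},
    "retail_sales": {"retail", "daily"},
}

# One affinity record per unordered pair of listings.
affinity_store = {
    frozenset(pair): jaccard(listings[pair[0]], listings[pair[1]])
    for pair in combinations(listings, 2)
}


def related(listing, store, top_n=2):
    """Listings paired with `listing`, highest affinity score first."""
    scored = [
        (next(iter(pair - {listing})), score)
        for pair, score in store.items()
        if listing in pair
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


print(related("weather_daily", affinity_store))
```

Keying the store by `frozenset` keeps one record per pair regardless of order, which matches the "for each pair of listings" wording.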
-
Publication No.: US20240296162A1
Publication Date: 2024-09-05
Application No.: US18659616
Application Date: 2024-05-09
Applicant: Snowflake Inc.
Inventor: Matthew J. Glickman, Orestis Kostakis, Justin Langseth
IPC: G06F16/2455, G06F16/242
CPC classification number: G06F16/24568, G06F16/244, G06F16/2456, G06F16/24564
Abstract: An advanced system for refining overlap queries in a database system based on user feedback. The system monitors interactions of a first user with a first dataset on the database system, where the first dataset is associated with the first user. Feedback regarding the quality of a results dataset, generated from an executed overlap query, is received from the first user. This feedback informs the generation of a similarity score dataset that enhances the creation of new overlap queries. These new overlap queries are designed to output refined overlap datasets between the first dataset and a second dataset associated with a second user. A new joined dataset is generated by executing these overlap queries, comprising data from both the first and second datasets. A new results dataset is generated, providing the first user with refined recommendations based on additional feedback.
-
Publication No.: US11836138B1
Publication Date: 2023-12-05
Application No.: US18162688
Application Date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Matthew J. Glickman, Orestis Kostakis, Justin Langseth
IPC: G06F16/245, G06F16/24, G06F16/2455, G06F16/242
CPC classification number: G06F16/24568, G06F16/244, G06F16/2456, G06F16/24564
Abstract: A system for generating similarity data for different datasets in a cloud data platform. A first dataset of a plurality of datasets on the cloud data platform is identified, where the first dataset is associated with a first user of the cloud data platform. A semantic type for each feature of the first dataset is identified, and each semantic type for the first dataset is compared with existing data of the first user. Semantic types for each feature of each dataset are identified, and each semantic type for the first dataset is compared to each semantic type of each dataset. Overlap requests are generated to output overlap datasets between the first dataset and each of the plurality of datasets. A results dataset is generated by applying the overlap requests to a joined dataset comprising data from the first dataset and data from each of the plurality of datasets.
-
Publication No.: US20230281196A1
Publication Date: 2023-09-07
Application No.: US18318293
Application Date: 2023-05-16
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis
IPC: G06F16/245, G06F11/36
CPC classification number: G06F16/245, G06F11/362
Abstract: A method includes parsing, by at least one hardware processor, a query to determine query comments and query code associated with the query. A query execution plan is generated based on the query code. Query execution using the query code is performed at a first computing node associated with a query processing pipeline. A detection is made that the query comments are indicative of a software bug in the query code based on analysis of the query comments. The detection is performed at a second computing node associated with a query analysis pipeline. A notification of the software bug and a result of the query execution is output.
-
Publication No.: US11726996B2
Publication Date: 2023-08-15
Application No.: US17654147
Application Date: 2022-03-09
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis
IPC: G06F16/24, G06F16/245, G06F11/36
CPC classification number: G06F16/245, G06F11/362
Abstract: Disclosed herein are embodiments of systems and methods for analyzing query comments for identifying potential software bugs. In an example, a data platform obtains query comments associated with a query. Based on determining that the query comments include a reference to a software bug of the data platform, the data platform generates a software-bug alert based on the query comments, and transmits the software-bug alert to an endpoint.
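One simple way to picture the comment analysis is a keyword scan over SQL comments, sketched below. The abstract does not say how a reference to a software bug is recognized, so the keyword list, the alert structure, and the function names are assumptions made for illustration. The regular expression is also deliberately naive: it does not account for comment markers that appear inside string literals.

```python
import re

# Matches "-- line comments" and "/* block comments */".
COMMENT_PATTERN = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)

# Hypothetical phrases treated as hints that a comment refers to a platform bug.
BUG_HINTS = ("bug", "workaround", "wrong result", "regression")


def extract_comments(query):
    """Return every comment found in the query text."""
    return [match.group(0) for match in COMMENT_PATTERN.finditer(query)]


def bug_alert(query):
    """Build an alert if any comment contains a bug hint, else return None."""
    flagged = [
        comment
        for comment in extract_comments(query)
        if any(hint in comment.lower() for hint in BUG_HINTS)
    ]
    if not flagged:
        return None
    return {"type": "software-bug-alert", "comments": flagged}


query = """
SELECT id, total  -- workaround: SUM returns wrong result on empty groups
FROM orders       /* plain note */
"""

alert = bug_alert(query)
print(alert)
```

In a real deployment the returned alert would be transmitted to an endpoint, as the abstract describes; the sketch stops at building it.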
-
Publication No.: US11620110B1
Publication Date: 2023-04-04
Application No.: US17834668
Application Date: 2022-06-07
Applicant: Snowflake Inc.
Inventor: Jianzhun Du, Orestis Kostakis, Kristopher Wagner, Yijun Xie
Abstract: The subject technology receives a set of files corresponding to a library, the library comprising a set of functions included in the set of files. The subject technology parses the set of files. The subject technology identifies a set of functions in the set of files based on the parsing. The subject technology, for each function, registers the function as a user defined function (UDF) based on a set of input parameters utilized by the function and a type of parameter of each of the input parameters. The subject technology provides access to each registered function in a different application.
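The parse-then-register flow can be sketched with Python's standard `ast` module. This is only an illustration under stated assumptions: the library is a single Python source string, the "registration" is an in-memory dictionary rather than a real UDF catalog, and `LIBRARY_SOURCE` and `discover_functions` are hypothetical names. `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Hypothetical library file contents.
LIBRARY_SOURCE = '''
def add(a: int, b: int) -> int:
    return a + b

def shout(text: str) -> str:
    return text.upper()
'''


def discover_functions(source):
    """Map each top-level function name to its (parameter, type) pairs."""
    tree = ast.parse(source)
    found = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            found[node.name] = [
                # Unannotated parameters are recorded as "any" (an assumption).
                (arg.arg, ast.unparse(arg.annotation) if arg.annotation else "any")
                for arg in node.args.args
            ]
    return found


# Stand-in for UDF registration: signatures keyed by function name.
registry = discover_functions(LIBRARY_SOURCE)
for name, params in registry.items():
    print(name, params)
```

Parsing the source instead of importing it means no library code runs during discovery, which is one plausible reason to parse files first.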
-
Publication No.: US20220237192A1
Publication Date: 2022-07-28
Application No.: US17157233
Application Date: 2021-01-25
Applicant: Snowflake Inc.
Inventor: Qiming Jiang, Orestis Kostakis
IPC: G06F16/2453, G06N20/00, G06F16/2455
Abstract: The subject technology receives a query directed to a set of source tables, each source table organized into a set of micro-partitions. The subject technology determines a set of metadata, the set of metadata comprising table metadata, query metadata, and historical data related to the query. The subject technology predicts, using a machine learning model, an indicator of an amount of computing resources for executing the query based at least in part on the set of metadata. The subject technology generates a query plan for executing the query based at least in part on the predicted indicator of the amount of computing resources. The subject technology executes the query based at least in part on the query plan.
-
Publication No.: US11243811B1
Publication Date: 2022-02-08
Application No.: US17390265
Application Date: 2021-07-30
Applicant: Snowflake Inc.
Inventor: Qiming Jiang, Orestis Kostakis, Abdul Munir, Prayag Chandran Nirmala, Jeffrey Rosen
IPC: G06F9/46, G06F9/50, G06F16/2455, G06N5/04, G06N20/00
Abstract: The subject technology requests information related to usage history metadata from a metadata database. The subject technology receives the requested information from the metadata database, the requested information comprising information related to user demand. The subject technology predicts a size value indicating an amount of computing resources to request for executing a set of queries based on the usage history metadata. The subject technology determines, during a prefetch window of time within a first period of time, a current size of a freepool of computing resources. The subject technology, in response to the current size of the freepool of computing resources being smaller than the predicted size value, sends a request for additional computing resources to include in the freepool of computing resources.
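The top-up decision at the end of this abstract reduces to a comparison between a predicted size and the current freepool size. The sketch below assumes the prediction is a mean of past demand scaled by a headroom factor; the abstract only says the value is predicted from usage history, so the averaging rule, the `headroom` parameter, and both function names are hypothetical.

```python
import math


def predict_size(usage_history, headroom=1.25):
    """Predicted freepool size: mean past demand scaled by headroom, rounded up."""
    if not usage_history:
        return 0
    mean_demand = sum(usage_history) / len(usage_history)
    return math.ceil(mean_demand * headroom)


def additional_needed(current_freepool_size, predicted_size):
    """Resources to request so the freepool reaches the predicted size."""
    return max(0, predicted_size - current_freepool_size)


# Hypothetical demand observed in earlier periods, in units of compute nodes.
usage_history = [8, 10, 12, 10]

predicted = predict_size(usage_history)
request = additional_needed(current_freepool_size=9, predicted_size=predicted)
print(f"predicted={predicted}, request={request}")
```

When the freepool already meets or exceeds the prediction, `additional_needed` returns 0 and no request would be sent.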
-
Publication No.: US12242550B1
Publication Date: 2025-03-04
Application No.: US18238986
Application Date: 2023-08-28
Applicant: Snowflake Inc.
Inventor: Shuodong Dang, Orestis Kostakis
IPC: G06F16/9532, G06F9/445, G06F16/9538
Abstract: A data access event may be recognized, using a browser plug-in, wherein the data access event constitutes a reference to previously obtained data. As a result of recognizing the event, the plug-in may send, to a search engine of a data exchange, a set of extracted terms. The plug-in may receive a set of related data listings related to the set of extracted terms. Upon a selection of a data listing from the set of related data listings, the plug-in may install the data listing to an account.