-
Publication No.: US20240354315A1
Publication date: 2024-10-24
Application No.: US18302234
Application date: 2023-04-18
Applicant: SNOWFLAKE INC.
Inventor: Varun Ganesh , Alvin E. Jou , Donghe Kang , Ryan Michael Thomas Shelly , Jiaqi Yan , Yizhi Zhu
IPC: G06F16/28 , G06F16/2455
CPC classification number: G06F16/285 , G06F16/24556
Abstract: A method for selecting micro-partitions for a clustering operation includes: storing table data in a plurality of micro-partitions of a storage device, wherein each of the plurality of micro-partitions comprises a portion of the table data, wherein subsets of the plurality of micro-partitions are associated with a respective one of a plurality of expression property (EP) files, and wherein each of the plurality of EP files comprises an EP data region that represents the portions of the table data of the subset of the plurality of micro-partitions associated with the EP file; determining sub-ranges of the table data based on the EP data regions of the plurality of EP files; selecting a subset of the plurality of EP files for a clustering operation based on the sub-ranges of the table data; and performing the clustering operation on the micro-partitions associated with the subset of the EP files.
-
Publication No.: US20240346334A1
Publication date: 2024-10-17
Application No.: US18756936
Application date: 2024-06-27
Applicant: SNOWFLAKE INC.
Inventor: David Jensen
IPC: G06N5/01 , G06F16/2455
CPC classification number: G06N5/01 , G06F16/24564
Abstract: An approach is disclosed that determines a path through multiple levels of a generalization lattice. The path includes multiple nodes corresponding to the multiple levels, and each of the nodes is determined from a scoring function that utilizes a corresponding parent node that was previously added to the path. The approach then selects an optimal node from the nodes in the path.
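Illustrative sketch (not taken from the patent): one way to read the abstract is as a greedy walk up the lattice, where each step scores the children of the node most recently added to the path, followed by a selection among the path nodes. The node encoding (one generalization level per attribute), `toy_score`, and `first_meeting_threshold` are hypothetical stand-ins chosen for this example; the abstract does not specify the scoring function or the selection criterion.

```python
from typing import Callable, List, Tuple

Node = Tuple[int, ...]  # assumed encoding: one generalization level per attribute


def children(node: Node, max_levels: Node) -> List[Node]:
    """Nodes one lattice level up: raise exactly one attribute by one step."""
    return [
        node[:i] + (node[i] + 1,) + node[i + 1:]
        for i in range(len(node))
        if node[i] < max_levels[i]
    ]


def greedy_path(max_levels: Node, score: Callable[[Node, Node], float]) -> List[Node]:
    """Build one path from the bottom of the lattice to the top.

    At each level, the next node is the child of the previously added
    node (its parent on the path) with the lowest score.
    """
    path: List[Node] = [tuple(0 for _ in max_levels)]
    while True:
        parent = path[-1]
        candidates = children(parent, max_levels)
        if not candidates:
            return path
        path.append(min(candidates, key=lambda child: score(child, parent)))


def toy_score(child: Node, parent: Node) -> float:
    """Hypothetical cost: generalizing the second attribute is 3x as costly."""
    weights = (1.0, 3.0)
    return sum(w * (c - p) for w, c, p in zip(weights, child, parent))


def first_meeting_threshold(path: List[Node], min_total: int) -> Node:
    """Hypothetical selection rule: first path node with enough total generalization."""
    return next(node for node in path if sum(node) >= min_total)
```

With `max_levels = (2, 1)`, the walk visits one node per lattice level instead of scoring every node in the lattice, which is the point of restricting the search to a single path.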
-
Publication No.: US20240338521A1
Publication date: 2024-10-10
Application No.: US18629693
Application date: 2024-04-08
Applicant: Snowflake Inc.
Inventor: Michal Gdak , Ganeshan Ramachandran Iyer , Tomasz Malisz , Mikolaj Niedbala , Pawel Pollak , Saurin Shah , Jan Tomasz Topinski
IPC: G06F40/226
CPC classification number: G06F40/226
Abstract: Systems and methods for: processing a current electronic document, using a set of machine-learning (ML) models, to extract a set of values for a set of data points based on a schema, where the schema describes the set of data points to be extracted from electronic documents; determining whether to select the current electronic document for human validation based on the schema; and adding the current electronic document to a human validation queue in response to determining to select the current electronic document for human validation based on the schema.
-
Publication No.: US20240330319A1
Publication date: 2024-10-03
Application No.: US18738875
Application date: 2024-06-10
Applicant: Snowflake Inc.
Inventor: Benoit Dageville , Thierry Cruanes , Marcin Zukowski
IPC: G06F16/27 , A61F5/56 , G06F9/48 , G06F9/50 , G06F16/14 , G06F16/182 , G06F16/21 , G06F16/22 , G06F16/23 , G06F16/2453 , G06F16/2455 , G06F16/2458 , G06F16/25 , G06F16/28 , G06F16/951 , G06F16/9535 , G06F16/9538 , H04L67/1095 , H04L67/1097 , H04L67/568
CPC classification number: G06F16/273 , A61F5/566 , G06F9/4881 , G06F9/5016 , G06F9/5044 , G06F9/5083 , G06F9/5088 , G06F16/148 , G06F16/1827 , G06F16/211 , G06F16/221 , G06F16/2365 , G06F16/24532 , G06F16/24545 , G06F16/24552 , G06F16/2456 , G06F16/2471 , G06F16/254 , G06F16/27 , G06F16/283 , G06F16/951 , G06F16/9535 , G06F16/9538 , H04L67/1095 , H04L67/1097 , H04L67/568
Abstract: Example resource management systems and methods are described. In one implementation, a system includes a memory and a processing device operatively coupled to the memory. The processing device is to receive a query referencing database data stored in a storage platform, determine a task associated with processing the received query, and create an execution node comprising cache resources and processing resources. Furthermore, a size of the cache resources of the execution node is determined upon creation of the execution node, based at least in part on the task, and processing resources of the execution node are determined upon creation of the execution node, based at least in part on the task. The execution node is included within a plurality of execution nodes to process the task associated with processing the received query.
-
Publication No.: US12105828B2
Publication date: 2024-10-01
Application No.: US18227818
Application date: 2023-07-28
Applicant: Snowflake Inc.
Inventor: Vikas Jain , Eric Karlson , Sepideh Khoshnood
CPC classification number: G06F21/6227 , G06F21/604 , G06F21/6218 , H04L63/10 , H04L63/102 , H04L63/105 , H04L63/101 , H04L63/104 , H04L63/107
Abstract: Embodiments of the present disclosure provide systems and methods for using inherited grants to grant privileges to objects in a container. An inherited grant may be generated that specifies a permission on a first type of object in a container and a grant of the permission to a role. The inherited grant may be attached to the container, wherein the container includes a set of objects of the first type. In response to a first object of the set of objects being referenced via the role, a virtual implied grant may be created based on the inherited grant. Authorization of utilization of the permission on the first object is performed using the virtual implied grant, wherein the virtual implied grant is transient and exists in-memory only for the purpose of authorizing the utilization of the permission on the first object.
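Illustrative sketch (not taken from the patent): the flow in the abstract can be pictured as a container that stores only the inherited grant, with a short-lived implied grant built in memory at the moment an object is referenced through a role. All class, field, and function names below are hypothetical; the abstract does not describe a concrete data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class InheritedGrant:
    """Attached to a container: a permission on every object of one type, for one role."""
    permission: str
    object_type: str
    role: str


@dataclass(frozen=True)
class ImpliedGrant:
    """Transient, per-object grant that exists only while authorizing."""
    permission: str
    object_name: str
    role: str


@dataclass
class Container:
    objects: Dict[str, str]  # hypothetical layout: object name -> object type
    inherited_grants: List[InheritedGrant] = field(default_factory=list)


def implied_grant_for(
    container: Container, object_name: str, role: str, permission: str
) -> Optional[ImpliedGrant]:
    """Derive an implied grant from a matching inherited grant, if any."""
    object_type = container.objects.get(object_name)
    if object_type is None:
        return None
    for grant in container.inherited_grants:
        if (grant.object_type, grant.role, grant.permission) == (
            object_type, role, permission,
        ):
            return ImpliedGrant(permission, object_name, role)
    return None


def authorize(container: Container, object_name: str, role: str, permission: str) -> bool:
    """Authorize using the implied grant; it is discarded when this call returns."""
    return implied_grant_for(container, object_name, role, permission) is not None
```

In this sketch nothing per-object is ever written back to the container, so adding a new table to the container makes it covered by the existing inherited grant without any further bookkeeping.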
-
Publication No.: US20240320202A1
Publication date: 2024-09-26
Application No.: US18678357
Application date: 2024-05-30
Applicant: Snowflake Inc.
Inventor: Vlad Bunescu , Joshua Klahr , Louis Magarshack , Shiyu Qu , Zerui Wei , Jiaqi Yan
CPC classification number: G06F16/2228 , G06F16/254
Abstract: Methods, systems, and computer programs are described for tracking evaluation of workload stability through performance indexing. A plurality of metric source data is received by at least one hardware processor. Based on this data, a workload is identified as a stable workload candidate. A performance index is then generated, reflecting the characteristics of the identified stable workload candidate. The performance index is continuously tracked over a period of time, enabling the detection and analysis of any modifications to the workload and the subsequent impact on system performance.
-
Publication No.: US12101294B2
Publication date: 2024-09-24
Application No.: US18341954
Application date: 2023-06-27
Applicant: Snowflake Inc.
Inventor: Robert Bengt Benedikt Gernhardt , Mikhail Kazhamiaka , Nithin Mahesh , Eric Robinson
IPC: H04L9/40
CPC classification number: H04L63/0218 , H04L63/0236 , H04L63/0245
Abstract: Different database deployments, or other data system deployments, may want to communicate with each other without sacrificing security or control. To this end, embodiments of the present disclosure may provide secure message exchange techniques for a source and/or target deployment. Configurable rule sets may be stored in the deployments; the rule sets may define what messages may be communicated between deployments. The deployments may implement a selective filtering scheme in one or more stages based on the rule sets to filter outgoing and/or incoming messages.
-
Publication No.: US12093229B2
Publication date: 2024-09-17
Application No.: US18112934
Application date: 2023-02-22
Applicant: Snowflake Inc.
Inventor: Orestis Kostakis , Prasanna V. Krishnan , Subramanian Muralidhar , Shakhina Pulatova , Megan Marie Schoendorf
IPC: G06F16/00 , G06F16/176 , G06F16/215 , G06F16/2457 , G06F16/25
CPC classification number: G06F16/215 , G06F16/176 , G06F16/24578 , G06F16/256
Abstract: A set of affinity metrics may be determined for a set of listings, each listing of the set of listings comprising data to be shared through a data exchange, wherein the set of affinity metrics includes a set of characteristics allowing identification of a listing having one or more characteristics in the set of characteristics. For each pair of listings of the set of listings, an affinity score can be calculated, using the set of affinity metrics, and stored as part of the record in an affinity store. One or more listings of the set of listings using the affinity score between the first listing of the set of listings and the one or more listings of the set of listings can be presented.
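Illustrative sketch (not taken from the patent): the pairwise scoring and lookup described in the abstract could look like the following, assuming each listing is reduced to a set of characteristic tags and using Jaccard similarity as a stand-in affinity score. The abstract does not name a specific metric, and `build_affinity_store` and `related_listings` are hypothetical helper names.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Stand-in affinity score: shared characteristics over all characteristics."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_affinity_store(listings: Dict[str, Set[str]]) -> Dict[FrozenSet[str], float]:
    """Compute one score per unordered pair of listings and keep it in a store."""
    return {
        frozenset((x, y)): jaccard(listings[x], listings[y])
        for x, y in combinations(listings, 2)
    }


def related_listings(
    store: Dict[FrozenSet[str], float], listing: str, top_n: int = 3
) -> List[str]:
    """Listings to present next to `listing`, highest affinity first."""
    scored = [
        (score, next(iter(pair - {listing})))
        for pair, score in store.items()
        if listing in pair
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken by name
    return [name for score, name in scored[:top_n] if score > 0]
```

Keying the store by an unordered pair means each score is computed and stored once, and a lookup for either listing of the pair finds it.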
-
Publication No.: US12086287B2
Publication date: 2024-09-10
Application No.: US17980371
Application date: 2022-11-03
Applicant: SNOWFLAKE INC.
Inventor: David Jensen , Joseph David Jensen
CPC classification number: G06F21/6254 , G06F16/221 , G06F16/282 , G06F21/6227
Abstract: A method receives data from a data source. The method generates a plurality of generalizations of the data. The method sends the plurality of generalizations of the data to a plurality of execution nodes, wherein each of the plurality of execution nodes includes computational resources to compute a candidate generalization using an information loss scoring function. The method receives a candidate generalization from each of the plurality of execution nodes. The method selects a preferred generalization from the plurality of candidate generalizations. The method generates an anonymized view of the data set using the preferred generalization.
-
Publication No.: US20240296162A1
Publication date: 2024-09-05
Application No.: US18659616
Application date: 2024-05-09
Applicant: Snowflake Inc.
Inventor: Matthew J. Glickman , Orestis Kostakis , Justin Langseth
IPC: G06F16/2455 , G06F16/242
CPC classification number: G06F16/24568 , G06F16/244 , G06F16/2456 , G06F16/24564
Abstract: An advanced system for refining overlap queries in a database system based on user feedback. The system monitors interactions of a first user with a first dataset on the database system, where the first dataset is associated with the first user. Feedback regarding the quality of a results dataset, generated from an executed overlap query, is received from the first user. This feedback informs the generation of a similarity score dataset that enhances the creation of new overlap queries. These new overlap queries are designed to output refined overlap datasets between the first dataset and a second dataset associated with a second user. A new joined dataset is generated by executing these overlap queries, comprising data from both the first and second datasets. A new results dataset is generated, providing the first user with refined recommendations based on additional feedback.