-
Publication No.: US20210152553A1
Publication Date: 2021-05-20
Application No.: US16911850
Filing Date: 2020-06-25
Applicant: Snowflake Inc.
Inventor: Polita Paulus, Peter Povinec, Saurin Shah, Srinidhi Karthik Bisthavalli Srinivasa
Abstract: A command to load or unload data at a storage location is received. In response to the command, a storage integration object associated with the storage location is identified. The storage integration object identifies a cloud identity object that corresponds to a cloud identity that is associated with a proxy identity object corresponding to a proxy identity granted permission to access the storage location. The data is loaded or unloaded at the storage location by assuming the proxy identity.
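Illustrative sketch (not part of the patent record): the lookup chain in the abstract above, from storage location to storage integration object to cloud identity to proxy identity, might be modeled roughly as follows. All class, field, and function names here are hypothetical, and "assuming the proxy identity" is reduced to returning a descriptive string rather than calling any cloud provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyIdentity:
    # Hypothetical: the identity that was granted access to storage locations.
    name: str
    allowed_locations: frozenset


@dataclass(frozen=True)
class CloudIdentity:
    # Hypothetical: the cloud identity associated with a proxy identity.
    name: str
    proxy: ProxyIdentity


@dataclass(frozen=True)
class StorageIntegration:
    # Hypothetical: binds a storage location prefix to a cloud identity.
    name: str
    location_prefix: str
    cloud_identity: CloudIdentity


def find_integration(location, integrations):
    """Return the first integration whose prefix covers the location."""
    for integration in integrations:
        if location.startswith(integration.location_prefix):
            return integration
    raise LookupError(f"no storage integration for {location}")


def run_command(command, location, integrations):
    """Resolve the proxy identity for a load/unload command and 'assume' it."""
    if command not in ("load", "unload"):
        raise ValueError(f"unsupported command: {command}")
    integration = find_integration(location, integrations)
    proxy = integration.cloud_identity.proxy
    if not any(location.startswith(p) for p in proxy.allowed_locations):
        raise PermissionError(f"{proxy.name} may not access {location}")
    # A real system would obtain credentials here; this sketch only reports.
    return f"{command} at {location} as {proxy.name}"
```

The point of the indirection is that the command names only a location; the credentials used are derived from the objects bound to that location.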
-
Publication No.: US12118038B2
Publication Date: 2024-10-15
Application No.: US18063253
Filing Date: 2022-12-08
Applicant: Snowflake Inc.
Inventor: Elliott Brossard, Sukruth Komarla Sukumar, Isaac Kunen, Ju-Yi Kuo, Jonathan Lee Leang, Edward Ma, Schuyler James Manchester, Polita Paulus, Saurin Shah, Igor Zinkovsky
IPC: G06F16/00, G06F16/22, G06F16/2455, G06F16/901, G06F16/908, G06F16/955
CPC classification number: G06F16/9017, G06F16/2282, G06F16/24568, G06F16/908, G06F16/955
Abstract: A method includes decoding, by at least one hardware processor, a request for a user-defined function (UDF). The request includes a reference to one or more files. The method further includes generating, by the at least one hardware processor, the UDF based on the request. The UDF includes a file reference object with file path information corresponding to the reference. The file path information identifies a file path to the one or more files. A UDF call into the UDF is detected. The UDF call specifies the file path information. The UDF call is processed to generate result data using the one or more files.
-
Publication No.: US20240338577A1
Publication Date: 2024-10-10
Application No.: US18416379
Filing Date: 2024-01-18
Applicant: Snowflake Inc.
Inventor: Michal Gdak, Ganeshan Ramachandran Iyer, Tomasz Malisz, Mikolaj Niedbala, Pawel Pollak, Saurin Shah, Jan Tomasz Topinski, Daria Wieteska
IPC: G06N5/022
CPC classification number: G06N5/022
Abstract: Systems and methods for generating a machine-learning (ML) model for extracting information from one or more electronic documents, where the ML model can be used as a data object, which can be part of a database command or as part of a document information extraction process that is continuously running (e.g., document information extraction pipeline).
-
Publication No.: US20240211491A1
Publication Date: 2024-06-27
Application No.: US18599647
Filing Date: 2024-03-08
Applicant: Snowflake Inc.
Inventor: Vasile Paraschiv, Saurin Shah, Marianne Shaw, Nileema Shingte
IPC: G06F16/27, G06F9/30, G06F16/11, G06F16/13, G06F16/182, G06F16/22, G06F16/2455, G06F16/25, G06F16/28
CPC classification number: G06F16/278, G06F9/3009, G06F16/137, G06F16/182, G06F16/2282, G06F16/24554, G06F16/283, G06F16/116, G06F16/254
Abstract: A database export system exports data using a plurality of nodes that process the data to generate structured result files that are partitioned by an export parameter in an export request. The database export system distributes the data and merges the files to avoid small file creation and increase processing speed via parallelism. The database export system generates the result files of a specified maximum size in a final format, where the files are processed and merged in a temporary file format. The parallel processing is optimized and constrained per the number of processing nodes, available memory, requested final file sizes, and operation-based ordering to complete data exports in a scalable multi-stage approach.
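Illustrative sketch (not part of the patent record): the partition-then-cap idea in the abstract above can be shown on a single node. Rows are grouped by an export parameter, and each group is cut into result files no larger than a requested maximum. The function and parameter names are hypothetical, a row count stands in for a byte-size limit, and the multi-node distribution and temporary-format merge stages are omitted.

```python
from collections import defaultdict


def plan_export(rows, partition_key, max_rows_per_file):
    """Group rows by partition_key, then split each group into capped files.

    Returns a list of (partition_value, rows_in_file) tuples. A row count is
    used as a stand-in for the maximum file size (an assumption of this sketch).
    """
    if max_rows_per_file < 1:
        raise ValueError("max_rows_per_file must be at least 1")

    # Stage 1: collect every row of a partition together, which avoids
    # emitting many small files for the same partition value.
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)

    # Stage 2: cut each merged partition into files of the requested size.
    files = []
    for value in sorted(groups):
        part = groups[value]
        for start in range(0, len(part), max_rows_per_file):
            files.append((value, part[start:start + max_rows_per_file]))
    return files
```

For example, four rows with three in partition "eu" and one in "us", capped at two rows per file, yield two "eu" files and one "us" file.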
-
Publication No.: US11876802B2
Publication Date: 2024-01-16
Application No.: US18054621
Filing Date: 2022-11-11
Applicant: Snowflake Inc.
Inventor: Polita Paulus, Peter Povinec, Saurin Shah, Srinidhi Karthik Bisthavalli Srinivasa
CPC classification number: H04L63/0884, G06F16/254, H04L63/107, H04L63/126, H04L2463/081
Abstract: A command to load or unload data at a storage location is received. In response to the command, a storage integration object associated with the storage location is identified. The storage integration object identifies a cloud identity object that corresponds to a cloud identity that is associated with a proxy identity object corresponding to a proxy identity granted permission to access the storage location. The data is loaded or unloaded at the storage location by assuming the proxy identity.
-
Publication No.: US20230401329A1
Publication Date: 2023-12-14
Application No.: US17933761
Filing Date: 2022-09-20
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Polita Paulus, Saurin Shah, Srinidhi Karthik Bisthavalli Srinivasa
CPC classification number: G06F21/6218, G06F16/211
Abstract: Provided herein are systems and methods for sharing unstructured data in stages. For example, a method includes generating a share object at an account of a data provider. The share object identifies an account of a data consumer and at least one unstructured data file shared with the account of the data consumer. The share object is configured with access privileges to the at least one unstructured data file. A notification of the share object is communicated to the account of the data consumer.
-
Publication No.: US20230401229A1
Publication Date: 2023-12-14
Application No.: US18051657
Filing Date: 2022-11-01
Applicant: Snowflake Inc.
Inventor: Robert Bengt Benedikt Gernhardt, Chong Han, Nithin Mahesh, Aravind Ramarathinam, Saurin Shah, Yanrui Zhang
CPC classification number: G06F16/27, G06F16/256
Abstract: The distributed database can implement unstructured data replication using an internal or external storage location. Metadata, such as a directory table that lists the unstructured files, can be replicated across different deployments, followed by replication of the staged data. Replicating the staged data can be implemented by replication of only the stage metadata or replication of the database files between the deployments.
-
Publication No.: US11550845B2
Publication Date: 2023-01-10
Application No.: US17657548
Filing Date: 2022-03-31
Applicant: Snowflake Inc.
Inventor: Elliott Brossard, Sukruth Komarla Sukumar, Isaac Kunen, Ju-yi Kuo, Jonathan Lee Leang, Edward Ma, Schuyler James Manchester, Polita Paulus, Saurin Shah, Igor Zinkovsky
IPC: G06F16/00, G06F16/901, G06F16/955, G06F16/2455, G06F16/22, G06F16/908
Abstract: A file access system for user defined functions (UDFs) can be implemented on a distributed database system. The system can store UDF interfaces and file reference objects that can be called by other users. Upon a UDF being called, files on a stage, one or more interface objects (e.g., InputStream), and file reference objects can be implemented by execution nodes of the distributed database system. The execution nodes can implement multiple threads that are authenticated and can download file data from a staging location concurrently.
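Illustrative sketch (not part of the patent record): the concurrent download step mentioned in the abstract above could look roughly like this with a thread pool. The `fetch` callable is a hypothetical stand-in for an authenticated read from a staging location; no real stage API is used.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(paths, fetch, max_workers=4):
    """Download every path concurrently and return {path: data}.

    `fetch` is a caller-supplied function (hypothetical here) that takes one
    file path and returns its contents. Executor.map preserves input order,
    so results can be zipped back onto their paths.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(fetch, paths)))
```

A UDF body could then read the returned data for each referenced file without waiting on the downloads one at a time.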
-
Publication No.: US20220391408A1
Publication Date: 2022-12-08
Application No.: US17396576
Filing Date: 2021-08-06
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar, Polita Paulus, Sahaj Saini, Saurin Shah, Srinidhi Karthik Bisthavalli Srinivasa
IPC: G06F16/27, G06F16/25, G06F16/955
Abstract: The embodiments described herein provide means for replicating external stages between deployments of e.g., a cloud data lake using a modified storage integration. The modified storage integration may be defined with multiple storage locations that it can point to, as well as a designation of an active storage location. The storage integration may also be defined with base file paths for each storage location as well as a relative file path which together may serve to synchronize data loading operations between deployments when e.g., a fail-over occurs from one deployment to another. The storage integration may be replicated from a first deployment to a second deployment, and when database replication occurs, an external stage may be replicated to the second deployment and bound to the replicated storage integration. Thus, a fail-over to the second deployment may result in a seamless transition of data loading processes to the second deployment.
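Illustrative sketch (not part of the patent record): the abstract above describes a storage integration with several storage locations, one marked active, where a base file path per location is combined with a shared relative file path. A minimal model of that resolution and of a fail-over, with hypothetical names throughout, might be:

```python
from dataclasses import dataclass


@dataclass
class ReplicatedIntegration:
    # Hypothetical model: location name -> base file path for that location.
    base_paths: dict
    # Name of the currently active storage location.
    active: str

    def resolve(self, relative_path):
        """Join the active location's base path with a relative file path."""
        base = self.base_paths[self.active].rstrip("/")
        return f"{base}/{relative_path.lstrip('/')}"

    def fail_over(self, location):
        """Switch the active storage location to another configured one."""
        if location not in self.base_paths:
            raise KeyError(location)
        self.active = location
```

Because the relative path is shared, a load that was reading `2024/01/data.csv` under the first deployment's base path resolves to the same relative file under the second deployment's base path after `fail_over`.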
-
Publication No.: US11361026B2
Publication Date: 2022-06-14
Application No.: US17463325
Filing Date: 2021-08-31
Applicant: Snowflake Inc.
Inventor: Elliott Brossard, Sukruth Komarla Sukumar, Isaac Kunen, Ju-yi Kuo, Jonathan Lee Leang, Edward Ma, Schuyler James Manchester, Polita Paulus, Saurin Shah, Igor Zinkovsky
IPC: G06F16/00, G06F17/30, G06F16/901, G06F16/955, G06F16/2455, G06F16/22, G06F16/908
Abstract: A file access system for user defined functions (UDFs) can be implemented on a distributed database system. The system can store UDF interfaces and file reference objects that can be called by other users. Upon a UDF being called, files on a stage, one or more interface objects (e.g., InputStream), and file reference objects can be implemented by execution nodes of the distributed database system. The execution nodes can implement multiple threads that are authenticated and can download file data from a staging location concurrently.
-