-
Publication number: US11989630B2
Publication date: 2024-05-21
Application number: US18162697
Application date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
CPC classification number: G06N20/00, G06F16/256, G06F16/283, G06F18/214, G06F21/6227
Abstract: A system for providing access to a database management system (DBMS) to a first user of a cloud data platform, the DBMS being generated by a second user. A machine learning model for training on a training dataset is included in the DBMS. The training dataset includes a first training dataset that is encrypted in the DBMS and a second training dataset that includes non-overlapping features with the first training dataset. A request, from the second user, to train the machine learning model on the first and second training datasets is identified. A trained machine learning model is generated by training the machine learning model on a joined dataset according to the request. One or more outputs from the trained machine learning model are generated by applying the trained machine learning model on new data.
-
Publication number: US11775544B2
Publication date: 2023-10-03
Application number: US18162522
Application date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Simon A. Field, Stuart Ozer
IPC: G06F16/22, G06F16/2455, G06F16/25, G06F16/84
CPC classification number: G06F16/25, G06F16/2282, G06F16/24558, G06F16/86
Abstract: The subject technology receives, by a database system, raw input data from a source table provided by an external environment, the source table comprising multiple rows and multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the external environment comprising an external system from the database system and is accessed by different users. The subject technology generates cell data for a second table based on the values from the source table. The subject technology performs a database operation to generate the second table including table metadata, column metadata, and the generated cell data.
-
Publication number: US20230186160A1
Publication date: 2023-06-15
Application number: US18055248
Application date: 2022-11-14
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
IPC: G06N20/00, G06F21/62, G06F16/25, G06F16/28, G06F18/214
CPC classification number: G06N20/00, G06F21/6227, G06F16/256, G06F16/283, G06F18/214
Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for sharing, on a distributed database, a database application to a first user of the distributed database, the database application generated by a second user of the distributed database. The training dataset includes a first database training dataset from the first user of the distributed database and a second database training dataset from the second user of the distributed database, the first database training dataset and the second database training dataset including non-overlapping dataset features. The database application further identifies a query from the second user to train the machine learning model on the training dataset and generates a trained machine learning model by training the machine learning model on a joined dataset according to the query. The database application generates outputs from the trained machine learning model by applying the trained machine learning model on new data.
-
Publication number: US20220292213A1
Publication date: 2022-09-15
Application number: US17644732
Application date: 2021-12-16
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
Abstract: A secure machine learning system of a database system can be implemented to use secure shared data to train a machine learning model. To manage the model, a first user of the database can share data in an encrypted view with a second user of the database, and further share one or more functions of an application that accesses the data while the data is encrypted. The second user can access functions of the application and can call the functions to generate a trained machine learning model and further generate machine learning outputs (e.g., predictions) from the trained model.
-
Publication number: US11216580B1
Publication date: 2022-01-04
Application number: US17232859
Application date: 2021-04-16
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
Abstract: A secure machine learning system of a database system can be implemented to use secure shared data to train a machine learning model. To manage the model, a first user of the database can share data in an encrypted view with a second user of the database, and further share one or more functions of an application that accesses the data while the data is encrypted. The second user can access functions of the application and can call the functions to generate a trained machine learning model and further generate machine learning outputs (e.g., predictions) from the trained model.
-
Publication number: US12204553B2
Publication date: 2025-01-21
Application number: US18458425
Application date: 2023-08-30
Applicant: Snowflake Inc.
Inventor: Simon A. Field, Stuart Ozer
IPC: G06F16/22, G06F16/2455, G06F16/25, G06F16/84
Abstract: The subject technology generates, by a database system, cell data for a particular table based on values from a source table, the values being based on raw input data, the source table comprising multiple rows and multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the source table being provided by an external environment, the external environment comprising an external system from the database system. The subject technology performs a database operation to generate the particular table including table metadata, column metadata, and the generated cell data, the generated particular table comprising a second format that causes more efficient processing of data by the database system using a single query on the particular table compared to processing the raw input data from the source table.
-
Publication number: US20230177063A1
Publication date: 2023-06-08
Application number: US18162522
Application date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Simon A. Field, Stuart Ozer
IPC: G06F16/25, G06F16/2455, G06F16/84, G06F16/22
CPC classification number: G06F16/25, G06F16/24558, G06F16/86, G06F16/2282
Abstract: The subject technology receives, by a database system, raw input data from a source table provided by an external environment, the source table comprising multiple rows and multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the external environment comprising an external system from the database system and is accessed by different users. The subject technology generates cell data for a second table based on the values from the source table. The subject technology performs a database operation to generate the second table including table metadata, column metadata, and the generated cell data.
-
Publication number: US20230169407A1
Publication date: 2023-06-01
Application number: US18162697
Application date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
IPC: G06N20/00, G06F21/62, G06F16/25, G06F16/28, G06F18/214
CPC classification number: G06N20/00, G06F21/6227, G06F16/256, G06F16/283, G06F18/214
Abstract: A system for providing access to a database management system (DBMS) to a first user of a cloud data platform, the DBMS being generated by a second user. A machine learning model for training on a training dataset is included in the DBMS. The training dataset includes a first training dataset that is encrypted in the DBMS and a second training dataset that includes non-overlapping features with the first training dataset. A request, from the second user, to train the machine learning model on the first and second training datasets is identified. A trained machine learning model is generated by training the machine learning model on a joined dataset according to the request. One or more outputs from the trained machine learning model are generated by applying the trained machine learning model on new data.
-
Publication number: US11609927B2
Publication date: 2023-03-21
Application number: US17899160
Application date: 2022-08-30
Applicant: Snowflake Inc.
Inventor: Simon A. Field, Stuart Ozer
IPC: G06F16/25, G06F16/84, G06F16/2455, G06F16/22
Abstract: The subject technology receives, by a database system, raw input data from a source table provided by a machine learning development environment, the source table comprising multiple rows where each row includes multiple columns, the raw input data comprising values in a first format, the values comprising input features corresponding to datasets included in the raw input data for machine learning models, the machine learning development environment comprising an external system from the database system and is accessed by a plurality of different users that are external to the database system. The subject technology generates cell data for a feature store table based at least in part on the values from the source table. The subject technology performs at least one database operation to generate the feature store table including at least table metadata, column metadata, and the generated cell data.
-
Publication number: US11893462B2
Publication date: 2024-02-06
Application number: US18055248
Application date: 2022-11-14
Applicant: Snowflake Inc.
Inventor: Monica J. Holboke, Justin Langseth, Stuart Ozer, William L. Stratton, Jr.
CPC classification number: G06N20/00, G06F16/256, G06F16/283, G06F18/214, G06F21/6227
Abstract: Disclosed are systems, methods, and non-transitory computer-readable media for sharing, on a distributed database, a database application to a first user of the distributed database, the database application generated by a second user of the distributed database. The training dataset includes a first database training dataset from the first user of the distributed database and a second database training dataset from the second user of the distributed database, the first database training dataset and the second database training dataset including non-overlapping dataset features. The database application further identifies a query from the second user to train the machine learning model on the training dataset and generates a trained machine learning model by training the machine learning model on a joined dataset according to the query. The database application generates outputs from the trained machine learning model by applying the trained machine learning model on new data.
-