-
Publication No.: US20220036246A1
Publication date: 2022-02-03
Application No.: US16942247
Filing date: 2020-07-29
Inventors: Bei Chen, Long Vu, Syed Yousaf Shah, Xuan-Hong Dang, Peter Daniel Kirchner, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Dhavalkumar C. Patel, Gregory Bramble, Horst Cornelius Samulowitz, Saket Sathe, Chuang Gan
IPC classification: G06N20/20
Abstract: Techniques regarding one or more automated machine learning processes that analyze time series data are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a time series analysis component that selects a machine learning pipeline for meta transfer learning on time series data by sequentially allocating subsets of training data from the time series data amongst a plurality of machine learning pipeline candidates.
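The abstract above describes selecting a pipeline by sequentially allocating subsets of training data among candidates. The following is a minimal sketch of that general idea, not the claimed method: it assumes a halving schedule (keep the better half of the candidates, double the allocation), three toy forecasters, and mean absolute error on a fixed holdout. All function names and parameters here are hypothetical.

```python
from statistics import mean

# Toy forecasting "pipelines" (assumptions for illustration only).
def last_value(train, horizon):
    return [train[-1]] * horizon

def overall_mean(train, horizon):
    return [mean(train)] * horizon

def drift(train, horizon):
    # Extrapolate the average slope; assumes at least two training points.
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]

def mae(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_pipeline(series, candidates, holdout=4, first_allocation=4):
    """Give candidates growing slices of recent data, dropping weak ones each round."""
    train, test = series[:-holdout], series[-holdout:]
    survivors = dict(candidates)
    size = first_allocation
    while True:
        size = min(size, len(train))
        subset = train[-size:]  # most recent `size` points
        scores = {name: mae(test, forecast(subset, holdout))
                  for name, forecast in survivors.items()}
        ranked = sorted(scores, key=scores.get)  # lowest error first
        if len(ranked) == 1 or size == len(train):
            return ranked[0], scores
        keep = max(1, len(ranked) // 2)          # assumed halving rule
        survivors = {name: survivors[name] for name in ranked[:keep]}
        size *= 2                                # assumed doubling of the allocation
```

Calling `select_pipeline(series, {"last_value": last_value, "overall_mean": overall_mean, "drift": drift})` returns the name of the surviving candidate and the scores from the final round.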
-
Publication No.: US20220343207A1
Publication date: 2022-10-27
Application No.: US17237379
Filing date: 2021-04-22
Inventors: Long Vu, Saket Sathe, Bei Chen, Peter Daniel Kirchner
Abstract: In a method for ranking machine learning (ML) pipelines for a dataset, a processor receives first performance curves predicted by a meta learner model for a plurality of ML pipelines. A processor allocates a first subset of data points from the dataset to each of the plurality of ML pipelines. A processor receives first performance scores for each of the ML pipelines for the first subset of data points. A processor updates the meta learner model using the first performance scores. A processor receives second performance curves from the meta learner model updated with the first performance scores. A processor ranks the plurality of ML pipelines based on the second performance curves.
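The steps in the abstract above (predict curves, allocate a first subset, observe scores, update the meta learner, predict again, rank) can be sketched as follows. This is an illustrative toy, not the patented model: the meta learner here is assumed to be a fixed prior curve per pipeline plus a learned offset, and `OffsetMetaLearner`, `rank_pipelines`, and `evaluate` are hypothetical names.

```python
class OffsetMetaLearner:
    """Toy meta learner: a prior performance curve per pipeline plus an offset."""

    def __init__(self, prior_curves):
        # prior_curves: {pipeline name: {number of data points: predicted score}}
        self.prior = prior_curves
        self.offset = {name: 0.0 for name in prior_curves}

    def predict_curves(self):
        return {name: {n: score + self.offset[name] for n, score in curve.items()}
                for name, curve in self.prior.items()}

    def update(self, n_points, observed_scores):
        # Shift each curve so it passes through the observed score at n_points.
        for name, score in observed_scores.items():
            self.offset[name] = score - self.prior[name][n_points]


def rank_pipelines(meta, evaluate, first_subset_size, full_size):
    """evaluate(name, n) is a hypothetical callback that trains and scores a pipeline."""
    first_curves = meta.predict_curves()
    observed = {name: evaluate(name, first_subset_size) for name in first_curves}
    meta.update(first_subset_size, observed)
    second_curves = meta.predict_curves()
    # Rank by the updated prediction at the full data size, best score first.
    return sorted(second_curves,
                  key=lambda name: second_curves[name][full_size],
                  reverse=True)
```

The point of the sketch is that a cheap evaluation on a small subset can reorder pipelines whose prior curves suggested a different ranking.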
-
Publication No.: US20230177387A1
Publication date: 2023-06-08
Application No.: US17643242
Filing date: 2021-12-08
Inventors: Saket Sathe, Long Vu, Peter Daniel Kirchner, Charu C. Aggarwal
IPC classification: G06N20/00
CPC classification: G06N20/00
Abstract: A method, system, and computer program product for a metalearner for automated machine learning are provided. The method receives a labeled data set. A set of data subsets is generated from the labeled data set. A set of unsupervised machine learning pipelines is generated. A training set is generated from the set of data subsets and the set of unsupervised machine learning pipelines. The method trains a metalearner for unsupervised tasks based on the training set.
-
Publication No.: US20220004914A1
Publication date: 2022-01-06
Application No.: US16919258
Filing date: 2020-07-02
Inventors: Peter Daniel Kirchner, Gregory Bramble, Horst Cornelius Samulowitz, Dakuo Wang, Arunima Chaudhary, Gregory Filla
Abstract: An embodiment of the invention may include a method, computer program product, and system for creating a data analysis tool. The method may include a computing device that generates an AI pipeline based on an input dataset, wherein the AI pipeline is generated using an Automated Machine Learning program. The method may include converting the AI pipeline to a non-native format of the Automated Machine Learning program. This may enable the AI pipeline to be used outside of the Automated Machine Learning program, thereby increasing the usefulness of the created program by not tying it to the Automated Machine Learning program. Additionally, this may increase the efficiency of running the AI pipeline by eliminating unnecessary computations performed by the Automated Machine Learning program.
-
Publication No.: US11966340B2
Publication date: 2024-04-23
Application No.: US17654965
Filing date: 2022-03-15
Inventors: Long Vu, Bei Chen, Xuan-Hong Dang, Peter Daniel Kirchner, Syed Yousaf Shah, Dhavalkumar C. Patel, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Gregory Bramble, Horst Cornelius Samulowitz, Saket K. Sathe, Wesley M. Gifford, Petros Zerfos
IPC classification: G06F12/0871, G06N20/00
CPC classification: G06F12/0871, G06N20/00, G06F2212/604
Abstract: To automate time series forecasting machine learning pipeline generation, a data allocation size of time series data may be determined based on one or more characteristics of a time series data set. The time series data may be allocated for use by candidate machine learning pipelines based on the data allocation size. Features for the time series data may be determined and cached by the candidate machine learning pipelines. Predictions of each of the candidate machine learning pipelines using at least the one or more features may be evaluated. A ranked list of machine learning pipelines may be automatically generated from the candidate machine learning pipelines for time series forecasting based upon evaluating predictions of each of the one or more candidate machine learning pipelines.
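As a rough illustration of the flow in the abstract above (choose an allocation size from data characteristics, allocate, cache features, evaluate, rank), here is a small sketch. The allocation heuristic, the lag-window features, the `FeatureCache` class, and the two toy pipelines in the usage note are all assumptions made for this example; the abstract does not specify any of them.

```python
from statistics import mean

def allocation_size(series, season_length):
    # Assumed heuristic: at least two seasonal cycles, or a tenth of the series.
    return min(len(series), max(2 * season_length, len(series) // 10))

class FeatureCache:
    """Computes a feature set once and reuses it across candidate pipelines."""

    def __init__(self):
        self.store = {}
        self.misses = 0

    def get(self, key, compute):
        if key not in self.store:
            self.misses += 1
            self.store[key] = compute()
        return self.store[key]

def lag_windows(series, n_lags):
    # One window of the previous n_lags values for each predictable point.
    return [tuple(series[i - n_lags:i]) for i in range(n_lags, len(series))]

def rank_pipelines(series, pipelines, season_length, n_lags=3):
    """pipelines maps a name to a function from a lag window to a one-step forecast."""
    size = allocation_size(series, season_length)
    data = series[-size:]          # allocate the most recent `size` points
    targets = data[n_lags:]
    cache = FeatureCache()
    errors = {}
    for name, predict in pipelines.items():
        windows = cache.get(("lags", size, n_lags),
                            lambda: lag_windows(data, n_lags))
        predictions = [predict(window) for window in windows]
        errors[name] = mean(abs(t - p) for t, p in zip(targets, predictions))
    return sorted(errors, key=errors.get), cache  # lowest error first
```

For example, `rank_pipelines(series, {"last_value": lambda w: w[-1], "window_mean": lambda w: mean(w)}, season_length=4)` returns the ranked names together with the cache, whose `misses` counter shows that the lag features were computed only once for both candidates.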
-
Publication No.: US11868230B2
Publication date: 2024-01-09
Application No.: US17692268
Filing date: 2022-03-11
CPC classification: G06F11/3452, G06F11/3428, G06N20/00
Abstract: Computer hardware and/or software that performs the following operations: (i) assessing a performance of a plurality of unsupervised machine learning pipelines against a plurality of data sets; (ii) associating the performance with meta-features corresponding to respective pipeline/data set combinations; (iii) training a supervised meta-learning model using the associated performance and meta-features as training data; and (iv) utilizing the trained model to identify one or more pipelines for processing an input data set.
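Operations (ii) to (iv) above can be sketched with a very small supervised meta-model. This example assumes that the performance scores from operation (i) are already available, uses mean and standard deviation as stand-in meta-features, and uses a 1-nearest-neighbour predictor as the meta-learning model. These choices and the pipeline names in the usage note are hypothetical; the abstract does not name specific meta-features or models.

```python
from math import dist
from statistics import mean, pstdev

def meta_features(values):
    # Assumed meta-features: mean and population standard deviation.
    return (mean(values), pstdev(values))

def train_meta_model(records):
    """records: iterable of (dataset values, pipeline name, measured performance)."""
    return [(meta_features(values), pipeline, score)
            for values, pipeline, score in records]

def predict_performance(model, values):
    """For each pipeline, reuse the score from the most similar training data set."""
    query = meta_features(values)
    nearest = {}       # pipeline -> distance to the closest training data set
    predictions = {}   # pipeline -> predicted performance
    for features, pipeline, score in model:
        d = dist(query, features)
        if pipeline not in nearest or d < nearest[pipeline]:
            nearest[pipeline] = d
            predictions[pipeline] = score
    return predictions

def recommend(model, values, top=1):
    predictions = predict_performance(model, values)
    return sorted(predictions, key=predictions.get, reverse=True)[:top]
```

Given training records for two pipelines on a tightly clustered data set and a widely spread one, `recommend(model, new_values)` returns whichever pipeline scored best on the training data set that most resembles the new input.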
-
Publication No.: US11620582B2
Publication date: 2023-04-04
Application No.: US16942247
Filing date: 2020-07-29
Inventors: Bei Chen, Long Vu, Syed Yousaf Shah, Xuan-Hong Dang, Peter Daniel Kirchner, Si Er Han, Ji Hui Yang, Jun Wang, Jing James Xu, Dakuo Wang, Dhavalkumar C. Patel, Gregory Bramble, Horst Cornelius Samulowitz, Saket Sathe, Chuang Gan
IPC classification: G06N20/20
Abstract: Techniques regarding one or more automated machine learning processes that analyze time series data are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a time series analysis component that selects a machine learning pipeline for meta transfer learning on time series data by sequentially allocating subsets of training data from the time series data amongst a plurality of machine learning pipeline candidates.
-
Publication No.: US20220358388A1
Publication date: 2022-11-10
Application No.: US17316103
Filing date: 2021-05-10
Inventors: Long Vu, Dharmashankar Subramanian, Peter Daniel Kirchner, Eliezer Segev Wasserkrug, Lan Ngoc Hoang, Alexander Zadorojniy
Abstract: Methods and systems for generating an environment include training transformer models from tabular data and relationship information about the training data. A directed acyclic graph is generated, that includes the transformer models as nodes. The directed acyclic graph is traversed to identify a subset of transformers that are combined in order. An environment is generated using the subset of transformers.
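The graph traversal in the abstract above can be illustrated with the standard-library `graphlib` module (Python 3.9 or later). In this sketch the "transformers" are hand-written functions standing in for trained models, the subset is assumed to be the ancestors of a chosen target node, and the "environment" is simply a function that applies the subset in dependency order. The names `ordered_subset` and `build_environment` are hypothetical.

```python
from graphlib import TopologicalSorter

def ordered_subset(dependencies, target):
    """dependencies maps each node to the set of nodes it depends on."""
    needed, stack = set(), [target]
    while stack:  # collect the target and all of its ancestors
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(dependencies.get(node, ()))
    order = TopologicalSorter(dependencies).static_order()  # dependencies first
    return [node for node in order if node in needed]

def build_environment(transformers, dependencies, target):
    """Return a step function that applies the needed transformers in order."""
    steps = [transformers[name] for name in ordered_subset(dependencies, target)]

    def step(state):
        for transform in steps:
            state = transform(state)
        return state

    return step
```

With dependencies such as `{"demand": {"price"}, "revenue": {"price", "demand"}}`, asking for `"revenue"` yields the order price, demand, revenue, and any node that revenue does not depend on is left out of the environment.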
-
Publication No.: US11861469B2
Publication date: 2024-01-02
Application No.: US16919258
Filing date: 2020-07-02
Inventors: Peter Daniel Kirchner, Gregory Bramble, Horst Cornelius Samulowitz, Dakuo Wang, Arunima Chaudhary, Gregory Filla
Abstract: An embodiment of the invention may include a method, computer program product, and system for creating a data analysis tool. The method may include a computing device that generates an AI pipeline based on an input dataset, wherein the AI pipeline is generated using an Automated Machine Learning program. The method may include converting the AI pipeline to a non-native format of the Automated Machine Learning program. This may enable the AI pipeline to be used outside of the Automated Machine Learning program, thereby increasing the usefulness of the created program by not tying it to the Automated Machine Learning program. Additionally, this may increase the efficiency of running the AI pipeline by eliminating unnecessary computations performed by the Automated Machine Learning program.
-
Publication No.: US20230289277A1
Publication date: 2023-09-14
Application No.: US17692268
Filing date: 2022-03-11
IPC classification: G06F11/34
CPC classification: G06F11/3452, G06F11/3428, G06N20/00
Abstract: Computer hardware and/or software that performs the following operations: (i) assessing a performance of a plurality of unsupervised machine learning pipelines against a plurality of data sets; (ii) associating the performance with meta-features corresponding to respective pipeline/data set combinations; (iii) training a supervised meta-learning model using the associated performance and meta-features as training data; and (iv) utilizing the trained model to identify one or more pipelines for processing an input data set.