-
1.
Publication No.: US11989657B2
Publication Date: 2024-05-21
Application No.: US17071285
Filing Date: 2020-10-15
Applicant: Oracle International Corporation
Inventor: Nikan Chavoshi, Anatoly Yakovlev, Hesam Fathi Moghadam, Venkatanathan Varadarajan, Sandeep Agrawal, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal
Abstract: Herein, a computer generates and evaluates many preprocessor configurations for a window preprocessor that transforms a training timeseries dataset for an ML model. With each preprocessor configuration, the window preprocessor is configured. The window preprocessor then converts the training timeseries dataset into a configuration-specific point-based dataset that is based on the preprocessor configuration. The ML model is trained based on the configuration-specific point-based dataset to calculate a score for the preprocessor configuration. Based on the scores of the many preprocessor configurations, an optimal preprocessor configuration is selected for finally configuring the window preprocessor, after which, the window preprocessor can optimally transform a new timeseries dataset such as in an offline or online production environment such as for real-time processing of a live streaming timeseries.
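Illustration (not part of the patent record): the loop described in the abstract (configure the window preprocessor, convert the timeseries into a point-based dataset, train and score, keep the best-scoring configuration) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the two configuration parameters (`window`, `stride`), the exhaustive grid, and the mean-plus-bias stand-in model are all hypothetical choices made for the example.

```python
from itertools import product


def window_transform(series, window, stride):
    """Hypothetical window preprocessor: turn a 1-D series into point-based rows.

    Each row is (features, target): `window` consecutive values, followed by
    the value that comes right after them.
    """
    rows = []
    for start in range(0, len(series) - window, stride):
        rows.append((series[start:start + window], series[start + window]))
    return rows


def score_config(series, window, stride, holdout=0.25):
    """Score one configuration by validation mean absolute error (lower is better).

    The "model" is a placeholder (window mean plus a learned bias), standing in
    for whatever ML model would actually be trained.
    """
    rows = window_transform(series, window, stride)
    split = int(len(rows) * (1 - holdout))
    train, valid = rows[:split], rows[split:]
    if not train or not valid:
        return float("inf")  # configuration leaves too little data to score
    bias = sum(y - sum(x) / len(x) for x, y in train) / len(train)
    errors = [abs(y - (sum(x) / len(x) + bias)) for x, y in valid]
    return sum(errors) / len(errors)


def select_best_config(series, windows, strides):
    """Evaluate every (window, stride) pair and return the best one with all scores."""
    scored = {(w, s): score_config(series, w, s) for w, s in product(windows, strides)}
    best = min(scored, key=scored.get)
    return best, scored


if __name__ == "__main__":
    demo_series = [float(i % 7) for i in range(200)]  # synthetic data for the demo
    best, scored = select_best_config(demo_series, windows=[3, 7, 14], strides=[1, 2])
    print("best (window, stride):", best)
```

The selected pair would then be used to configure the preprocessor once more before transforming new timeseries data; a real search would likely cover more parameters and use a smarter strategy than a full grid.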
-
2.
Publication No.: US20210390466A1
Publication Date: 2021-12-16
Application No.: US17086204
Filing Date: 2020-10-30
Applicant: Oracle International Corporation
Inventor: Venkatanathan Varadarajan, Sandeep R. Agrawal, Hesam Fathi Moghadam, Anatoly Yakovlev, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal, Sam Idicula, Nikan Chavoshi
Abstract: A proxy-based automatic non-iterative machine learning (PANI-ML) pipeline is described, which predicts machine learning model configuration performance and outputs an automatically-configured machine learning model for a target training dataset. Techniques described herein use one or more proxy models—which implement a variety of machine learning algorithms and are pre-configured with tuned hyperparameters—to estimate relative performance of machine learning model configuration parameters at various stages of the PANI-ML pipeline. The PANI-ML pipeline implements a radically new approach of rapidly narrowing the search space for machine learning model configuration parameters by performing algorithm selection followed by algorithm-specific adaptive data reduction (i.e., row- and/or feature-wise dataset sampling), and then hyperparameter tuning. Furthermore, because of the one-pass nature of the PANI-ML pipeline and because each stage of the pipeline has convergence criteria by design, the whole PANI-ML pipeline has a novel convergence property that stops the configuration search after one pass.
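Illustration (not part of the patent record): the stage ordering named in the abstract (algorithm selection with pre-tuned proxy configurations, then data reduction, then hyperparameter tuning, all in a single pass) might be sketched as below. Everything here is an assumption made for the example: the `evaluate` callback, the row-prefix sampling, the tolerance rule, and the grid search are hypothetical stand-ins, and the sketch does not reproduce the convergence criteria or the feature-wise sampling of the actual PANI-ML pipeline.

```python
def one_pass_pipeline(data, proxies, evaluate, fractions, grids, tolerance=0.01):
    """Run three stages once, in order, and return (algorithm, rows_kept, params).

    proxies:   {algorithm_name: pre-tuned proxy hyperparameters}
    evaluate:  hypothetical callback (algorithm_name, params, rows) -> score,
               where a higher score is better
    fractions: candidate row-sampling fractions for the data-reduction stage
    grids:     {algorithm_name: list of candidate hyperparameter dicts}
    """
    # Stage 1: algorithm selection, ranking algorithms by their proxy's score.
    algo = max(proxies, key=lambda name: evaluate(name, proxies[name], data))

    # Stage 2: data reduction for the chosen algorithm only. Keep the smallest
    # row sample whose proxy score stays within `tolerance` of the full-data score.
    full_score = evaluate(algo, proxies[algo], data)
    sample = data
    for frac in sorted(fractions):
        subset = data[: max(1, int(len(data) * frac))]
        if evaluate(algo, proxies[algo], subset) >= full_score - tolerance:
            sample = subset
            break

    # Stage 3: hyperparameter tuning on the reduced data, for that algorithm only.
    best_params = max(grids[algo], key=lambda params: evaluate(algo, params, sample))
    return algo, len(sample), best_params
```

Because each stage commits to its result before the next one starts, the search space shrinks at every step and nothing is revisited, which is the one-pass property the abstract emphasizes.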
-
3.
Publication No.: US20220121955A1
Publication Date: 2022-04-21
Application No.: US17071285
Filing Date: 2020-10-15
Applicant: Oracle International Corporation
Inventor: Nikan Chavoshi, Anatoly Yakovlev, Hesam Fathi Moghadam, Venkatanathan Varadarajan, Sandeep Agrawal, Ali Moharrer, Jingxiao Cai, Sanjay Jinturkar, Nipun Agarwal
Abstract: Herein, a computer generates and evaluates many preprocessor configurations for a window preprocessor that transforms a training timeseries dataset for an ML model. With each preprocessor configuration, the window preprocessor is configured. The window preprocessor then converts the training timeseries dataset into a configuration-specific point-based dataset that is based on the preprocessor configuration. The ML model is trained based on the configuration-specific point-based dataset to calculate a score for the preprocessor configuration. Based on the scores of the many preprocessor configurations, an optimal preprocessor configuration is selected for finally configuring the window preprocessor, after which, the window preprocessor can optimally transform a new timeseries dataset such as in an offline or online production environment such as for real-time processing of a live streaming timeseries.
-
-