-
Publication Number: US12045734B2
Publication Date: 2024-07-23
Application Number: US18117923
Application Date: 2023-03-06
Applicant: SAP SE
Inventor: Jacques Doan Huu
CPC classification number: G06N5/02, G06F16/2228, G06F16/2365
Abstract: A Gradient Boosting Decision Tree (GBDT) model successively stacks many decision trees, each of which tries to correct the residual errors left by the previous steps. The final score produced by the GBDT for an input vector is the sum of the individual scores obtained from its decision trees. Overfitting in a GBDT can be reduced by removing from the training data the input variables that have the least impact on the output. One way to identify the input variable with the lowest predictive value is to find the variable that is used for the first time in the latest decision tree of the GBDT. Identifying the low-predictive features to be removed in this way does not require regenerating the earlier trees to produce the new GBDT: since the removed feature was never used in the earlier trees, those trees already ignore it.
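As the abstract notes, the final GBDT score for an input vector is just the sum of the scores returned by the individual trees. A minimal sketch of that scoring step, using a hypothetical TreeNode structure assumed for illustration rather than any SAP implementation:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TreeNode:
    """One node of a decision tree: an internal split or a leaf score."""
    feature: Optional[int] = None       # index of the input variable used for the split
    threshold: float = 0.0              # split threshold (x[feature] <= threshold goes left)
    left: Optional["TreeNode"] = None   # left child
    right: Optional["TreeNode"] = None  # right child
    value: float = 0.0                  # leaf score, used when feature is None


def tree_score(node: TreeNode, x: Sequence[float]) -> float:
    """Walk one decision tree and return its score for the input vector x."""
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def gbdt_score(trees: list, x: Sequence[float]) -> float:
    """The GBDT prediction is the sum of the individual tree scores."""
    return sum(tree_score(tree, x) for tree in trees)


if __name__ == "__main__":
    # Two toy trees; the second corrects part of the residual left by the first.
    t1 = TreeNode(feature=0, threshold=5.0,
                  left=TreeNode(value=1.0), right=TreeNode(value=3.0))
    t2 = TreeNode(feature=1, threshold=0.5,
                  left=TreeNode(value=-0.2), right=TreeNode(value=0.4))
    print(gbdt_score([t1, t2], [4.0, 0.9]))   # 1.0 + 0.4 = 1.4
```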
-
Publication Number: US11521089B2
Publication Date: 2022-12-06
Application Number: US16204000
Application Date: 2018-11-29
Applicant: SAP SE
Inventor: Scott Kumar Cameron , Olivier Hamon , Gabriel Kevorkian , Eric Gouthiere , Jacques Doan Huu
Abstract: A predictive model pipeline data store may contain electronic records defining a predictive model pipeline composed of operation nodes. Based on the information in the data store, an execution framework platform may calculate a hash value for each operation node by including all recursive dependencies using ancestor node hash values and current node parameters. The platform may then compare each computed hash value with a previously computed hash value associated with a prior execution of a prior version of the pipeline. Operation nodes that have an unchanged hash value may be tagged “idle.” Operation nodes that have a changed hash value may be tagged “train and apply” or “apply” based on current node parameters (and an “apply” tag may propagate backwards through the pipeline to ancestor nodes). The platform may then ignore the operation nodes tagged “idle” when creating a physical execution plan to be provided to a target platform.
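A rough sketch of the hash-and-tag idea described above: each operation node's hash folds in its own parameters and, recursively, the hashes of its ancestors, so any upstream change invalidates everything downstream, while unchanged nodes are tagged "idle" and skipped in the physical plan. The node class, hashing scheme, and tag handling below are illustrative assumptions (the backward propagation of "apply" tags to ancestors is omitted), not the patented implementation:

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class OperationNode:
    name: str
    params: dict                                   # current node parameters
    parents: list = field(default_factory=list)    # ancestor OperationNodes
    trainable: bool = False                        # whether the node has a training step


def node_hash(node: OperationNode) -> str:
    """Hash the node's own parameters together with all ancestor hashes (recursively)."""
    payload = {
        "params": node.params,
        "ancestors": sorted(node_hash(p) for p in node.parents),
    }
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def tag_pipeline(nodes, previous_hashes):
    """Compare each node's hash with the one recorded for the prior pipeline run."""
    tags = {}
    for node in nodes:
        if previous_hashes.get(node.name) == node_hash(node):
            tags[node.name] = "idle"        # nothing changed: reuse the prior result
        else:
            tags[node.name] = "train and apply" if node.trainable else "apply"
    return tags


if __name__ == "__main__":
    src = OperationNode("read_sales", {"table": "SALES"})
    prep = OperationNode("impute", {"strategy": "mean"}, parents=[src])
    model = OperationNode("regression", {"lambda": 0.1}, parents=[prep], trainable=True)
    nodes = [src, prep, model]
    prior = {n.name: node_hash(n) for n in nodes}   # pretend this was stored last run
    prep.params["strategy"] = "median"              # change one upstream parameter
    print(tag_pipeline(nodes, prior))
    # read_sales stays 'idle'; impute and regression are re-executed downstream
```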
-
Publication Number: US20210334667A1
Publication Date: 2021-10-28
Application Number: US16858143
Application Date: 2020-04-24
Applicant: SAP SE
Inventor: Jacques Doan Huu
Abstract: A Gradient Boosting Decision Tree (GBDT) model successively stacks many decision trees, each of which tries to correct the residual errors left by the previous steps. The final score produced by the GBDT for an input vector is the sum of the individual scores obtained from its decision trees. Overfitting in a GBDT can be reduced by removing from the training data the input variables that have the least impact on the output. One way to identify the input variable with the lowest predictive value is to find the variable that is used for the first time in the latest decision tree of the GBDT. Identifying the low-predictive features to be removed in this way does not require regenerating the earlier trees to produce the new GBDT: since the removed feature was never used in the earlier trees, those trees already ignore it.
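The heuristic in this abstract, finding the variable that is used for the first time in the latest tree, can be illustrated by collecting the features split on by all earlier trees and subtracting them from those of the newest tree. The compact Node representation below is assumed purely for illustration:

```python
from collections import namedtuple

# Minimal tree node assumed for illustration: 'feature' is None for a leaf.
Node = namedtuple("Node", "feature threshold left right value",
                  defaults=(None, 0.0, None, None, 0.0))


def features_used(node):
    """Collect the indices of all input variables a single tree splits on."""
    if node is None or node.feature is None:
        return set()
    return {node.feature} | features_used(node.left) | features_used(node.right)


def first_time_features(trees):
    """Variables used by the latest tree but by none of the earlier trees.

    Under the abstract's heuristic these are the lowest-predictive-value
    candidates: boosting only reached for them once the stronger signals had
    already been exploited by the earlier trees.
    """
    earlier = set().union(*(features_used(tree) for tree in trees[:-1]))
    return features_used(trees[-1]) - earlier


if __name__ == "__main__":
    t1 = Node(feature=0, threshold=5.0, left=Node(value=1.0), right=Node(value=2.0))
    t2 = Node(feature=0, threshold=5.0, left=Node(value=-0.1),
              right=Node(feature=3, threshold=0.5, left=Node(value=0.0), right=Node(value=0.3)))
    print(first_time_features([t1, t2]))   # {3}: feature 3 first appears in the latest tree
```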
-
Publication Number: US11620537B2
Publication Date: 2023-04-04
Application Number: US16858143
Application Date: 2020-04-24
Applicant: SAP SE
Inventor: Jacques Doan Huu
Abstract: A Gradient Boosting Decision Tree (GBDT) model successively stacks many decision trees, each of which tries to correct the residual errors left by the previous steps. The final score produced by the GBDT for an input vector is the sum of the individual scores obtained from its decision trees. Overfitting in a GBDT can be reduced by removing from the training data the input variables that have the least impact on the output. One way to identify the input variable with the lowest predictive value is to find the variable that is used for the first time in the latest decision tree of the GBDT. Identifying the low-predictive features to be removed in this way does not require regenerating the earlier trees to produce the new GBDT: since the removed feature was never used in the earlier trees, those trees already ignore it.
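Because the flagged feature never appears in the earlier trees, only the latest boosting stage has to be redone. The schematic below makes that explicit; fit_tree and predict are hypothetical placeholders for whatever tree learner and ensemble scorer are in use, not functions from the patent:

```python
def refit_without_feature(trees, X, y, banned_feature, fit_tree, predict):
    """Rebuild a GBDT after flagging one low-predictive feature for removal.

    Earlier trees never split on 'banned_feature' (which is why it was flagged),
    so they are kept untouched; only the latest boosting stage is retrained on
    the residuals, restricted to the remaining features. 'fit_tree(X, residuals,
    allowed_features)' and 'predict(trees, row)' are placeholder callables.
    """
    kept = trees[:-1]                                    # earlier trees reused as-is
    residuals = [target - predict(kept, row) for row, target in zip(X, y)]
    allowed = [j for j in range(len(X[0])) if j != banned_feature]
    new_last = fit_tree(X, residuals, allowed)           # only the final tree is refit
    return kept + [new_last]
```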
-
Publication Number: US20230026391A1
Publication Date: 2023-01-26
Application Number: US17956120
Application Date: 2022-09-29
Applicant: SAP SE
Inventor: Jacques Doan Huu
IPC: G06N20/00
Abstract: Features are used to train one or more ML models in a modelling layer. In a feature selection layer, each generated ML model is analyzed to determine, for each input feature, the degree of importance of that feature to the results generated by the ML model. Features with low importance are identified, and this information is propagated backward to the data source and feature engineering layers. In response, those layers refrain from gathering or generating the unimportant features. Based on a confidence measure of the determination that each feature is important or unimportant, a number of periods between reevaluations of the feature importance is determined. After that number of periods has elapsed, a removed feature is restored to the pipeline.
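A toy version of the backward feedback loop: the feature selection layer flags features whose importance falls below a threshold, and the feature engineering layer consults that blocklist so the unimportant features are never computed or gathered. The importance scores, threshold, and layer interfaces are illustrative assumptions:

```python
import datetime


def flag_unimportant(importances, threshold=0.01):
    """Return the features whose importance (taken from the trained models) is negligible."""
    return {name for name, score in importances.items() if score < threshold}


class FeatureEngineeringLayer:
    """Skips gathering or generating any feature that downstream analysis has flagged."""

    def __init__(self, generators):
        self.generators = generators   # feature name -> function(raw_row) -> value
        self.blocked = set()           # filled in by backward propagation

    def block(self, features):
        self.blocked |= set(features)  # feedback arriving from the feature selection layer

    def build(self, raw_row):
        # Unimportant features are simply never computed.
        return {name: fn(raw_row) for name, fn in self.generators.items()
                if name not in self.blocked}


if __name__ == "__main__":
    layer = FeatureEngineeringLayer({
        "total": lambda r: r["qty"] * r["price"],
        "weekday": lambda r: r["date"].weekday(),
        "noise": lambda r: 42.0,
    })
    layer.block(flag_unimportant({"total": 0.61, "weekday": 0.38, "noise": 0.002}))
    print(layer.build({"qty": 3, "price": 9.5, "date": datetime.date(2020, 4, 24)}))
    # {'total': 28.5, 'weekday': 4} -- 'noise' is no longer generated
```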
-
Publication Number: US11561940B2
Publication Date: 2023-01-24
Application Number: US16947563
Application Date: 2020-08-06
Applicant: SAP SE
Inventor: Jacques Doan Huu
Abstract: Disclosed herein are system, method, and computer program product embodiments for generating a bridge between analytical models. In an embodiment, a server can extract a first variable dependency schema from a first model (e.g., a predictive model or business intelligence report) and a second variable dependency schema from a second model (e.g., a predictive model or business intelligence report). The first variable dependency schema includes a first definition of a relationship between a first variable and a second variable. The server can compare the first variable dependency schema and the second variable dependency schema. Furthermore, the server can generate a modification to be made to the second variable dependency schema based on the first definition of the relationship between the first and second variables, and output the modification to be made to the second variable dependency schema.
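The comparison step can be pictured as a diff over two sets of variable-to-variable relationships: dependencies that are missing from, or defined differently in, the second schema become proposed modifications. The dict-of-pairs schema representation below is assumed purely for illustration:

```python
def diff_dependency_schemas(first, second):
    """Propose modifications to the second schema based on definitions in the first.

    Each schema maps a (variable, depends_on) pair to a definition of their
    relationship; this representation is an assumption made for the sketch.
    """
    modifications = []
    for pair, definition in first.items():
        if pair not in second:
            modifications.append(("add", pair, definition))
        elif second[pair] != definition:
            modifications.append(("update", pair, definition))
    return modifications


if __name__ == "__main__":
    predictive_model = {
        ("revenue", "units_sold"): "revenue = units_sold * unit_price",
        ("churn_risk", "support_tickets"): "positively correlated",
    }
    bi_report = {("revenue", "units_sold"): "revenue = units_sold * list_price"}
    for change in diff_dependency_schemas(predictive_model, bi_report):
        print(change)
    # ('update', ...) for the revenue relationship, ('add', ...) for the churn dependency
```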
-
Publication Number: US20240202579A1
Publication Date: 2024-06-20
Application Number: US18083048
Application Date: 2022-12-16
Applicant: SAP SE
Inventor: Jacques Doan Huu
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: The present disclosure relates to computer-implemented methods, software, and systems for identifying data patterns based on data observations collected as time series data. A cross-validation assessment of a plurality of predictive models is performed, and based on that assessment a deviation risk is determined for each predictive model. The deviation risk is determined by comparing the forecasting variability distribution for a validation data set during the cross-validation assessment with the forecasting variability distribution for test values from a test data set, where the test data set represents forecasted values generated by the respective predictive model for a future horizon. A predictive model can be excluded based on evaluating the deviation risks of the predictive models. A candidate model is then selected from the remaining set of candidate predictive models based on an evaluation of their accuracy according to the cross-validation assessment.
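One way to read the deviation-risk idea: each candidate model yields a distribution of forecast variability across the cross-validation folds, and a model whose future-horizon forecasts vary very differently is excluded before the accuracy-based selection. The single-ratio risk measure and the threshold below are simplifying assumptions, not the disclosed method:

```python
from statistics import mean, stdev


def variability(series):
    """Spread of a forecast series (standard deviation used as a simple stand-in)."""
    return stdev(series) if len(series) > 1 else 0.0


def deviation_risk(validation_forecasts, future_forecast):
    """Ratio of future-horizon variability to the variability seen in cross-validation.

    'validation_forecasts' holds one forecast series per cross-validation fold;
    'future_forecast' is the series the same model produced for the future horizon.
    A ratio far from 1 means the future forecasts behave unlike anything observed
    during validation, so the model is at risk of deviating.
    """
    baseline = mean(variability(fold) for fold in validation_forecasts)
    return variability(future_forecast) / baseline if baseline else float("inf")


def select_model(candidates, max_risk=2.0):
    """Exclude high-risk candidates, then pick the most accurate remaining model."""
    kept = [c for c in candidates if c["risk"] <= max_risk] or candidates
    return min(kept, key=lambda c: c["validation_error"])


if __name__ == "__main__":
    folds = [[100, 104, 98, 101], [97, 103, 99, 102], [101, 99, 105, 100]]
    steady = deviation_risk(folds, [99, 103, 97, 102])    # similar spread: ratio close to 1
    erratic = deviation_risk(folds, [60, 150, 40, 170])   # exploding spread: large ratio
    print(round(steady, 2), round(erratic, 2))
    best = select_model([{"name": "stable_model", "risk": steady, "validation_error": 3.1},
                         {"name": "erratic_model", "risk": erratic, "validation_error": 2.4}])
    print(best["name"])   # the erratic model is excluded despite its lower validation error
```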
-
Publication Number: US11823073B2
Publication Date: 2023-11-21
Application Number: US16190518
Application Date: 2018-11-14
Applicant: SAP SE
Inventor: Jacques Doan Huu
Abstract: Provided are systems and methods for auto-completing debriefing processing for a machine learning model pipeline based on a type of predictive algorithm. In one example, the method may include one or more of: building a machine learning model pipeline via a user interface; detecting, via the user interface, a selection associated with a predictive algorithm included within the pipeline; in response to the selection, identifying debriefing components for the predictive algorithm based on its type from among a plurality of types of predictive algorithms; and automatically incorporating processing for the debriefing components within the pipeline such that values of the debriefing components are generated during training of the predictive algorithm within the machine learning model pipeline.
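The auto-completion amounts to a lookup from the predictive algorithm's type to the debriefing artifacts that should be computed while it trains, followed by wiring those computations into the pipeline. The type names and debrief components below are illustrative guesses, not SAP's actual catalog:

```python
# Hypothetical mapping from algorithm type to debriefing components; the names
# are illustrative guesses, not the components actually used by the patent.
DEBRIEF_COMPONENTS = {
    "classification": ["confusion_matrix", "roc_curve", "feature_importance"],
    "regression": ["residual_plot", "r_squared", "feature_importance"],
    "time_series": ["forecast_vs_actual", "horizon_error", "signal_decomposition"],
    "clustering": ["cluster_sizes", "silhouette_score", "centroid_profile"],
}


def autocomplete_debriefing(pipeline, selected_node):
    """Append the debrief steps matching the selected predictive node's algorithm type."""
    for component in DEBRIEF_COMPONENTS.get(selected_node["algorithm_type"], []):
        pipeline.append({"operation": "debrief",
                         "component": component,
                         "source": selected_node["name"]})
    return pipeline


if __name__ == "__main__":
    pipeline = [{"name": "prepare", "operation": "transform"},
                {"name": "auto_clf", "operation": "train",
                 "algorithm_type": "classification"}]
    for step in autocomplete_debriefing(pipeline, pipeline[1]):
        print(step)   # the three classification debrief steps are appended automatically
```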
-
Publication Number: US20230206083A1
Publication Date: 2023-06-29
Application Number: US18117923
Application Date: 2023-03-06
Applicant: SAP SE
Inventor: Jacques Doan Huu
CPC classification number: G06N5/02, G06F16/2228, G06F16/2365
Abstract: A Gradient Boosting Decision Tree (GBDT) model successively stacks many decision trees, each of which tries to correct the residual errors left by the previous steps. The final score produced by the GBDT for an input vector is the sum of the individual scores obtained from its decision trees. Overfitting in a GBDT can be reduced by removing from the training data the input variables that have the least impact on the output. One way to identify the input variable with the lowest predictive value is to find the variable that is used for the first time in the latest decision tree of the GBDT. Identifying the low-predictive features to be removed in this way does not require regenerating the earlier trees to produce the new GBDT: since the removed feature was never used in the earlier trees, those trees already ignore it.
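Extending the latest-tree check to the whole ensemble gives a simple ranking: for each feature, record the boosting round in which it first appears; features that only show up late are the low-predictive candidates under the abstract's heuristic. The per-round feature sets are assumed to come from a tree traversal such as the one sketched earlier:

```python
def first_appearance(feature_sets):
    """Map each feature to the boosting round in which it is first split on.

    'feature_sets' holds, per boosting round, the set of feature indices used by
    that round's tree. Features with a late first appearance are the candidates
    with the lowest predictive value under the abstract's heuristic.
    """
    seen = {}
    for round_index, used in enumerate(feature_sets):
        for feature in used:
            seen.setdefault(feature, round_index)
    return seen


if __name__ == "__main__":
    rounds = [{0, 2}, {0, 1}, {1, 2}, {4}]   # features used by each successive tree
    print(first_appearance(rounds))          # feature 4 only shows up in the last round
```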
-
Publication Number: US11494699B2
Publication Date: 2022-11-08
Application Number: US16868145
Application Date: 2020-05-06
Applicant: SAP SE
Inventor: Jacques Doan Huu
Abstract: Features are used to train one or more ML models in a modelling layer. In a feature selection layer, each generated ML model is analyzed to determine, for each input feature, the degree of importance of that feature to the results generated by the ML model. Features with low importance are identified, and this information is propagated backward to the data source and feature engineering layers. In response, those layers refrain from gathering or generating the unimportant features. Based on a confidence measure of the determination that each feature is important or unimportant, a number of periods between reevaluations of the feature importance is determined. After that number of periods has elapsed, a removed feature is restored to the pipeline.
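The reevaluation half of the abstract can be pictured as a per-feature countdown: the more confident the importance decision, the more periods pass before the removed feature is restored for re-scoring. The linear mapping from confidence to period length is an assumption made only for illustration:

```python
def reevaluation_period(confidence, min_periods=1, max_periods=12):
    """More confident importance decisions are revisited less often (linear mapping assumed)."""
    return round(min_periods + confidence * (max_periods - min_periods))


class RemovedFeatureTracker:
    """Tracks removed features and restores each one when its waiting period elapses."""

    def __init__(self):
        self.waiting = {}   # feature name -> periods left before reevaluation

    def remove(self, feature, confidence):
        self.waiting[feature] = reevaluation_period(confidence)

    def tick(self):
        """Advance one period; return the features now due to be restored to the pipeline."""
        due = []
        for feature in list(self.waiting):
            self.waiting[feature] -= 1
            if self.waiting[feature] <= 0:
                due.append(feature)
                del self.waiting[feature]
        return due


if __name__ == "__main__":
    tracker = RemovedFeatureTracker()
    tracker.remove("noise", confidence=0.95)     # confident call: long wait before re-checking
    tracker.remove("weekday", confidence=0.10)   # borderline call: re-checked after ~2 periods
    for period in range(1, 4):
        print(period, tracker.tick())            # 'weekday' is restored first, 'noise' much later
```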