SYSTEM AND METHOD FOR EFFICIENT TRANSFORMATION PREDICTION IN A DATA ANALYTICS PREDICTION MODEL PIPELINE

    Publication No.: US20230359941A1

    Publication Date: 2023-11-09

    Application No.: US17739716

    Filing Date: 2022-05-09

    IPC Classification: G06N20/20 G06Q20/40

    CPC Classification: G06N20/20 G06Q20/4016

    Abstract: A computer-implemented system, platform, programming product, and/or method for improving transformation selection in an ensemble machine learning (ML) model that includes: providing all base ML models of the ensemble ML model; identifying all of a plurality of Derived Fields in all the base ML models; performing a Derived Field run prediction analysis for all the Derived Fields; computing the Derived Field Importance Weight for Field (DFIW4F) and the Derived Field Importance Weight for Model (DFIW4M) for all the Derived Fields; clustering all the Derived Fields into a plurality of Derived Field clusters, wherein each Derived Field cluster is based upon the DFIW4M and the DFIW4F for the Derived Field; sorting all the Derived Field clusters by best cluster based upon DFIW4M and DFIW4F; and running the base ML models based upon the Derived Fields in the best Derived Field cluster until sufficient base ML models have been run.
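
The cluster-then-run flow in this abstract can be illustrated with a minimal sketch. The abstract does not say how DFIW4M and DFIW4F are computed, how the clustering is done, or what "sufficient" means, so everything below is an assumption: the weights are taken as given numbers, clustering is a simple grid bucketing with a hypothetical `bin_width`, clusters are ranked by mean combined weight, and a hypothetical `needed` count stands in for the sufficiency criterion.

```python
from collections import defaultdict

def cluster_fields(weights, bin_width=0.25):
    """Group derived fields whose (DFIW4M, DFIW4F) pair falls in the same grid cell.

    Grid bucketing is an illustrative stand-in for whatever clustering is really used.
    """
    clusters = defaultdict(list)
    for field, (dfiw4m, dfiw4f) in weights.items():
        cell = (int(dfiw4m // bin_width), int(dfiw4f // bin_width))
        clusters[cell].append(field)
    return list(clusters.values())

def rank_clusters(clusters, weights):
    """Sort clusters best-first by mean (DFIW4M + DFIW4F); the score is an assumption."""
    def score(cluster):
        return sum(weights[f][0] + weights[f][1] for f in cluster) / len(cluster)
    return sorted(clusters, key=score, reverse=True)

def select_models(model_fields, weights, needed):
    """Walk clusters best-first and pick base models that use a field in the cluster,
    stopping once `needed` models have been chosen."""
    chosen = []
    for cluster in rank_clusters(cluster_fields(weights), weights):
        for model, fields in model_fields.items():
            if model not in chosen and set(fields) & set(cluster):
                chosen.append(model)
                if len(chosen) >= needed:
                    return chosen
    return chosen

# Hypothetical weights as (DFIW4M, DFIW4F) and hypothetical model-to-field usage.
weights = {"a": (0.9, 0.8), "b": (0.85, 0.8), "c": (0.1, 0.2)}
model_fields = {"m1": ["c"], "m2": ["a"], "m3": ["b", "c"]}
order = select_models(model_fields, weights, needed=2)
```

With these made-up inputs, the models that depend on the highest-weight fields are scheduled first, and the model that only uses the low-weight field is skipped once the quota is met.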

    OPTIMIZED PREDICTION OF TREE ENSEMBLE

    Publication No.: US20230132789A1

    Publication Date: 2023-05-04

    Application No.: US17519156

    Filing Date: 2021-11-04

    IPC Classification: G06N20/20

    Abstract: Embodiments of the present disclosure relate to methods, systems, and computer program products for optimized prediction of a tree ensemble. According to a method, an input request is received, which indicates a plurality of input values for a plurality of variables associated with a tree ensemble. A plurality of target transformed intervals, into which the plurality of input values fall respectively, are determined by matching the plurality of input values with a plurality of sets of transformed intervals for the plurality of variables respectively. Respective prediction results for a plurality of tree models of the tree ensemble are determined based on the plurality of target transformed intervals and respective node hierarchies of the plurality of tree models. A tree ensemble prediction result is determined for the input request based on the determined prediction results of the plurality of tree models.
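
One possible reading of the interval-matching step is sketched below. It assumes that each variable has a sorted list of cut points, that an input value is mapped once to an interval index by binary search, and that every tree node then compares that index rather than the raw value. The names `CUTS` and `TREES`, the tuple layout of the nodes, and the use of a sum to combine tree outputs are all hypothetical; the abstract does not specify them.

```python
from bisect import bisect_right

# Hypothetical sorted cut points per variable.
# Index 0 covers values below the first cut; index i covers [cuts[i-1], cuts[i]).
CUTS = {"age": [30.0, 50.0], "income": [40000.0]}

# Hypothetical trees: a node is (variable, max_left_index, left, right); a leaf is a float.
TREES = [
    ("age", 0, 1.0, ("income", 0, 2.0, 3.0)),
    ("income", 0, 0.5, 1.5),
]

def to_intervals(inputs):
    """Map each raw input value to the index of the interval it falls into."""
    return {name: bisect_right(CUTS[name], value) for name, value in inputs.items()}

def predict_tree(node, idx):
    """Walk one tree's node hierarchy using interval indices only."""
    while isinstance(node, tuple):
        var, max_left, left, right = node
        node = left if idx[var] <= max_left else right
    return node

def predict_ensemble(inputs):
    """Match inputs to intervals once, then combine the per-tree results (sum assumed)."""
    idx = to_intervals(inputs)
    return sum(predict_tree(tree, idx) for tree in TREES)

result = predict_ensemble({"age": 45.0, "income": 50000.0})
```

The point of the sketch is that the value-to-interval lookup happens once per variable per request, so each tree traversal reduces to integer comparisons.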

    TARGET CLASS ANALYSIS HEURISTICS
    Document Type: Patent Application

    Publication No.: US20210081767A1

    Publication Date: 2021-03-18

    Application No.: US16574163

    Filing Date: 2019-09-18

    IPC Classification: G06N3/04 G06K9/62

    Abstract: A set of classifiable data containing a plurality of classes is ingested. A target class within the plurality of classes is determined. Using the set of classifiable data, an interactive recall rate chart is generated, and the interactive recall rate chart shows a set of target class recall rates against a set of class recall rates for the remainder of the plurality of classes. The interactive recall rate chart is presented to a user. A target class recall rate selection from the set of target class recall rates is received from the user. The set of classifiable data is reclassified, based on the target class recall rate selection.
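
The data behind such a recall rate chart, and the reclassification that follows a user's selection, might look like the sketch below. It simplifies the setting to one-vs-rest: each item has a hypothetical score for the target class, and the chart rows are produced by sweeping a decision threshold. The function names, the threshold sweep, and the rule of picking the highest threshold that still meets the selected recall are assumptions, not details taken from the abstract. The sketch also assumes the data contains at least one target and one non-target item, and that some threshold satisfies the selection.

```python
def recall_curve(scores, is_target, thresholds):
    """Return rows of (threshold, target_recall, other_recall) for the chart."""
    n_target = sum(is_target)
    n_other = len(is_target) - n_target
    rows = []
    for t in thresholds:
        pred = [s >= t for s in scores]
        # Target items correctly predicted as target.
        tp = sum(p and y for p, y in zip(pred, is_target))
        # Non-target items correctly left out of the target class.
        tn = sum((not p) and (not y) for p, y in zip(pred, is_target))
        rows.append((t, tp / n_target, tn / n_other))
    return rows

def reclassify(scores, rows, wanted_recall):
    """Relabel using the highest threshold whose target recall meets the selection."""
    ok = [t for t, target_recall, _ in rows if target_recall >= wanted_recall]
    threshold = max(ok)
    return [s >= threshold for s in scores]

# Hypothetical scores and ground-truth target flags.
scores = [0.9, 0.6, 0.4, 0.2]
is_target = [True, False, True, False]
rows = recall_curve(scores, is_target, thresholds=[0.3, 0.5, 0.8])
labels = reclassify(scores, rows, wanted_recall=1.0)
```

Each row exposes the trade-off the chart is meant to show: raising the target class recall in this toy data lowers the recall of the remaining items.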