FEATURE CATALOG ENHANCEMENT THROUGH AUTOMATED FEATURE CORRELATION

    Publication No.: US20210357803A1

    Publication Date: 2021-11-18

    Application No.: US16876278

    Filing Date: 2020-05-18

    IPC Classification: G06N20/00 G06N5/04 G06N5/02

    Abstract: Embodiments relate to a system, program product, and method for generating an enhanced feature catalog for a predictive model. The embodiments disclosed herein include capturing predictive model design time information including training data lineage metadata to determine the features of the training data, model design time measurements, and model design time metadata. Once the predictive model is built, the training data lineage metadata is used to capture the features that will be maintained within a feature catalog. The model design time measurements and model design time metadata provide further correlation between the predictive model and the features. Runtime metrics on the predictive model create additional correlations between the captured data and metadata with the features in the feature catalog to expeditiously identify the relevant features of the predictive model.

    MACHINE LEARNING MODEL ACCURACY FAIRNESS

    Publication No.: US20210287131A1

    Publication Date: 2021-09-16

    Application No.: US16814603

    Filing Date: 2020-03-10

    IPC Classification: G06N20/00 G06N5/04

    Abstract: A system includes a memory having instructions therein and at least one processor in communication with the memory. The at least one processor is configured to execute the instructions to run a machine learning base model on input data to generate base model prediction data and run a machine learning error prediction model on the input data to generate error prediction data. The at least one processor is configured to execute the instructions to generate predicted correct base model prediction data based on the base model prediction data and the error prediction data. The at least one processor is configured to execute the instructions to generate confusion values data based on the base model prediction data and the predicted correct base model prediction data. The at least one processor is also configured to execute the instructions to generate base model accuracy fairness metrics data based on the confusion values data.
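
The data flow in this abstract can be illustrated with a minimal Python sketch. It assumes binary labels, an error prediction of 1 meaning "the base model is predicted to be wrong here", and per-group accuracy as the fairness metric. The function names, the group split, and the choice of metric are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter


def predicted_correct_labels(base_preds, error_preds):
    """Flip each base prediction where an error is predicted (assumption: binary labels)."""
    return [p ^ e for p, e in zip(base_preds, error_preds)]


def confusion_values(base_preds, reference_labels):
    """Count TP/FP/TN/FN, treating the predicted-correct labels as the reference."""
    counts = Counter()
    for p, r in zip(base_preds, reference_labels):
        if p == 1 and r == 1:
            counts["tp"] += 1
        elif p == 1 and r == 0:
            counts["fp"] += 1
        elif p == 0 and r == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy_by_group(base_preds, error_preds, groups):
    """Estimated accuracy per group (hypothetical fairness metric for this sketch)."""
    reference = predicted_correct_labels(base_preds, error_preds)
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        cv = confusion_values([base_preds[i] for i in idx],
                              [reference[i] for i in idx])
        result[g] = (cv["tp"] + cv["tn"]) / sum(cv.values())
    return result
```

Comparing the per-group values (for example, their difference or ratio) gives one possible accuracy fairness signal without requiring ground-truth labels at runtime.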

    EXPLAINING ACCURACY DRIFT IN PRODUCTION DATA

    Publication No.: US20210279607A1

    Publication Date: 2021-09-09

    Application No.: US16813512

    Filing Date: 2020-03-09

    Abstract: A computer-implemented method according to one embodiment includes identifying an occurrence of accuracy drift by a trained model; identifying data associated with the accuracy drift, utilizing a drift detection model (DDM) constructed for the trained model; applying the data associated with the accuracy drift to a decision tree to determine a feature space and specific subset of the data causing the accuracy drift; analyzing a distribution of features within the feature space for the specific subset of the data causing the accuracy drift to determine specific features of the data causing the accuracy drift; and returning the specific features of the data causing the accuracy drift.
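
The decision tree step can be illustrated with a simplified Python sketch. Instead of a full tree, it fits a single split (a depth-1 stump) by Gini impurity to find the feature and threshold that best isolate the drifting records. The drift flags are assumed to come from a DDM that marks each record as drifting (1) or not (0); the function names and data layout are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, drift_flags):
    """Return (feature_index, threshold, weighted_gini) of the best single split.

    rows: list of numeric feature tuples.
    drift_flags: 1 if the record is flagged as drifting (assumed DDM output), else 0.
    """
    n = len(rows)
    best = None
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [d for r, d in zip(rows, drift_flags) if r[f] <= t]
            right = [d for r, d in zip(rows, drift_flags) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best
```

The returned feature index and threshold describe the region of feature space where drifting records concentrate; the feature distributions inside that region could then be compared against the training data.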

    AUTOMATED FEEDBACK-BASED APPLICATION OPTIMIZATION

    Publication No.: US20210004311A1

    Publication Date: 2021-01-07

    Application No.: US16460182

    Filing Date: 2019-07-02

    IPC Classification: G06F11/36 G06N20/00

    Abstract: Approaches presented herein enable optimization of a developing application to a user base. More specifically, application-centric data is gathered during a cultivation phase of the developing application. Substantially concurrently with the cultivation phase of the developing application, the application-centric data is analyzed according to static code of the developing application, a testing of the developing application, or a user experience (UX) design of the developing application. A machine learning model is applied to the analyzed application-centric data. This machine learning model is trained on historic application feedback data from applications available to the user base. Based on the machine learning model, a recommended change to optimize the developing application to the user base is generated.

    CORRECTING A CLASSIFICATION MODEL
    Document Type: Patent application publication

    Publication No.: US20240249187A1

    Publication Date: 2024-07-25

    Application No.: US18158190

    Filing Date: 2023-01-23

    IPC Classification: G06N20/00

    CPC Classification: G06N20/00

    Abstract: Provided are techniques for correcting a classification model. For each original record of a plurality of original records that are processed by a classification model: the original record is perturbed; for the original record, an original confidence value is obtained for each class of a plurality of classes; for the perturbed record, a perturbed confidence value is obtained for each class of the plurality of classes; a final confidence value is determined using each original confidence value, each perturbed confidence value, and a direction of distance travelled; and a determination is made of whether the original record is biased based on the final confidence value. Then, it is determined whether the classification model is biased based on the original records that are determined to be biased. In response to determining that the classification model is biased, the classification model is corrected; otherwise, the classification model is deployed.
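
The abstract does not give the formula for the final confidence value, so the Python sketch below is only one possible reading. It assumes the final value is the signed change in confidence for the originally predicted class (direction times distance), that a record counts as biased when that value drops by at least a threshold, and that the model counts as biased when the share of biased records exceeds a second threshold. All names and threshold values are hypothetical.

```python
def final_confidence(original, perturbed):
    """Signed confidence shift for the originally predicted class.

    original, perturbed: dicts mapping class name -> confidence value.
    This simplification looks only at the top original class (an assumption).
    """
    top = max(original, key=original.get)
    distance = abs(perturbed[top] - original[top])
    direction = -1 if perturbed[top] < original[top] else 1
    return direction * distance


def record_is_biased(original, perturbed, threshold=0.2):
    """Flag a record when perturbation lowers confidence by at least `threshold`."""
    return final_confidence(original, perturbed) <= -threshold


def model_is_biased(pairs, record_threshold=0.2, model_threshold=0.1):
    """Flag the model when the fraction of biased records exceeds `model_threshold`."""
    flagged = sum(record_is_biased(o, p, record_threshold) for o, p in pairs)
    return flagged / len(pairs) > model_threshold
```

In this reading, a model flagged by `model_is_biased` would be sent for correction, and an unflagged model would proceed to deployment.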

    Batch scoring model fairness
    Document Type: Granted patent

    Publication No.: US12014287B2

    Publication Date: 2024-06-18

    Application No.: US17111757

    Filing Date: 2020-12-04

    IPC Classification: G06N5/04 G06F16/23 G06N20/00

    Abstract: A system and related method score a fairness of an outcome model. The method comprises receiving a set of original transaction records (OTRs), and selecting an OTR subset of the OTRs according to subset selection criteria in order to reduce the number of OTRs to send to the outcome model. For each OTR in the subset, a perturbed transaction record (PTR) is created based on the OTR that includes changing at least one attribute in the PTR from the OTR, sending the OTR and the PTR to the outcome model, receiving an OTR outcome and a PTR outcome from the outcome model, and determining a record bias score for the OTR outcome and the PTR outcome respectively that indicates bias in the respective outcome. The OTR and the PTR bias scores are stored in a bias determination system (BDS) database.
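
The batch scoring loop can be sketched in Python as follows. The sketch assumes random sampling as the subset selection criterion, records represented as dicts, an outcome model that is a callable returning a numeric outcome, and a single bias score per OTR/PTR pair equal to the absolute difference of the two outcomes. These are simplifications; the patent describes a score for each outcome and does not specify the selection criteria. A plain list stands in for the BDS database.

```python
import random


def select_subset(otrs, fraction, seed=0):
    """Pick a random subset of OTRs (assumed selection criterion for this sketch)."""
    rng = random.Random(seed)
    k = max(1, int(len(otrs) * fraction))
    return rng.sample(otrs, k)


def perturb(otr, attribute, alternatives):
    """Copy the OTR and change one attribute to a different allowed value."""
    ptr = dict(otr)
    ptr[attribute] = next(v for v in alternatives if v != otr[attribute])
    return ptr


def score_batch(otrs, outcome_model, attribute, alternatives, fraction=0.5):
    """Score a sampled batch and return the stored results (stand-in for the BDS database)."""
    store = []
    for otr in select_subset(otrs, fraction):
        ptr = perturb(otr, attribute, alternatives)
        otr_outcome = outcome_model(otr)
        ptr_outcome = outcome_model(ptr)
        # Hypothetical bias score: how much the outcome moves when only the
        # perturbed attribute changes.
        bias_score = abs(otr_outcome - ptr_outcome)
        store.append({"otr": otr, "ptr": ptr, "bias_score": bias_score})
    return store
```

Sampling before scoring is what keeps the number of calls to the outcome model low: each selected OTR costs two calls, one for the original record and one for its perturbed copy.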