Concurrent binning of machine learning data

    Publication number: US09672474B2

    Publication date: 2017-06-06

    Application number: US14489449

    Application date: 2014-09-17

    CPC classification number: G06N99/005

    Abstract: Variables of observation records to be used to generate a machine learning model are identified as candidates for quantile binning transformations. In accordance with a particular concurrent binning plan generated for a particular variable, a plurality of quantile binning transformations are applied to the particular variable, including a first transformation with a first bin count and a second transformation with a different bin count. The first and second transformations result in the inclusion of respective parameters or weights for binned features in a parameter vector of the model. In a post-training phase run of the model, at least one parameter corresponding to a binned feature is used to generate a prediction.
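    The concurrent binning idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names `quantile_bin` and `concurrent_binned_features`, the bin counts 4 and 10, the one-hot encoding, and the synthetic data are all assumptions chosen for illustration. The sketch applies two quantile binning transformations with different bin counts to the same variable, so that each resulting binned feature gets its own entry in the parameter vector.

```python
import numpy as np


def quantile_bin(values, bin_count):
    """Assign each value to one of bin_count quantile bins (0-based index)."""
    # Interior quantile levels, e.g. bin_count=4 -> 0.25, 0.50, 0.75.
    levels = np.linspace(0.0, 1.0, bin_count + 1)[1:-1]
    edges = np.quantile(values, levels)
    # side="right" places a value equal to an edge into the upper bin.
    return np.searchsorted(edges, values, side="right")


def concurrent_binned_features(values, bin_counts):
    """One-hot encode the same variable under several bin counts at once."""
    blocks = []
    for bin_count in bin_counts:
        bin_ids = quantile_bin(values, bin_count)
        # Row i of the identity matrix is the one-hot vector for bin i.
        blocks.append(np.eye(bin_count)[bin_ids])
    return np.hstack(blocks)


# Hypothetical numeric variable taken from 1000 observation records.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)

# Assumed binning plan: a 4-bin and a 10-bin transformation of the same variable.
features = concurrent_binned_features(x, bin_counts=(4, 10))

# One parameter (weight) per binned feature: 4 + 10 = 14 entries.
parameter_vector = np.zeros(features.shape[1])

# After training, a prediction for each record would use these parameters.
predictions = features @ parameter_vector
```

    Keeping both transformations lets training decide which granularity is useful, since bins that do not help prediction can simply end up with small weights.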

    Optimized training of linear machine learning models

    Publication number: US10318882B2

    Publication date: 2019-06-11

    Application number: US14484201

    Application date: 2014-09-11

    Abstract: An indication of a data source to be used to train a linear prediction model is obtained. The model is to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source. The parameter values are stored in a parameter vector. During a particular learning iteration of the training phase of the model, one or more features for which parameters are to be added to the parameter vector are identified. In response to a triggering condition, parameters for one or more features are removed from the parameter vector based on an analysis of relative contributions of the features represented in the parameter vector to the model's predictions. After the parameters are removed, at least one parameter is added to the parameter vector.
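    The pruning step in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the abstract does not say how relative contributions are measured or what the triggering condition is, so the sketch uses the absolute value of each weight as a stand-in for contribution and a size limit (`max_size`) as the trigger. The names `prune_parameters`, `add_parameter`, and `prune_count` are hypothetical.

```python
def prune_parameters(parameter_vector, prune_count):
    """Remove the prune_count entries with the smallest absolute weight."""
    # Assumption: |weight| approximates a feature's contribution to predictions.
    ranked = sorted(parameter_vector, key=lambda name: abs(parameter_vector[name]))
    removed = ranked[:prune_count]
    for name in removed:
        del parameter_vector[name]
    return removed


def add_parameter(parameter_vector, feature, weight, max_size, prune_count):
    """Add a parameter, pruning first if the vector has reached max_size."""
    removed = []
    # Assumed triggering condition: the vector is full and the feature is new.
    if feature not in parameter_vector and len(parameter_vector) >= max_size:
        removed = prune_parameters(parameter_vector, prune_count)
    parameter_vector[feature] = weight
    return removed


# Hypothetical parameter vector keyed by feature name.
params = {"f1": 0.9, "f2": -0.01, "f3": 0.4, "f4": 0.002, "f5": -1.3}

# A learning iteration identifies a new feature "f6" while the vector is full.
removed = add_parameter(params, "f6", 0.0, max_size=5, prune_count=2)
```

    Here the two smallest-magnitude entries, `f4` and `f2`, are removed before `f6` is added, which keeps the parameter vector within the assumed size limit.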
