Confidence score generation for boosting-based tree machine learning models

    Publication number: US11816550B1

    Publication date: 2023-11-14

    Application number: US16933215

    Filing date: 2020-07-20

    CPC classification number: G06N20/20 G06N5/01 G06N5/04

    Abstract: Devices and techniques are generally described for generating confidence scores for boosting-based tree machine learning models. In various examples, a first record comprising a plurality of input variables may be received. In another example, a boosting-based tree machine learning model may generate, for the first record, a base model score. In various examples, the base model score may be generated based on the plurality of input variables and the base model score may represent a likelihood that the first record is associated with a first class. In some examples, a score confidence machine learning model may generate a confidence score for the first record. The confidence score may indicate a confidence in the base model score. In various examples, the first record may be processed based at least in part on the confidence score.
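The flow in this abstract (base model score, then a separate confidence score, then confidence-aware processing) can be sketched in a few lines. The abstract does not say how the score confidence model is built, so everything concrete here is an assumption for illustration: the three hand-written stumps standing in for a boosted tree ensemble, the feature names, the "fraction of trees agreeing with the ensemble" confidence heuristic, and the routing thresholds.

```python
import math
from typing import Dict, List

Record = Dict[str, float]

# Hypothetical stand-in for a boosting-based tree model: each "tree" is a
# depth-1 stump (feature, threshold, value_if_below, value_otherwise) that
# adds to a log-odds margin. A real model would be learned, not hand-written.
STUMPS = [
    ("amount", 100.0, -0.8, 0.9),
    ("age_days", 30.0, 0.6, -0.5),
    ("num_items", 3.0, -0.3, 0.4),
]


def stump_outputs(record: Record) -> List[float]:
    """Per-tree contributions to the margin for one record."""
    return [below if record[feat] < thr else other
            for feat, thr, below, other in STUMPS]


def base_model_score(record: Record) -> float:
    """Likelihood that the record belongs to the first class (sigmoid of the margin)."""
    margin = sum(stump_outputs(record))
    return 1.0 / (1.0 + math.exp(-margin))


def confidence_score(record: Record) -> float:
    """Assumed heuristic, not the patented method: the fraction of trees
    whose sign agrees with the sign of the overall margin."""
    outputs = stump_outputs(record)
    margin_positive = sum(outputs) > 0
    agreeing = sum(1 for o in outputs if (o > 0) == margin_positive)
    return agreeing / len(outputs)


def process_record(record: Record,
                   score_threshold: float = 0.5,
                   min_confidence: float = 0.75) -> str:
    """Route the record using both the base score and its confidence."""
    if confidence_score(record) < min_confidence:
        return "manual_review"  # low confidence: do not act on the score alone
    if base_model_score(record) >= score_threshold:
        return "first_class"
    return "other_class"
```

Keeping the confidence model separate from the base model, as the abstract describes, means the routing rule can change (for example, a stricter `min_confidence`) without retraining the scorer.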

NORMALIZING TEXT ATTRIBUTES FOR MACHINE LEARNING MODELS

    Publication number: US20200065710A1

    Publication date: 2020-02-27

    Application number: US16672243

    Filing date: 2019-11-01

    Abstract: Respective correlation metrics between token groups of a particular text attribute of a data set and a prediction target attribute are computed. Based on the correlation metrics, a predictive token group list is created. For various observation records of the data set, values of a derived categorical attribute corresponding to the particular text attribute are determined based on matches between the particular text attribute value and the predictive token group list. A measure of the predictive utility of the particular text attribute is obtained using correlations between the categorical attribute and the prediction target attribute.
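The four steps in this abstract (correlate token groups with the target, keep a predictive list, derive a categorical attribute, measure its utility) can be illustrated as below. This is a minimal sketch under stated assumptions: token groups are taken to be single whitespace-separated tokens, the correlation metric is Pearson correlation on presence indicators, the derived category is the first listed token found in the text, and utility is the largest absolute correlation of any category indicator with the target. The abstract fixes none of these choices, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def token_correlations(texts: Sequence[str],
                       targets: Sequence[float]) -> Dict[str, float]:
    """Correlation between each token's presence and the prediction target."""
    tokenized = [set(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*tokenized))
    return {
        tok: pearson([1.0 if tok in toks else 0.0 for toks in tokenized], targets)
        for tok in vocab
    }


def predictive_token_list(correlations: Dict[str, float], top_k: int) -> List[str]:
    """Tokens ranked by absolute correlation (ties broken alphabetically)."""
    ranked = sorted(correlations, key=lambda tok: (-abs(correlations[tok]), tok))
    return ranked[:top_k]


def derive_category(text: str, token_list: Sequence[str]) -> str:
    """Derived categorical value: first listed token present in the text."""
    tokens = set(text.lower().split())
    for tok in token_list:
        if tok in tokens:
            return tok
    return "__other__"


def predictive_utility(categories: Sequence[str],
                       targets: Sequence[float]) -> float:
    """Assumed utility measure: best absolute correlation of any category
    indicator with the target."""
    return max(
        abs(pearson([1.0 if c == cat else 0.0 for c in categories], targets))
        for cat in set(categories)
    )
```

A typical use would call `token_correlations` on the text column, pass the result to `predictive_token_list`, map every record through `derive_category`, and keep the text attribute only if `predictive_utility` clears some chosen cutoff.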

    Machine learning system for annotating unstructured text

    Publication number: US10380236B1

    Publication date: 2019-08-13

    Application number: US15712933

    Filing date: 2017-09-22

    Abstract: Systems and methods are disclosed to implement a machine learning system that is trained to assign annotations to text fragments in an unstructured sequence of text. The system employs a neural model that includes an encoder recurrent neural network (RNN) and a decoder RNN. The input text sequence is encoded by the encoder RNN into successive encoder hidden states. The encoder hidden states are then decoded by the decoder RNN to produce a sequence of annotations for text fragments within the text sequence. In embodiments, the system employs a fixed-attention window during the decoding phase to focus on a subset of encoder hidden states to generate the annotations. In embodiments, the system employs a beam search technique to track a set of candidate annotation sequences before the annotations are outputted. By using a decoder RNN, the neural model is better equipped to capture long-range annotation dependencies in the text sequence.
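Two of the decoding ideas in this abstract, the fixed attention window and beam search over candidate annotation sequences, can be shown without a neural network. The sketch below is a toy under loud assumptions: encoder hidden states are plain floats rather than RNN vectors, the window is centred on the current decoding step (the abstract does not say where the window sits), and `toy_step_log_prob` is a hand-written stand-in for the decoder RNN. Only the windowing and beam bookkeeping are meant to carry over.

```python
import math
from typing import Callable, List, Sequence, Tuple

Beam = Tuple[List[str], float]  # (annotation sequence so far, log-probability)


def fixed_attention_window(encoder_states: Sequence[float],
                           step: int, radius: int) -> Sequence[float]:
    """Subset of encoder states within `radius` positions of `step`."""
    lo = max(0, step - radius)
    hi = min(len(encoder_states), step + radius + 1)
    return encoder_states[lo:hi]


def beam_search(num_steps: int,
                labels: Sequence[str],
                step_log_prob: Callable[[int, List[str], str], float],
                beam_width: int) -> List[Beam]:
    """Keep the `beam_width` best partial annotation sequences at each step."""
    beams: List[Beam] = [([], 0.0)]
    for step in range(num_steps):
        candidates: List[Beam] = []
        for seq, score in beams:
            for label in labels:
                candidates.append(
                    (seq + [label], score + step_log_prob(step, seq, label)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Toy inputs (hypothetical): one scalar "hidden state" per input token.
ENCODER_STATES = [2.0, 2.0, -2.0, -2.0]
LABELS = ["O", "ENT"]


def toy_step_log_prob(step: int, prefix: List[str], label: str) -> float:
    """Stand-in for the decoder: scores a label from the windowed context
    and the previously emitted annotation."""
    window = fixed_attention_window(ENCODER_STATES, step, radius=1)
    context = sum(window) / len(window)
    p_ent = 1.0 / (1.0 + math.exp(-context))
    if prefix and prefix[-1] == "ENT":
        p_ent = min(0.99, p_ent + 0.05)  # annotations tend to continue
    return math.log(p_ent if label == "ENT" else 1.0 - p_ent)


best_sequence, best_log_prob = beam_search(
    len(ENCODER_STATES), LABELS, toy_step_log_prob, beam_width=2)[0]
```

Because each step's score depends on the previous annotation, a greedy choice at one step can be overtaken later; tracking several candidates is what lets the decoder recover from that.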
