OUTLIER DETECTION FOR STREAMING DATA

    Publication No.: US20220100721A1

    Publication Date: 2022-03-31

    Application No.: US17549395

    Application Date: 2021-12-13

    Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.
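The approach in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. The dictionary-based node layout, the function names (`build_tree`, `insertion_depth`, `outlier_score`), and the score formula `1 / (1 + depth)` are all assumptions made for illustration. The sketch selects the split attribute with probability proportional to its value range, then estimates where a new record would be inserted without changing the tree.

```python
import random

def build_tree(points, rng):
    """Recursively split `points` (equal-length tuples) into a random cut tree."""
    dims = list(range(len(points[0])))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    node = {"lo": lo, "hi": hi}  # bounding box of the records under this node
    ranges = [h - l for l, h in zip(lo, hi)]
    if sum(ranges) == 0:  # a single record (or only duplicates): leaf
        return node
    # Pick the split attribute with probability proportional to its value range.
    dim = rng.choices(dims, weights=ranges)[0]
    if ranges[dim] == 0:  # guard against a floating-point edge case
        dim = ranges.index(max(ranges))
    cut = rng.uniform(lo[dim], hi[dim])
    if cut >= hi[dim]:  # keep both sides of the split non-empty
        cut = lo[dim]
    node["dim"], node["cut"] = dim, cut
    node["left"] = build_tree([p for p in points if p[dim] <= cut], rng)
    node["right"] = build_tree([p for p in points if p[dim] > cut], rng)
    return node

def insertion_depth(node, point, rng, depth=0):
    """Depth at which `point` would be attached; the tree is only read."""
    if "dim" not in node:  # leaf: a duplicate stays here, otherwise a sibling leaf
        return depth if node["lo"] == list(point) else depth + 1
    # Draw a random cut on the bounding box extended to include the new point.
    lo = [min(l, x) for l, x in zip(node["lo"], point)]
    hi = [max(h, x) for h, x in zip(node["hi"], point)]
    ranges = [h - l for l, h in zip(lo, hi)]
    dim = rng.choices(range(len(ranges)), weights=ranges)[0]
    cut = rng.uniform(lo[dim], hi[dim])
    # If the cut separates the point from every record under this node,
    # the point would be inserted here as a new leaf.
    if point[dim] <= cut < node["lo"][dim] or node["hi"][dim] <= cut < point[dim]:
        return depth + 1
    child = node["left"] if point[node["dim"]] <= node["cut"] else node["right"]
    return insertion_depth(child, point, rng, depth + 1)

def outlier_score(forest, point, rng):
    """Assumed score: mean of 1 / (1 + depth); shallower means more anomalous."""
    return sum(1.0 / (1 + insertion_depth(t, point, rng)) for t in forest) / len(forest)

rng = random.Random(0)
baseline = [(rng.random(), rng.random()) for _ in range(256)]
# Each tree is built from its own sample of the baseline records.
forest = [build_tree(rng.sample(baseline, 32), rng) for _ in range(20)]
far_score = outlier_score(forest, (100.0, 100.0), rng)
near_score = outlier_score(forest, (0.5, 0.5), rng)
print(far_score, near_score)
```

A record far from the baseline is usually separated by a cut near the root, so its potential insertion point is shallow and its score is high. A record inside the baseline cluster has to travel deeper before a cut isolates it.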

    Outlier detection for streaming data

    Publication No.: US12174807B2

    Publication Date: 2024-12-24

    Application No.: US17549395

    Application Date: 2021-12-13

    Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.

    Outlier detection for streaming data

    Publication No.: US11232085B2

    Publication Date: 2022-01-25

    Application No.: US14990175

    Application Date: 2016-01-07

    Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.

    Optimized training of linear machine learning models

    Publication No.: US10318882B2

    Publication Date: 2019-06-11

    Application No.: US14484201

    Application Date: 2014-09-11

    Abstract: An indication of a data source to be used to train a linear prediction model is obtained. The model is to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source. The parameter values are stored in a parameter vector. During a particular learning iteration of the training phase of the model, one or more features for which parameters are to be added to the parameter vector are identified. In response to a triggering condition, parameters for one or more features are removed from the parameter vector based on an analysis of relative contributions of the features represented in the parameter vector to the model's predictions. After the parameters are removed, at least one parameter is added to the parameter vector.
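The training loop described in this abstract can be sketched as follows. The abstract does not specify the triggering condition or how relative contributions are measured. The sketch therefore assumes a cap on the parameter vector size as the trigger (`max_params`) and the absolute weight as a proxy for contribution. Those choices, the squared-error gradient step, and the names `prune` and `train` are hypothetical and not taken from the patent.

```python
def prune(params, keep):
    """Drop all but the `keep` parameters with the largest absolute weight."""
    ranked = sorted(params, key=lambda name: abs(params[name]), reverse=True)
    for name in ranked[keep:]:
        del params[name]

def train(records, max_params=8, lr=0.1, epochs=1):
    """Train a linear model whose parameter vector is a dict: feature -> weight."""
    params = {}
    for _ in range(epochs):
        for features, label in records:
            # Identify features whose parameters are to be added this iteration.
            new = [name for name in features if name not in params]
            # Assumed triggering condition: the vector would exceed its cap.
            if len(params) + len(new) > max_params:
                prune(params, max_params // 2)
            # After the removal, add parameters for the new features.
            for name in new:
                params[name] = 0.0
            # One stochastic gradient step on squared error.
            pred = sum(params[name] * value for name, value in features.items())
            err = pred - label
            for name, value in features.items():
                params[name] -= lr * err * value
    return params

# One informative feature plus a stream of one-off, low-value features.
records = [({"signal": 1.0, f"noise{i}": 0.01}, 2.0) for i in range(50)]
params = train(records)
print(sorted(params))
```

In this toy data set the `signal` feature accumulates a large weight and survives every pruning pass. The one-off `noise` features contribute little and are the ones removed when the cap is reached.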
