-
Publication No.: US20220100721A1
Publication Date: 2022-03-31
Application No.: US17549395
Filing Date: 2021-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Nina Mishra, Daniel Blick, Sudipto Guha, Okke Joost Schrijvers
IPC: G06F16/215, G06N5/00, G06N20/00
Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly-received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.
-
Publication No.: US12174807B2
Publication Date: 2024-12-24
Application No.: US17549395
Filing Date: 2021-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Nina Mishra, Daniel Blick, Sudipto Guha, Okke Joost Schrijvers
IPC: G06F16/00, G06F16/215, G06N5/01, G06N20/00, G06F16/2458
Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly-received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.
-
Publication No.: US11232085B2
Publication Date: 2022-01-25
Application No.: US14990175
Filing Date: 2016-01-07
Applicant: Amazon Technologies, Inc.
Inventor: Nina Mishra, Daniel Blick, Sudipto Guha, Okke Joost Schrijvers
IPC: G06F16/215, G06N5/00, G06N20/00, G06F16/2458
Abstract: Random cut trees are generated with respect to respective samples of a baseline set of data records of a data set for which outlier detection is to be performed. To construct a particular random cut tree, an iterative splitting technique is used, in which the attribute along which a given set of data records is split is selected based on its value range. With respect to a newly-received data record of the stream, an outlier score is determined based at least partly on a potential insertion location of a node representing the data record in a particular random cut tree, without necessarily modifying the random cut tree.
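Illustrative sketch (not taken from the patent text): the abstract describes range-based attribute selection and scoring by potential insertion location. The Python below is one possible reading of that idea under stated assumptions. The names `Node`, `choose_cut`, `build`, `insertion_depth`, and `mean_insertion_depth` are hypothetical, as are the sample size, the tree count, and the use of mean insertion depth as a stand-in for the outlier score (a shallower depth suggests a more outlying record).

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    lo: List[float]                 # per-attribute minimum of the records below this node
    hi: List[float]                 # per-attribute maximum of the records below this node
    dim: int = -1                   # split attribute (internal nodes only)
    cut: float = 0.0                # split value (internal nodes only)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def choose_cut(lo: Sequence[float], hi: Sequence[float],
               rng: random.Random) -> Tuple[int, float]:
    """Pick an attribute with probability proportional to its value range,
    then a uniform cut inside that range. Requires lo != hi."""
    spans = [h - l for l, h in zip(lo, hi)]
    dim = rng.choices(range(len(spans)), weights=spans)[0]
    cut = rng.uniform(lo[dim], hi[dim])
    if cut >= hi[dim]:              # guard against rounding so both sides stay non-empty
        cut = lo[dim]
    return dim, cut


def build(points: Sequence[Sequence[float]], rng: random.Random) -> Node:
    """Recursively split a sample until every leaf holds a single distinct record."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    node = Node(lo, hi)
    if lo == hi:                    # one record (or exact duplicates): leaf
        return node
    node.dim, node.cut = choose_cut(lo, hi, rng)
    node.left = build([p for p in points if p[node.dim] <= node.cut], rng)
    node.right = build([p for p in points if p[node.dim] > node.cut], rng)
    return node


def insertion_depth(root: Node, x: Sequence[float], rng: random.Random) -> int:
    """Depth at which x would be attached if it were inserted.
    Read-only: the tree is never modified."""
    node, depth = root, 0
    while True:
        lo = [min(l, v) for l, v in zip(node.lo, x)]
        hi = [max(h, v) for h, v in zip(node.hi, x)]
        if lo == hi:                # x coincides with the record stored at this leaf
            return depth
        dim, cut = choose_cut(lo, hi, rng)
        # A cut outside the node's own box separates x from everything below the node.
        if cut < node.lo[dim] or cut >= node.hi[dim]:
            return depth
        node = node.left if x[node.dim] <= node.cut else node.right
        depth += 1


def mean_insertion_depth(forest: Sequence[Node], x: Sequence[float],
                         rng: random.Random) -> float:
    """Assumed scoring proxy: average potential insertion depth across trees."""
    return sum(insertion_depth(t, x, rng) for t in forest) / len(forest)


rng = random.Random(7)
baseline = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(256)]
# One tree per random sample of the baseline set.
forest = [build(rng.sample(baseline, 32), rng) for _ in range(20)]
inlier_depth = mean_insertion_depth(forest, (0.0, 0.0), rng)
outlier_depth = mean_insertion_depth(forest, (100.0, 100.0), rng)
```

In this sketch a record far from the baseline cluster tends to be separated near the root, so its mean depth comes out smaller than that of a record inside the cluster. How depth is converted into a final score is left open here, since the abstract does not specify it.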
-
Publication No.: US10318882B2
Publication Date: 2019-06-11
Application No.: US14484201
Filing Date: 2014-09-11
Applicant: Amazon Technologies, Inc.
Inventor: Michael Brueckner, Daniel Blick
Abstract: An indication of a data source to be used to train a linear prediction model is obtained. The model is to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source. The parameter values are stored in a parameter vector. During a particular learning iteration of the training phase of the model, one or more features for which parameters are to be added to the parameter vector are identified. In response to a triggering condition, parameters for one or more features are removed from the parameter vector based on an analysis of relative contributions of the features represented in the parameter vector to the model's predictions. After the parameters are removed, at least one parameter is added to the parameter vector.
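Illustrative sketch (not taken from the patent text): the training loop described in the abstract adds parameters as new features appear, prunes when a triggering condition is met, and then keeps adding. The Python below is a minimal reading of that loop under stated assumptions. The names `prune` and `train`, the size-based trigger `max_params`, and the use of absolute weight as the measure of a feature's relative contribution are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, Tuple

Record = Tuple[Dict[str, float], float]   # (feature name -> value, label)


def prune(params: Dict[str, float], keep: int) -> None:
    """Remove the parameters with the smallest absolute weight, keeping `keep` entries.
    Absolute weight is an assumed proxy for contribution to predictions."""
    ranked = sorted(params, key=lambda name: abs(params[name]), reverse=True)
    for name in ranked[keep:]:
        del params[name]


def train(records: Iterable[Record], max_params: int = 4, keep: int = 2,
          lr: float = 0.1) -> Dict[str, float]:
    """Online least-squares updates over a sparse parameter vector stored as a dict."""
    params: Dict[str, float] = {}
    for features, label in records:
        pred = sum(params.get(name, 0.0) * value for name, value in features.items())
        err = pred - label
        for name, value in features.items():
            # A feature seen for the first time gets a new parameter entry here.
            params[name] = params.get(name, 0.0) - lr * err * value
        if len(params) > max_params:       # assumed triggering condition: size limit
            prune(params, keep)
    return params


# One informative feature plus a fresh, one-off noise feature in every record.
records = [({"signal": 1.0, f"noise{i}": 1.0}, 2.0) for i in range(50)]
params = train(records)
```

After each pruning pass the loop continues, so parameters for newly encountered features are added again, which mirrors the remove-then-add sequence in the abstract.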
-