-
Publication No.: US12056583B2
Publication Date: 2024-08-06
Application No.: US16938998
Application Date: 2020-07-26
Applicant: Amazon Technologies, Inc.
Inventor: Saman Zarandioon, Robert Matthias Steele
Abstract: Respective statistical distributions of a target variable within a proposed training data set and a proposed test data set for a machine learning model are obtained. A metric indicative of the difference between the two statistical distributions is computed. The difference metric is used to determine whether the proposed test data set is acceptable to evaluate the machine learning model.
-
Publication No.: US10339465B2
Publication Date: 2019-07-02
Application No.: US14463434
Application Date: 2014-08-19
Applicant: Amazon Technologies, Inc.
Inventor: Robert Matthias Steele, Tarun Agarwal, Leo Parker Dirac, Jun Qian
Abstract: During a training phase of a machine learning model, representations of at least some nodes of a decision tree are generated and stored on persistent storage in depth-first order. A respective predictive utility metric (PUM) value is determined for one or more nodes, indicating expected contributions of the nodes to a prediction of the model. A particular node is selected for removal from the tree based at least partly on its PUM value. A modified version of the tree, with the particular node removed, is stored for obtaining a prediction.
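As a rough illustration of the pruning step, the sketch below walks a tree in depth-first order and removes the node with the lowest PUM value. The abstract does not say how PUM values are computed or how nodes are serialized, so the `Node` class, the precomputed `pum` field, and the "remove the single lowest-PUM node" policy are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Node:
    name: str
    pum: float  # predictive utility metric, assumed to be precomputed
    children: List["Node"] = field(default_factory=list)


def depth_first(node: Node) -> Iterator[Node]:
    """Yield nodes in depth-first (pre-order) order."""
    yield node
    for child in node.children:
        yield from depth_first(child)


def prune_lowest_pum(root: Node) -> Optional[Node]:
    """Remove the non-root node with the lowest PUM, together with its subtree.

    Returns the removed node, or None if the root has no descendants.
    """
    best = None  # (parent, child) pair with the lowest child PUM seen so far
    for parent in depth_first(root):
        for child in parent.children:
            if best is None or child.pum < best[1].pum:
                best = (parent, child)
    if best is None:
        return None
    parent, victim = best
    # Compare by identity so an equal-valued sibling is never removed by mistake.
    parent.children = [c for c in parent.children if c is not victim]
    return victim
```

Persisting the depth-first sequence to storage is left out; `[n.name for n in depth_first(root)]` gives the order in which nodes would be written.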
-
Publication No.: US10713589B1
Publication Date: 2020-07-14
Application No.: US15060439
Application Date: 2016-03-03
Applicant: Amazon Technologies, Inc.
Inventor: Saman Zarandioon, Nicolle M. Correa, Leo Parker Dirac, Aleksandr Mikhaylovich Ingerman, Steven Andrew Loeppky, Robert Matthias Steele, Tianming Zheng
IPC: G06N20/00
Abstract: A determination that a machine learning data set is to be shuffled is made. Tokens corresponding to the individual observation records are generated based on respective identifiers of the records' storage objects and record key values. Respective representative values are derived from the tokens. The observation records are rearranged based on a result of sorting the representative values and provided to a shuffle result destination.
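One way to picture the shuffle described here is sketched below: each record gets a token built from its storage object identifier and record key, a hash of the token serves as the representative value, and the records are sorted by that value. The use of SHA-256, the token layout, and the optional `seed` parameter are assumptions for illustration; the abstract does not specify any of them.

```python
import hashlib


def representative_value(storage_object_id, record_key, seed=""):
    """Derive a pseudo-random integer from a record's token.

    The token format and the seed are hypothetical choices for this sketch.
    """
    token = f"{seed}|{storage_object_id}|{record_key}"
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shuffle_records(records, seed=""):
    """Rearrange (storage_object_id, record_key, payload) tuples.

    Sorting by the hashed token gives an order that looks random but is
    reproducible and does not depend on the input order.
    """
    return sorted(
        records, key=lambda r: representative_value(r[0], r[1], seed)
    )
```

Because the order depends only on the tokens, running the shuffle again over the same records, even if they arrive in a different order, reproduces the same result.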
-
Publication No.: US10366053B1
Publication Date: 2019-07-30
Application No.: US14950953
Application Date: 2015-11-24
Applicant: Amazon Technologies, Inc.
Inventor: Tianming Zheng, Nicolle M. Correa, Leo Parker Dirac, James Joseph Jesensky, Robert Matthias Steele
Abstract: A request to split a data set comprising observation records located in a group of storage objects is received. With respect to a particular observation record, a token is generated based on an identifier of the record's storage object and a key value of the record. A numeric value is calculated using the token, and the observation record is assigned to a split subset using the numeric value. An indication of the assignment is provided to a destination associated with the split subset.
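A minimal sketch of such a token-based split follows. It maps each record's token to a number in [0, 1) and compares it with a split fraction. The hash function, the token layout, the subset names, and the `train_fraction` parameter are assumptions chosen for the example rather than details from the patent.

```python
import hashlib


def split_assignment(storage_object_id, record_key, train_fraction=0.8):
    """Assign one observation record to a split subset.

    The numeric value is derived only from the record's token, so each
    record can be assigned independently and repeatably.
    """
    token = f"{storage_object_id}|{record_key}"
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Scale the first 8 bytes of the digest into the half-open range [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2 ** 64
    return "train" if value < train_fraction else "test"
```

Since the assignment needs no shared state, records spread across many storage objects could be split in parallel and still land in the same subsets on every run.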
-
Publication No.: US11284041B1
Publication Date: 2022-03-22
Application No.: US15841240
Application Date: 2017-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Alessandro Bergamo, Pahal Kamlesh Dalal, Nishitkumar Ashokkumar Desai, Jayakrishnan Kumar Eledath, Marian Nasr Amin George, Jean Laurent Guigues, Gerard Guy Medioni, Kartik Muktinutalapati, Robert Matthias Steele, Lu Xia
Abstract: In a materials handling facility, events may be associated with users based on imaging data captured from multiple fields of view. When an event is detected at a location within the fields of view of multiple cameras, two or more of the cameras may be identified as having captured images of the location at a time of the event. Users within the materials handling facility may be identified from images captured prior to, during or after the event, and visual representations of the respective actors may be generated from the images. The event may be associated with one of the users based on distances between the users' hands and the location of the event, as determined from the visual representations, or based on imaging data captured from the users' hands, which may be processed to determine which, if any, of such hands includes an item associated with the event.
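The distance-based association mentioned in this abstract can be sketched very simply, assuming hand positions have already been estimated as 3D coordinates from the camera images. The data layout and the function name below are hypothetical, and the image-based check of what each hand holds is not modeled.

```python
import math


def associate_event(event_location, hands_by_user):
    """Pick the user whose hand is closest to the event location.

    hands_by_user maps a user id to a list of (x, y, z) hand positions.
    Returns (user_id, distance), or (None, math.inf) if no hands are given.
    """
    best_user, best_distance = None, math.inf
    for user, hands in hands_by_user.items():
        for hand in hands:
            distance = math.dist(hand, event_location)  # Euclidean distance
            if distance < best_distance:
                best_user, best_distance = user, distance
    return best_user, best_distance
```

A real system would likely also apply a maximum distance and fall back to other evidence when two users are nearly equidistant; those rules are outside this sketch.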
-
Publication No.: US11030442B1
Publication Date: 2021-06-08
Application No.: US15841237
Application Date: 2017-12-13
Applicant: Amazon Technologies, Inc.
Inventor: Alessandro Bergamo, Pahal Kamlesh Dalal, Nishitkumar Ashokkumar Desai, Jayakrishnan Kumar Eledath, Marian Nasr Amin George, Jean Laurent Guigues, Gerard Guy Medioni, Kartik Muktinutalapati, Robert Matthias Steele, Lu Xia
Abstract: In a materials handling facility, events may be associated with users based on imaging data captured from multiple fields of view. When an event is detected at a location within the fields of view of multiple cameras, two or more of the cameras may be identified as having captured images of the location at a time of the event. Users within the materials handling facility may be identified from images captured prior to, during or after the event, and visual representations of the respective actors may be generated from the images. The event may be associated with one of the users based on distances between the users' hands and the location of the event, as determined from the visual representations, or based on imaging data captured from the users' hands, which may be processed to determine which, if any, of such hands includes an item associated with the event.
-
Publication No.: US10726356B1
Publication Date: 2020-07-28
Application No.: US15225545
Application Date: 2016-08-01
Applicant: Amazon Technologies, Inc.
Inventor: Saman Zarandioon, Robert Matthias Steele
Abstract: Respective statistical distributions of a target variable within a proposed training data set and a proposed test data set for a machine learning model are obtained. A metric indicative of the difference between the two statistical distributions is computed. The difference metric is used to determine whether the proposed test data set is acceptable to evaluate the machine learning model.
-