-
Publication No.: US10778707B1
Publication Date: 2020-09-15
Application No.: US15153712
Filing Date: 2016-05-12
Applicant: Amazon Technologies, Inc.
Inventor: Rajeev Ramnarain Rastogi
IPC: H04L29/06, G06F16/22, G06F16/901, G06F16/2455
Abstract: A matching record set with respect to a particular data record of a stream is identified based on output values produced by a particular band of locality sensitive hash functions. Using respective matching record sets corresponding to the particular data record and one or more other bands of locality sensitive hash functions, an estimate of a count of data records of the stream which meet a particular inter-record distance criterion is obtained. A determination as to whether the particular data record is to be designated as an outlier with respect to previously-observed records of the data stream is made using the estimated count.
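The idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the class name, parameters, the choice of random-hyperplane hashing, and the estimator (the size of the union of the per-band matching sets) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class BandedLshOutlierDetector:
    """Toy streaming outlier detector built on bands of random-hyperplane LSH.

    Hypothetical names and defaults throughout; the neighbor-count estimate
    (union size of the per-band matching sets) is a simplification.
    """

    def __init__(self, dim, num_bands=8, rows_per_band=4, min_neighbors=3, seed=0):
        rng = random.Random(seed)
        # Each band holds several random hyperplanes; each hyperplane yields one bit.
        self.bands = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(rows_per_band)]
            for _ in range(num_bands)
        ]
        # One hash table per band: band signature -> ids of records seen so far.
        self.tables = [defaultdict(set) for _ in range(num_bands)]
        self.min_neighbors = min_neighbors
        self.next_id = 0

    def _signature(self, band, record):
        # The sign of each dot product gives one bit of the band's output value.
        return tuple(
            sum(w * x for w, x in zip(plane, record)) >= 0.0 for plane in band
        )

    def observe(self, record):
        """Return (is_outlier, estimated_neighbor_count), then index the record."""
        signatures = [self._signature(band, record) for band in self.bands]
        # Matching record set per band: earlier records that share the bucket.
        matching_sets = [
            table.get(sig, set()) for table, sig in zip(self.tables, signatures)
        ]
        # Crude estimate of how many earlier records are close to this one.
        estimate = len(set().union(*matching_sets))
        is_outlier = estimate < self.min_neighbors
        for table, sig in zip(self.tables, signatures):
            table[sig].add(self.next_id)
        self.next_id += 1
        return is_outlier, estimate
```

Random-hyperplane hashing groups records by direction (angular distance), so a record pointing the opposite way from everything seen so far shares no bucket in any band and is flagged. The first few records of any stream are also flagged, because no earlier records exist to match.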
-
Publication No.: US10089675B1
Publication Date: 2018-10-02
Application No.: US14918445
Filing Date: 2015-10-20
Applicant: Amazon Technologies, Inc.
Inventor: Rajeev Ramnarain Rastogi, Varnit Agnihotri, Rushi Bhatt, Srujana Merugu
Abstract: Data mining systems and methods are disclosed for associating users with items based on underlying personas. The system associates each user account with one or more underlying personas that contribute to the user's interactions with different items, and models user-to-item associations in accordance with the underlying personas based on probabilistic matrix factorization. The system may further predict an active persona for a user based on the user's recent interactions with items and make item-related recommendations that are oriented to the active persona.
-
Publication No.: US11556945B1
Publication Date: 2023-01-17
Application No.: US15714816
Filing Date: 2017-09-25
Applicant: Amazon Technologies, Inc.
Inventor: Karthik Sundaresan Gurumoorthy, Vineet Shashikant Chaoji, Dinesh Mandalapu, Rajeev Ramnarain Rastogi
Abstract: Systems and methods are disclosed to implement an item metric prediction system that predicts a metric for an item using a feature-based model built using other similar items. In embodiments, the system is used to predict item influence values (IIVs) of items indicating an expected amount of subsequent transactions that is caused by an initial transaction of the items. In embodiments, a sample of item transaction data is distributed to a plurality of task nodes, which execute in parallel to determine the items' observed IIVs from the transaction data. Subsequently, a new IIV is determined for an item whose observed IIV has a low confidence level. A set of similar items is selected, and a set of parameters of a feature-based model are tuned to fit the model to the observed IIVs of the similar items. A new IIV having a high confidence level is then obtained using the model.
-
Publication No.: US10963812B1
Publication Date: 2021-03-30
Application No.: US15462556
Filing Date: 2017-03-17
Applicant: Amazon Technologies, Inc.
Inventor: Vivek Sembium Varadarajan, Rajeev Ramnarain Rastogi, Atul Saroop
Abstract: Some aspects of the present disclosure relate to computer processes for generating and training a generative machine learning model that estimates the true sizes of items and users of an electronic catalog and is subsequently applied to determine fit recommendations, as well as confidence values for the fit recommendations, for how a particular item may fit a particular user. During training, the disclosed generative model can implement Bayesian statistical inference to calculate estimated true sizes of both items and users of an electronic catalog using both (1) a prior distribution of sizes for items and users and (2) a distribution based on obtained evidence regarding how items actually fit users. The resulting posterior distribution can be approximated using a proposal distribution used to generate the fit recommendations and associated confidence values.
-
Publication No.: US10157351B1
Publication Date: 2018-12-18
Application No.: US14918444
Filing Date: 2015-10-20
Applicant: Amazon Technologies, Inc.
Inventor: Rajeev Ramnarain Rastogi, Varnit Agnihotri, Rushi Bhatt, Srujana Merugu
Abstract: Data mining systems and methods are disclosed for associating users with items based on underlying personas. The system associates each user account with one or more underlying personas that contribute to the user's interactions with different items, predicts an active persona for a user based on the user's recent interactions with items, and makes item-related recommendations that are oriented to the active persona. Thus, for example, even though multiple individuals may share a computer and/or account, the content (e.g., item recommendations) presented during a browsing session may be based primarily or exclusively on the past browsing behaviors of the particular individual conducting the browsing session.
-
Publication No.: US11915104B2
Publication Date: 2024-02-27
Application No.: US16672243
Filing Date: 2019-11-01
Applicant: Amazon Technologies, Inc.
Abstract: Respective correlation metrics between token groups of a particular text attribute of a data set and a prediction target attribute are computed. Based on the correlation metrics, a predictive token group list is created. For various observation records of the data set, values of a derived categorical attribute corresponding to the particular text attribute are determined based on matches between the particular text attribute value and the predictive token group list. A measure of the predictive utility of the particular text attribute is obtained using correlations between the categorical attribute and the prediction target attribute.
-
Publication No.: US11295229B1
Publication Date: 2022-04-05
Application No.: US15132959
Filing Date: 2016-04-19
Applicant: Amazon Technologies, Inc.
Inventor: Pooja Ashok Kumar, Naveen Sudhakaran Nair, Rajeev Ramnarain Rastogi
Abstract: An approximate count of a subset of records of a data set is obtained using one or more transformation functions. The subset comprises records which contain a first value of one input variable, a second value of another input variable, and a particular value of a target variable. Using the approximate count, an approximate correlation metric for a multidimensional feature and the target variable is obtained. Based on the correlation metric, the multidimensional feature is included in a candidate feature set to be used to train a machine learning model.
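A rough sketch of the approach in this abstract follows. The abstract does not name the transformation functions or the correlation metric, so two stand-ins are assumed here: a count-min sketch (hash-based approximate counting) and mutual information. All class and function names are hypothetical.

```python
import hashlib
import math


class CountMinSketch:
    """Approximate counter; estimates never undercount but may overcount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # A salted SHA-256 digest gives a stable hash per row across runs.
        digest = hashlib.sha256(f"{row}:{key!r}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, key, amount=1):
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += amount

    def estimate(self, key):
        return min(self.rows[row][self._index(row, key)] for row in range(self.depth))


def approx_mutual_information(records, feature_values, target_values):
    """Approximate mutual information between the pair feature (x1, x2) and y.

    `records` is an iterable of (x1, x2, y) tuples. Joint and marginal counts
    are read back from sketches instead of exact tables.
    """
    joint, feature, target = CountMinSketch(), CountMinSketch(), CountMinSketch()
    total = 0
    for x1, x2, y in records:
        joint.add((x1, x2, y))
        feature.add((x1, x2))
        target.add(y)
        total += 1
    mi = 0.0
    for pair in feature_values:
        for y in target_values:
            n_xy = joint.estimate((pair[0], pair[1], y))
            n_x, n_y = feature.estimate(pair), target.estimate(y)
            if n_xy == 0 or n_x == 0 or n_y == 0:
                continue  # combination not observed (or only a stray collision)
            mi += (n_xy / total) * math.log((n_xy * total) / (n_x * n_y))
    return mi
```

A caller would compare the returned value against a threshold of its own choosing to decide whether the pair feature joins the candidate set. The case for scoring pairs is clearest on XOR-like data, where each input alone tells nothing about the target but the two together determine it.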
-
Publication No.: US10467547B1
Publication Date: 2019-11-05
Application No.: US14935426
Filing Date: 2015-11-08
Applicant: Amazon Technologies, Inc.
Abstract: Respective correlation metrics between token groups of a particular text attribute of a data set and a prediction target attribute are computed. Based on the correlation metrics, a predictive token group list is created. For various observation records of the data set, values of a derived categorical attribute corresponding to the particular text attribute are determined based on matches between the particular text attribute value and the predictive token group list. A measure of the predictive utility of the particular text attribute is obtained using correlations between the categorical attribute and the prediction target attribute.
-
Publication No.: US10380498B1
Publication Date: 2019-08-13
Application No.: US14720474
Filing Date: 2015-05-22
Applicant: Amazon Technologies, Inc.
Inventor: Vineet Shashikant Chaoji, Aswin Natarajan, Seyit Ismail Parsa, Rajeev Ramnarain Rastogi
Abstract: This disclosure is directed to the automated generation of Machine Learning (ML) models. The system receives a user directive containing one or more requirements for building the ML model. The system further identifies common requirements between the user directive and one or more prior user directives and associates characteristics of the prior user directive, or model generated therefrom, with the user directive. The system further associates performance values generated by continuous monitoring of deployed ML models with individual characteristics of the user directive used to generate each of the deployed ML models. The system continuously improves model generation efficiency, model performance, and first run performance of individual ML models by learning from the improvements made to one or more prior ML models having similar characteristics.
-
Publication No.: US20240185130A1
Publication Date: 2024-06-06
Application No.: US18416755
Filing Date: 2024-01-18
Applicant: Amazon Technologies, Inc.
Abstract: Respective correlation metrics between token groups of a particular text attribute of a data set and a prediction target attribute are computed. Based on the correlation metrics, a predictive token group list is created. For various observation records of the data set, values of a derived categorical attribute corresponding to the particular text attribute are determined based on matches between the particular text attribute value and the predictive token group list. A measure of the predictive utility of the particular text attribute is obtained using correlations between the categorical attribute and the prediction target attribute.
-