-
Publication No.: US20210097350A1
Publication Date: 2021-04-01
Application No.: US16584082
Filing Date: 2019-09-26
Applicant: Adobe Inc.
Inventor: Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Qingyun Wu
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
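The warm-start idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the claimed method: the function names, the dictionary layout, the UCB-style confidence width used to decide whether an offline model is "relevant", and the averaging of relevant models are all hypothetical, and the entropy-reduction step is omitted.

```python
import math


def relevant_offline_models(offline_estimates, online_means, pull_counts, t):
    """Keep offline models whose reward estimates agree with the online
    estimates, within a confidence width, on every arm pulled so far.
    The width formula is an assumed UCB-style bound."""
    relevant = []
    for name, estimates in offline_estimates.items():
        consistent = True
        for arm, n in pull_counts.items():
            if n == 0:
                continue  # no online evidence for this arm yet
            width = math.sqrt(2.0 * math.log(max(t, 2)) / n)
            if abs(estimates[arm] - online_means[arm]) > width:
                consistent = False
                break
        if consistent:
            relevant.append(name)
    return relevant


def select_arm(offline_estimates, online_means, pull_counts, t):
    """Use the relevant offline models (if any); otherwise fall back to
    an upper-confidence-bound rule on the online estimates alone."""
    relevant = relevant_offline_models(offline_estimates, online_means, pull_counts, t)
    if relevant:
        def score(arm):
            return sum(offline_estimates[m][arm] for m in relevant) / len(relevant)
    else:
        def score(arm):
            n = pull_counts[arm]
            if n == 0:
                return math.inf  # try unexplored arms first
            return online_means[arm] + math.sqrt(2.0 * math.log(max(t, 2)) / n)
    return max(pull_counts, key=score)


def update_online(online_means, pull_counts, arm, reward):
    """Incrementally update the online mean reward for the pulled arm."""
    pull_counts[arm] += 1
    online_means[arm] += (reward - online_means[arm]) / pull_counts[arm]
```

In this sketch an offline model that contradicts the online evidence is simply ignored, so a poor warm start costs little once enough online rewards have been observed.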
-
Publication No.: US20190303994A1
Publication Date: 2019-10-03
Application No.: US15940736
Filing Date: 2018-03-29
Applicant: Adobe Inc.
Inventor: Matteo Sesia, Yasin Abbasi Yadkori
Abstract: Recommendation systems and techniques are described that use linear stochastic bandits and confidence interval generation to generate recommendations for digital content. These techniques overcome the limitations of conventional recommendation systems that are limited to a fixed parameter to estimate noise and thus do not fully exploit available data and are overly conservative, at a significant cost in operational performance of a computing device. To do so, a linear model, noise estimate, and confidence interval are refined by a recommendation system based on user interaction data that describes a result of user interaction with items of digital content. This is performed by comparing a result of the recommendation on user interaction with digital content with an estimate of a result of the recommendation.
-
Publication No.: US20200314472A1
Publication Date: 2020-10-01
Application No.: US16367628
Filing Date: 2019-03-28
Applicant: Adobe Inc.
Inventor: Anup Rao, Yasin Abbasi Yadkori, Tung Mai, Ryan Rossi, Ritwik Sinha, Matvey Kapilevich, Alexandru Ionut Hodorogea
IPC: H04N21/258, H04N21/2668, H04N21/482
Abstract: The present disclosure relates to training a recommendation model to generate trait recommendations using one permutation hashing and populated-value-slot-based densification. In particular, the disclosed systems can train the recommendation model by computing sketch vectors corresponding to traits using one permutation hashing. The disclosed systems can then fill in unpopulated value slots of the sketch vectors using populated-value-slot-based densification. The disclosed systems can combine the resulting densified sketches to generate the trained recommendation model. For example, in some embodiments, the disclosed systems can combine the sketches by generating a plurality of locality sensitive hashing tables based on the sketches. In some embodiments, the disclosed systems generate a count sketch matrix based on the sketches and generate trait embeddings based on the count sketch matrix using spectral embedding. Based on the trait embeddings, the disclosed systems can utilize the recommendation model to flexibly and accurately determine the similarity between traits.
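As a rough illustration of the sketching step, the following shows one permutation hashing followed by a simple densification pass. It is a sketch under stated assumptions: the function names are hypothetical, traits are assumed to be sets of integer IDs drawn from a fixed universe, and empty slots are filled by borrowing from the nearest populated slot to the right (circularly), which is one plausible reading of "populated-value-slot-based densification" rather than the scheme the disclosure specifies. The locality sensitive hashing tables and spectral embedding are not covered.

```python
import random


def oph_sketch(items, universe_size, num_bins, seed=0):
    """One permutation hashing: apply a single random permutation, split the
    permuted range into equal bins, and keep the smallest offset per bin.
    Bins that receive no item stay None."""
    if universe_size % num_bins != 0:
        raise ValueError("universe_size must be divisible by num_bins")
    perm = list(range(universe_size))
    random.Random(seed).shuffle(perm)  # same seed gives the same permutation
    bin_width = universe_size // num_bins
    sketch = [None] * num_bins
    for item in items:
        bin_index, offset = divmod(perm[item], bin_width)
        if sketch[bin_index] is None or offset < sketch[bin_index]:
            sketch[bin_index] = offset
    return sketch


def densify(sketch):
    """Fill each empty slot from the nearest populated slot to its right,
    wrapping around. Each value is paired with the borrowing distance so a
    borrowed value only matches a value borrowed from the same distance."""
    k = len(sketch)
    if all(value is None for value in sketch):
        raise ValueError("cannot densify a sketch with no populated slots")
    dense = []
    for i in range(k):
        step = 0
        while sketch[(i + step) % k] is None:
            step += 1
        dense.append((sketch[(i + step) % k], step))
    return dense


def estimate_similarity(dense_a, dense_b):
    """Fraction of matching slots, used as a Jaccard similarity estimate."""
    matches = sum(a == b for a, b in zip(dense_a, dense_b))
    return matches / len(dense_a)
```

Two traits must be sketched with the same `seed`, `universe_size`, and `num_bins` for their densified sketches to be comparable.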
-
Publication No.: US20190303995A1
Publication Date: 2019-10-03
Application No.: US15943807
Filing Date: 2018-04-03
Applicant: Adobe Inc.
Inventor: Shuai Li, Zheng Wen, Yasin Abbasi Yadkori, Vishwa Vinay, Branislav Kveton
Abstract: The present disclosure is directed toward systems, methods, and computer readable media for training and utilizing an item-level importance sampling model to evaluate and execute digital content selection policies. For example, systems described herein include training and utilizing an item-level importance sampling model that accurately and efficiently predicts a performance value that indicates a probability that a target user will interact with ranked lists of digital content items provided in accordance with a target digital content selection policy. Specifically, systems described herein can perform an offline evaluation of a target policy in light of historical user interactions corresponding to a training digital content selection policy to determine item-level importance weights that account for differences in digital content item distributions between the training policy and the target policy. In addition, the systems described herein can apply the item-level importance weights to training data to train the item-level importance sampling model.
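The offline evaluation described here can be sketched as a basic importance-weighted estimator. The code below is only an illustration under assumptions: the log format, the function name, and the choice to key the weights on (item, position) pairs are hypothetical, and the logging (training) policy is assumed to give nonzero probability to every logged pair.

```python
def item_level_is_estimate(logs, logging_probs, target_probs):
    """Estimate the expected number of interactions per ranked list under a
    target policy from data logged under a different (training) policy.

    logs: list of (ranked_list, clicks) pairs, where clicks[i] is 1 if the
        item at position i was interacted with and 0 otherwise.
    logging_probs / target_probs: dicts mapping (item, position) to the
        probability that the policy places that item at that position.
    """
    total = 0.0
    for ranked_list, clicks in logs:
        for position, (item, click) in enumerate(zip(ranked_list, clicks)):
            p_logging = logging_probs[(item, position)]
            p_target = target_probs.get((item, position), 0.0)
            # Item-level importance weight: corrects for how much more or
            # less often the target policy shows this item at this position.
            total += (p_target / p_logging) * click
    return total / len(logs)
```

Weighting each item separately, rather than weighting the whole ranked list by one ratio, keeps individual weights smaller when the two policies agree on only part of a list.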
-
Publication No.: US20230259829A1
Publication Date: 2023-08-17
Application No.: US18306449
Filing Date: 2023-04-25
Applicant: Adobe Inc.
Inventor: Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Qingyun Wu
CPC classification number: G06N20/00, G06N5/04, G06F18/2193
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
-
Publication No.: US11669768B2
Publication Date: 2023-06-06
Application No.: US16584082
Filing Date: 2019-09-26
Applicant: Adobe Inc.
Inventor: Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Qingyun Wu
CPC classification number: G06F18/2193, G06N5/04, G06N20/00
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
-
Publication No.: US20190311394A1
Publication Date: 2019-10-10
Application No.: US15944980
Filing Date: 2018-04-04
Applicant: Adobe Inc.
Inventor: Branislav Kveton, Zheng Wen, Yasin Abbasi Yadkori, Mohammad Ghavamzadeh, Claire Vernade
IPC: G06Q30/02
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for determining parameters for digital campaign content in connection with executing digital campaigns using a rank-one assumption and a best-arm identification algorithm. For example, the disclosed system alternately explores response data in the first dimension and response data in the second dimension using the rank-one assumption and the best-arm identification algorithm to estimate highest sampling values from each dimension. In one or more embodiments, the disclosed system uses the estimated highest sampling values from the first and second dimension to determine a combination with a highest sampling value in a parameter matrix constructed based on the first dimension and the second dimension, and then executes the digital campaign using the determined combination.
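To make the rank-one assumption concrete: if the expected response for combination (i, j) factors as u[i] * v[j] with positive factors, the best row and the best column can be found separately instead of searching every combination. The sketch below is a simplification under that assumption. It alternates uniform exploration over the two dimensions and then picks the highest totals; it does not implement an elimination-based best-arm identification algorithm, and `sample`, `rank_one_best_pair`, and `rounds` are hypothetical names.

```python
import random


def rank_one_best_pair(sample, num_rows, num_cols, rounds=200, seed=0):
    """Alternately explore rows and columns of a response matrix assumed to
    be rank one, then return the (row, column) pair with the highest totals.

    sample(i, j) returns an observed response for combination (i, j).
    """
    rng = random.Random(seed)
    row_totals = [0.0] * num_rows
    col_totals = [0.0] * num_cols
    for _ in range(rounds):
        # Explore the first dimension: every row against one shared column,
        # so rows are compared under identical conditions.
        j = rng.randrange(num_cols)
        for i in range(num_rows):
            row_totals[i] += sample(i, j)
        # Explore the second dimension: every column against one shared row.
        i = rng.randrange(num_rows)
        for j in range(num_cols):
            col_totals[j] += sample(i, j)
    best_row = max(range(num_rows), key=row_totals.__getitem__)
    best_col = max(range(num_cols), key=col_totals.__getitem__)
    return best_row, best_col
```

Each round uses num_rows + num_cols samples rather than num_rows * num_cols, which is where the rank-one structure saves exploration.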
-
Publication No.: US11593860B2
Publication Date: 2023-02-28
Application No.: US16880168
Filing Date: 2020-05-21
Applicant: ADOBE INC.
Inventor: Shuai Li, Zheng Wen, Yasin Abbasi Yadkori, Vishwa Vinay, Branislav Kveton
IPC: G06Q30/00, G06Q30/0601, G06Q30/0251, G06N20/00
Abstract: The present disclosure is directed toward systems, methods, and computer readable media for training and utilizing an item-level importance sampling model to evaluate and execute digital content selection policies. For example, systems described herein include training and utilizing an item-level importance sampling model that accurately and efficiently predicts a performance value that indicates a probability that a target user will interact with ranked lists of digital content items provided in accordance with a target digital content selection policy. Specifically, systems described herein can perform an offline evaluation of a target policy in light of historical user interactions corresponding to a training digital content selection policy to determine item-level importance weights that account for differences in digital content item distributions between the training policy and the target policy. In addition, the systems described herein can apply the item-level importance weights to training data to train the item-level importance sampling model.
-
Publication No.: US11551256B2
Publication Date: 2023-01-10
Application No.: US17334237
Filing Date: 2021-05-28
Applicant: Adobe Inc.
Inventor: Branislav Kveton, Zheng Wen, Yasin Abbasi Yadkori, Mohammad Ghavamzadeh, Claire Vernade
IPC: G06Q30/02
Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for determining parameters for digital campaign content in connection with executing digital campaigns using a rank-one assumption and a best-arm identification algorithm. For example, the disclosed system alternately explores response data in the first dimension and response data in the second dimension using the rank-one assumption and the best-arm identification algorithm to estimate highest sampling values from each dimension. In one or more embodiments, the disclosed system uses the estimated highest sampling values from the first and second dimension to determine a combination with a highest sampling value in a parameter matrix constructed based on the first dimension and the second dimension, and then executes the digital campaign using the determined combination.
-
Publication No.: US11100559B2
Publication Date: 2021-08-24
Application No.: US15940736
Filing Date: 2018-03-29
Applicant: Adobe Inc.
Inventor: Matteo Sesia, Yasin Abbasi Yadkori
Abstract: Recommendation systems and techniques are described that use linear stochastic bandits and confidence interval generation to generate recommendations for digital content. These techniques overcome the limitations of conventional recommendation systems that are limited to a fixed parameter to estimate noise and thus do not fully exploit available data and are overly conservative, at a significant cost in operational performance of a computing device. To do so, a linear model, noise estimate, and confidence interval are refined by a recommendation system based on user interaction data that describes a result of user interaction with items of digital content. This is performed by comparing a result of the recommendation on user interaction with digital content with an estimate of a result of the recommendation.