Invention Grant
- Patent Title: Utilizing relevant offline models to warm start an online bandit learner model
- Application No.: US16584082
- Application Date: 2019-09-26
- Publication No.: US11669768B2
- Publication Date: 2023-06-06
- Inventor: Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Qingyun Wu
- Applicant: Adobe Inc.
- Applicant Address: US CA San Jose
- Assignee: Adobe Inc.
- Current Assignee: Adobe Inc.
- Current Assignee Address: US CA San Jose
- Agency: Keller Preece PLLC
- Main IPC: G06F21/00
- IPC: G06F21/00; G06F18/21; G06N5/04; G06N20/00

Abstract:
Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
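The relevance-filtering loop described in the abstract can be illustrated with a minimal sketch. This is not the patented method: the class name `WarmStartBandit`, the UCB1-style confidence width, the `tolerance` parameter, and the rule of taking the most optimistic estimate across relevant offline models are all assumptions made for illustration. Offline models are represented here simply as lists of per-arm reward estimates, and the entropy-reduction criterion is not covered.

```python
import math


class WarmStartBandit:
    """Hypothetical sketch: an online learner warm started by offline reward estimates."""

    def __init__(self, n_arms, offline_models, tolerance=0.0):
        self.n_arms = n_arms
        self.offline_models = offline_models  # each model: list of per-arm reward estimates
        self.tolerance = tolerance            # assumed slack added to the confidence width
        self.counts = [0] * n_arms            # number of pulls per arm
        self.means = [0.0] * n_arms           # online empirical mean reward per arm
        self.t = 0                            # total number of observed rewards

    def _width(self, arm):
        # UCB1-style confidence width (an assumption); infinite for unobserved arms.
        if self.counts[arm] == 0:
            return math.inf
        return math.sqrt(2.0 * math.log(max(self.t, 2)) / self.counts[arm])

    def relevant_models(self):
        # An offline model counts as relevant if, on every arm, its reward estimate
        # differs from the online estimate by no more than the confidence width.
        return [
            m for m in self.offline_models
            if all(abs(m[a] - self.means[a]) <= self._width(a) + self.tolerance
                   for a in range(self.n_arms))
        ]

    def select_arm(self):
        relevant = self.relevant_models()
        if relevant:
            # Use the most optimistic estimate among the relevant offline models.
            scores = [max(m[a] for m in relevant) for a in range(self.n_arms)]
        else:
            # No relevant offline model: fall back to the online upper confidence bound.
            scores = [self.means[a] + self._width(a) for a in range(self.n_arms)]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incrementally update the online model with the observed reward.
        self.t += 1
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In use, a caller would alternate `select_arm()` and `update(arm, reward)` each round. As observations accumulate, the confidence widths shrink, offline models whose estimates disagree with the observed rewards drop out of the relevant set, and the learner falls back to its own online estimates once no offline model remains relevant.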
Public/Granted literature
- US20210097350A1 UTILIZING RELEVANT OFFLINE MODELS TO WARM START AN ONLINE BANDIT LEARNER MODEL Public/Granted day: 2021-04-01