WARM STARTING AN ONLINE BANDIT LEARNER MODEL UTILIZING RELEVANT OFFLINE MODELS

    Publication Number: US20230259829A1

    Publication Date: 2023-08-17

    Application Number: US18306449

    Application Date: 2023-04-25

    Applicant: Adobe Inc.

    CPC classification number: G06N20/00 G06N5/04 G06F18/2193

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
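
    The abstract describes screening offline models for relevance by comparing their reward estimates against the online model's, then selecting an arm with whichever models survive the screen. The sketch below illustrates that idea under simplifying assumptions (per-arm mean-reward models and a fixed relevance tolerance); the class and function names are illustrative and not the patented implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ArmModel:
    """Per-arm mean-reward estimates; stands in for both the offline and the
    online bandit models in this sketch (hypothetical data structure)."""
    estimates: dict                              # arm id -> estimated mean reward
    counts: dict = field(default_factory=dict)   # arm id -> number of observations

    def predict(self, arm):
        return self.estimates.get(arm, 0.0)

    def update(self, arm, reward):
        # Incremental mean update from an observed reward.
        n = self.counts.get(arm, 0) + 1
        self.counts[arm] = n
        prev = self.estimates.get(arm, 0.0)
        self.estimates[arm] = prev + (reward - prev) / n

def relevant_offline_models(offline_models, online_model, arms, tol=0.1):
    """Keep offline models whose reward estimates stay within `tol` of the
    online model's estimate on every arm (assumed relevance criterion)."""
    return [m for m in offline_models
            if all(abs(m.predict(a) - online_model.predict(a)) <= tol for a in arms)]

def select_arm(offline_models, online_model, arms, tol=0.1):
    """Choose the arm with the highest reward estimate, averaged over the
    relevant offline models if any exist, otherwise using the online model."""
    pool = relevant_offline_models(offline_models, online_model, arms, tol) or [online_model]
    return max(arms, key=lambda a: sum(m.predict(a) for m in pool) / len(pool))

# Usage: two hypothetical offline models and a partially trained online model.
offline = [ArmModel({"A": 0.7, "B": 0.2}), ArmModel({"A": 0.1, "B": 0.9})]
online = ArmModel({"A": 0.65, "B": 0.25})
arm = select_arm(offline, online, ["A", "B"])   # only the first offline model is relevant here
```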

    Utilizing relevant offline models to warm start an online bandit learner model

    Publication Number: US11669768B2

    Publication Date: 2023-06-06

    Application Number: US16584082

    Application Date: 2019-09-26

    Applicant: Adobe Inc.

    CPC classification number: G06F18/2193 G06N5/04 G06N20/00

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
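
    This abstract also covers scoring arms by how much observing their reward would reduce uncertainty about which offline models are relevant. The sketch below computes that entropy reduction for Bernoulli rewards, where each offline model supplies a per-arm success probability and a weight vector encodes the current belief over the models; `beta` and the data layout are assumptions, not details drawn from the patent.

```python
import math

def entropy(weights):
    """Shannon entropy of a normalized weight vector over offline models."""
    return -sum(w * math.log(w) for w in weights if w > 0)

def expected_entropy_reduction(arm, weights, models):
    """Expected drop in entropy over the model weights after observing a
    Bernoulli reward for `arm`; models[m][arm] is model m's estimate of
    P(reward = 1 | arm) (assumed representation)."""
    probs = [m[arm] for m in models]
    p1 = sum(w * p for w, p in zip(weights, probs))         # P(reward = 1)
    post1 = [w * p for w, p in zip(weights, probs)]         # unnormalized posterior if r = 1
    post0 = [w * (1 - p) for w, p in zip(weights, probs)]   # unnormalized posterior if r = 0

    def normalized(v):
        s = sum(v)
        return [x / s for x in v] if s > 0 else v

    expected_post = p1 * entropy(normalized(post1)) + (1 - p1) * entropy(normalized(post0))
    return entropy(weights) - expected_post

def score_arm(arm, weights, models, beta=1.0):
    """Combine the reward estimate with the arm's entropy reduction, as the
    abstract describes; `beta` is a hypothetical trade-off parameter."""
    expected_reward = sum(w * m[arm] for w, m in zip(weights, models))
    return expected_reward + beta * expected_entropy_reduction(arm, weights, models)

# Example: pick the arm that best balances reward and model discrimination.
models = [{"A": 0.8, "B": 0.4}, {"A": 0.3, "B": 0.7}]   # two offline models
weights = [0.5, 0.5]                                     # current belief over the models
best = max(models[0], key=lambda a: score_arm(a, weights, models))
```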

    UTILIZING RELEVANT OFFLINE MODELS TO WARM START AN ONLINE BANDIT LEARNER MODEL

    Publication Number: US20210097350A1

    Publication Date: 2021-04-01

    Application Number: US16584082

    Application Date: 2019-09-26

    Applicant: Adobe Inc.

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
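
    The final step described in these abstracts is feeding the observed reward back into the observation history and the online model. A minimal sketch of that bookkeeping is shown below, combining a likelihood-based re-weighting of the offline models with an incremental mean update of the online model; all names and the Bernoulli-reward assumption are illustrative only.

```python
def observe(arm, reward, weights, models, history, online_estimates, online_counts):
    """Record an observed reward, re-weight the offline models by how well they
    predicted it, and update the online model's running mean for the arm."""
    history.append((arm, reward))

    # Models that assigned high probability to the observed reward gain weight,
    # which gradually separates relevant offline models from irrelevant ones.
    likes = [m[arm] if reward == 1 else 1.0 - m[arm] for m in models]
    unnorm = [w * l for w, l in zip(weights, likes)]
    total = sum(unnorm)
    weights = [u / total for u in unnorm] if total > 0 else weights

    # Incremental mean update of the online bandit model for the pulled arm.
    n = online_counts.get(arm, 0) + 1
    online_counts[arm] = n
    prev = online_estimates.get(arm, 0.0)
    online_estimates[arm] = prev + (reward - prev) / n
    return weights

# Usage: after observing reward 1 on arm "A", the first offline model
# (which predicted it more strongly) gains weight: roughly [0.73, 0.27].
weights = observe("A", 1, [0.5, 0.5],
                  [{"A": 0.8, "B": 0.4}, {"A": 0.3, "B": 0.7}],
                  history=[], online_estimates={}, online_counts={})
```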
