UTILIZING RELEVANT OFFLINE MODELS TO WARM START AN ONLINE BANDIT LEARNER MODEL

Invention Application

US20210097350A1 UTILIZING RELEVANT OFFLINE MODELS TO WARM START AN ONLINE BANDIT LEARNER MODEL (In force)

Patent Title: UTILIZING RELEVANT OFFLINE MODELS TO WARM START AN ONLINE BANDIT LEARNER MODEL
Application No.: US16584082

Application Date: 2019-09-26
Publication No.: US20210097350A1

Publication Date: 2021-04-01
Inventor: Georgios Theocharous, Zheng Wen, Yasin Abbasi Yadkori, Qingyun Wu
Applicant: Adobe Inc.
Applicant Address: US CA San Jose
Assignee: Adobe Inc.
Current Assignee: Adobe Inc.
Current Assignee Address: US CA San Jose
Main IPC: G06K9/62
IPC: G06K9/62; G06N20/00; G06N5/04

Abstract:

Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
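
The selection loop summarized in the abstract can be illustrated with a minimal Python sketch. This is not the claimed method: the relevance test (a UCB-style confidence width), the optimistic choice among relevant models, the UCB fallback, and every name here (`relevant_models`, `select_arm`, `update`, `slack`) are assumptions made for illustration, and the entropy-reduction term is omitted.

```python
import math

def relevant_models(offline, means, counts, t, slack=1.0):
    """Return indices of offline models whose per-arm reward estimates stay
    within an assumed confidence width of the online estimates."""
    keep = []
    for m, estimates in enumerate(offline):
        ok = True
        for arm, est in enumerate(estimates):
            if counts[arm] == 0:
                continue  # no online evidence yet, so the model cannot be ruled out
            width = slack * math.sqrt(2.0 * math.log(t + 1) / counts[arm])
            if abs(est - means[arm]) > width:
                ok = False
                break
        if ok:
            keep.append(m)
    return keep

def select_arm(offline, means, counts, t):
    """Pick an arm from the relevant offline models, or fall back to UCB."""
    n_arms = len(means)
    keep = relevant_models(offline, means, counts, t)
    if keep:
        # Assumed rule: take the arm with the highest estimate among relevant models.
        return max(range(n_arms), key=lambda a: max(offline[m][a] for m in keep))

    def ucb(a):
        if counts[a] == 0:
            return float("inf")
        return means[a] + math.sqrt(2.0 * math.log(t + 1) / counts[a])

    return max(range(n_arms), key=ucb)

def update(means, counts, arm, reward):
    """Update the online model's running mean with the observed reward."""
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]
```

In this sketch an offline model stays in play until enough online observations contradict it, so early rounds lean on the offline estimates and later rounds rely increasingly on the online model.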

Published/Granted Literature

US11669768B2 Utilizing relevant offline models to warm start an online bandit learner model. Publication/Grant date: 2023-06-06

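IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06K	Graphical data reading (image or video recognition or understanding G06V); presentation of data; record carriers; handling record carriers
G06K9/00	Methods or arrangements for recognising patterns (methods or arrangements for graph-reading or for converting the pattern of mechanical parameters, e.g. force or presence, into electrical signals G06K11/00) (image or video recognition or understanding G06V) (speech recognition G10L15/00)
G06K9/62	. Methods or arrangements for recognition using electronic means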