Reinforcement learning using meta-learned intrinsic rewards

Invention Grant

US12293283B2 Reinforcement learning using meta-learned intrinsic rewards (In force)

Patent Title: Reinforcement learning using meta-learned intrinsic rewards
Application No.: US17033410

Application Date: 2020-09-25
Publication No.: US12293283B2

Publication Date: 2025-05-06
Inventor: Zeyu Zheng, Junhyuk Oh, Satinder Singh Baveja
Applicant: DeepMind Technologies Limited
Applicant Address: GB London
Assignee: DeepMind Technologies Limited
Current Assignee: DeepMind Technologies Limited
Current Assignee Address: GB London
Agency: Fish & Richardson P.C.
Main IPC: G06N3/08
IPC: G06N3/08; G06N3/04; G06N3/044; G06N3/045; G06N3/084

Abstract:

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for training a reinforcement learning system. The reinforcement learning system comprises an agent configured to perform actions based upon a policy, and an intrinsic reward system configured to generate intrinsic reward values for the agent based upon the actions taken by the agent. The method comprises training the reinforcement learning system on a plurality of tasks. The training comprises updating the agent's policy based upon the intrinsic reward values generated by the intrinsic reward system, and updating the intrinsic reward system based upon an extrinsic reward value obtained from the task being performed by the agent. The training further comprises re-initializing the agent's policy when an expiration criterion associated with the agent is met.
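
The training loop summarized in the abstract can be illustrated with a small sketch. This is not the patented implementation; it is a toy version under several assumptions that do not appear in the record: a single-state task with a softmax policy over three actions, one learned intrinsic reward value per action, exact expected gradients instead of sampled trajectories, a meta-gradient truncated to a single policy update, and a fixed step budget as the expiration criterion. All function and parameter names (`softmax_jvp`, `policy_step`, `train`, `run_lifetime`, `policy_lr`, `reward_lr`) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jvp(probs, vec):
    """Multiply the symmetric softmax Jacobian (diag(p) - p p^T) by vec.

    With vec set to per-action rewards, this is the gradient of the
    expected reward with respect to the policy logits.
    """
    dot = sum(p * v for p, v in zip(probs, vec))
    return [p * (v - dot) for p, v in zip(probs, vec)]

def policy_step(logits, intrinsic, lr):
    """One policy update that sees only the intrinsic reward."""
    grad = softmax_jvp(softmax(logits), intrinsic)
    return [l + lr * g for l, g in zip(logits, grad)]

def train(extrinsic, num_lifetimes=50, lifetime_steps=10,
          policy_lr=1.0, reward_lr=1.0):
    n = len(extrinsic)
    intrinsic = [0.0] * n  # intrinsic reward parameters persist across lifetimes
    for _ in range(num_lifetimes):
        # Assumed expiration criterion: a fixed step budget per agent.
        # When it is met, the policy is re-initialized.
        logits = [0.0] * n
        for _ in range(lifetime_steps):
            probs = softmax(logits)
            new_logits = policy_step(logits, intrinsic, policy_lr)
            # Gradient of the extrinsic objective w.r.t. the updated logits.
            d_new_logits = softmax_jvp(softmax(new_logits), extrinsic)
            # Chain rule through the policy update: d(new_logits)/d(intrinsic)
            # is policy_lr times the softmax Jacobian at the old policy.
            meta_grad = [policy_lr * g
                         for g in softmax_jvp(probs, d_new_logits)]
            intrinsic = [e + reward_lr * g
                         for e, g in zip(intrinsic, meta_grad)]
            logits = new_logits
    return intrinsic

def run_lifetime(intrinsic, steps=10, lr=1.0):
    """Train a fresh policy using only a fixed intrinsic reward."""
    logits = [0.0] * len(intrinsic)
    for _ in range(steps):
        logits = policy_step(logits, intrinsic, lr)
    return softmax(logits)
```

Calling `train([0.0, 0.0, 1.0])` and passing the result to `run_lifetime` trains a newly initialized policy that never observes the extrinsic reward directly. In this toy setting the learned intrinsic reward is what carries knowledge across re-initializations, which is the role the abstract assigns to the intrinsic reward system.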

Published/Granted literature

US20210089910A1 REINFORCEMENT LEARNING USING META-LEARNED INTRINSIC REWARDS Publication/Grant date: 2021-03-25

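IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.using neural network models
G06N3/08	..learning methods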