Patent search ap:("DeepMind Technologies Limited") AND inv:"Iurii Kemaev" Page 1

1.

Published invention application
LEARNING OPTIONS FOR ACTION SELECTION WITH META-GRADIENTS IN MULTI-TASK REINFORCEMENT LEARNING (Status: under examination, published)

Publication No.: US20230144995A1

Publication date: 2023-05-11

Application No.: US17918365

Filing date: 2021-06-07

Applicant: DeepMind Technologies Limited

Inventor: Vivek Veeriah Jeya Veeraiah, Tom Ben Zion Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado Philip van Hasselt, David Silver, Satinder Singh Baveja

IPC: G06N3/045, G06N3/084

CPC classification number: G06N3/045, G06N3/084

Abstract: A reinforcement learning system, method, and computer program code for controlling an agent to perform a plurality of tasks while interacting with an environment. The system learns options, where an option comprises a sequence of primitive actions performed by the agent under control of an option policy neural network. In implementations the system discovers options which are useful for multiple different tasks by meta-learning rewards for training the option policy neural network whilst the agent is interacting with the environment.
