Publication No.: US20230144995A1
Publication Date: 2023-05-11
Application No.: US17918365
Filing Date: 2021-06-07
Applicant: DeepMind Technologies Limited
Inventors: Vivek Veeriah Jeya Veeraiah, Tom Ben Zion Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado Philip van Hasselt, David Silver, Satinder Singh Baveja
Abstract: A reinforcement learning system, method, and computer program code for controlling an agent to perform a plurality of tasks while interacting with an environment. The system learns options, where an option comprises a sequence of primitive actions performed by the agent under control of an option policy neural network. In implementations, the system discovers options that are useful for multiple different tasks by meta-learning the rewards used to train the option policy neural network while the agent is interacting with the environment.
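
Illustrative sketch (not part of the patent record): the abstract describes an option as a sequence of primitive actions chosen by an option policy. The Python below is a minimal sketch of that idea under loud assumptions. The `Option` container, its `should_terminate` rule, the `step` function, and the 1-D chain environment are all hypothetical and do not come from the application. A fixed function stands in for the option policy neural network, and the meta-learning of rewards is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Option:
    """Hypothetical container: a policy over primitive actions plus a stop rule."""
    policy: Callable[[int], int]             # state -> primitive action (stand-in for a network)
    should_terminate: Callable[[int], bool]  # state -> True when the option ends (assumed)


def step(state: int, action: int) -> int:
    """Toy 1-D chain environment (an assumption): move by -1 or +1, clipped to [0, 10]."""
    return max(0, min(10, state + action))


def run_option(option: Option, state: int, max_steps: int = 20) -> Tuple[int, List[int]]:
    """Execute primitive actions chosen by the option policy until it terminates."""
    actions: List[int] = []
    for _ in range(max_steps):
        action = option.policy(state)
        actions.append(action)
        state = step(state, action)
        if option.should_terminate(state):
            break
    return state, actions


# A hand-written option: keep moving right until state 5 is reached.
go_right = Option(policy=lambda s: 1, should_terminate=lambda s: s >= 5)
final_state, actions = run_option(go_right, state=2)
print(final_state, actions)
```

In the system the abstract describes, the option policy would be a neural network trained on a meta-learned reward rather than a hand-written rule; this sketch only shows how a single option unrolls into several primitive actions.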