- Patent title: RETRIEVAL AUGMENTED REINFORCEMENT LEARNING
- Application number: US18698890
- Filing date: 2022-10-05
- Publication number: US20240320506A1
- Publication date: 2024-09-26
- Inventors: Anirudh Goyal, Andrea Banino, Abram Luke Friesen, Theophane Guillaume Weber, Adrià Puigdomènech Badia, Nan Ke, Simon Osindero, Timothy Paul Lillicrap, Charles Blundell
- Applicant: DeepMind Technologies Limited
- Applicant address: GB London
- Patent holder: DeepMind Technologies Limited
- Current patent holder: DeepMind Technologies Limited
- Current patent holder address: GB London
- International application: PCT/EP2022/077696 (2022-10-05)
- National phase entry date: 2024-04-05
- Main classification: G06N3/092
- IPC classifications: G06N3/092; G06N3/044; G06N3/0455; G06N3/084
Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling a reinforcement learning agent in an environment to perform a task using a retrieval-augmented action selection process. One of the methods includes receiving a current observation characterizing a current state of the environment; processing an encoder network input comprising the current observation to determine a policy neural network hidden state that corresponds to the current observation; maintaining a plurality of trajectories generated as a result of the reinforcement learning agent interacting with the environment; selecting one or more trajectories from the plurality of trajectories; updating the policy neural network hidden state using update data determined from the one or more selected trajectories; and processing the updated hidden state using a policy neural network to generate a policy output that specifies an action to be performed by the agent in response to the current observation.
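
The steps named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the claimed implementation: the random weight matrices `W_ENC` and `W_POL`, the mean-of-encoded-observations trajectory summary, the top-k dot-product selection rule, and the additive hidden-state update are all hypothetical stand-ins, since the abstract does not specify how trajectories are selected or how the update data is computed.

```python
import math
import random

random.seed(0)

OBS_DIM, HID_DIM, NUM_ACTIONS, TOP_K = 4, 8, 3, 2


def random_matrix(rows, cols):
    """Small random weights standing in for trained parameters (hypothetical)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


W_ENC = random_matrix(HID_DIM, OBS_DIM)      # hypothetical encoder weights
W_POL = random_matrix(NUM_ACTIONS, HID_DIM)  # hypothetical policy weights


def encode(observation):
    """Encoder network: current observation -> policy hidden state."""
    return [math.tanh(v) for v in matvec(W_ENC, observation)]


def embed_trajectory(trajectory):
    """Assumed summary: mean of the encoded observations in the trajectory."""
    states = [encode(obs) for obs, _action, _reward in trajectory]
    return [sum(col) / len(states) for col in zip(*states)]


def select_trajectories(hidden, buffer, k=TOP_K):
    """Assumed selection rule: top-k trajectories by dot-product similarity."""
    ranked = sorted(buffer, key=lambda t: dot(hidden, embed_trajectory(t)), reverse=True)
    return ranked[:k]


def update_hidden(hidden, selected):
    """Assumed update: add the mean embedding of the selected trajectories."""
    if not selected:
        return list(hidden)
    embeddings = [embed_trajectory(t) for t in selected]
    update = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    return [h + u for h, u in zip(hidden, update)]


def policy(hidden):
    """Policy network: updated hidden state -> softmax over actions."""
    logits = matvec(W_POL, hidden)
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(observation, buffer):
    hidden = encode(observation)                    # hidden state for the observation
    selected = select_trajectories(hidden, buffer)  # retrieve trajectories
    updated = update_hidden(hidden, selected)       # apply the update data
    probs = policy(updated)                         # policy output
    return max(range(NUM_ACTIONS), key=probs.__getitem__), probs


# Maintained trajectories: lists of (observation, action, reward) tuples.
buffer = [
    [([random.uniform(-1, 1) for _ in range(OBS_DIM)],
      random.randrange(NUM_ACTIONS),
      random.random()) for _ in range(5)]
    for _ in range(6)
]

action, probs = select_action([0.1, -0.2, 0.3, 0.0], buffer)
```

Here `select_action` returns the greedy action index together with the action probabilities. A learned retrieval mechanism and trained networks would replace the fixed random weights and the hand-written similarity rule used in this sketch.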