Invention Publication
- Patent Title: DISTRIBUTED TRAINING USING ACTOR-CRITIC REINFORCEMENT LEARNING WITH OFF-POLICY CORRECTION FACTORS
- Application No.: US18149771
- Application Date: 2023-01-04
- Publication No.: US20230153617A1
- Publication Date: 2023-05-18
- Inventor: Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu, Volodymyr Mnih, Koray Kavukcuoglu, Remi Munos, Thomas Ward, Timothy James Alexander Harley, Iain Robert Dunning
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/045

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor-critic reinforcement learning technique.
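
The actor/learner split described in the abstract can be illustrated with a minimal single-process sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the two-action toy environment, the tabular softmax policy, the function names (`actor_generate`, `learner_update`, `train`), the in-memory queue standing in for communication between computing units, the trajectory-mean baseline standing in for a learned critic, and the truncated importance ratio used as one possible form of off-policy correction factor.

```python
import math
import random
from collections import deque

N_ACTIONS = 2  # hypothetical toy problem with two actions


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_generate(actor_logits, length, rng):
    """One actor unit: roll out a trajectory of experience tuples using
    its own (possibly stale) copy of the policy parameters."""
    probs = softmax(actor_logits)
    trajectory = []
    for _ in range(length):
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        reward = 1.0 if action == 1 else 0.0  # assumed toy reward
        # Store the behaviour-policy probability so the learner can
        # correct for the actor's policy lagging behind its own.
        trajectory.append((action, reward, probs[action]))
    return trajectory


def learner_update(learner_logits, trajectory, lr=0.1, rho_max=1.0):
    """One learner step: policy-gradient update weighted by a truncated
    importance ratio (an assumed form of off-policy correction)."""
    probs = softmax(learner_logits)
    # Crude stand-in for a critic: mean reward of the trajectory.
    baseline = sum(r for _, r, _ in trajectory) / len(trajectory)
    new_logits = list(learner_logits)
    for action, reward, behaviour_prob in trajectory:
        rho = min(rho_max, probs[action] / behaviour_prob)
        advantage = reward - baseline
        for a in range(N_ACTIONS):
            # Gradient of log softmax(logits)[action] w.r.t. logit a.
            grad = (1.0 if a == action else 0.0) - probs[a]
            new_logits[a] += lr * rho * advantage * grad
    return new_logits


def train(n_actors=3, n_rounds=50, seed=0):
    rng = random.Random(seed)
    learner_logits = [0.0] * N_ACTIONS
    actor_logits = [list(learner_logits) for _ in range(n_actors)]
    queue = deque()  # stands in for the actor-to-learner channel
    for step in range(n_rounds):
        for i in range(n_actors):
            queue.append(actor_generate(actor_logits[i], 8, rng))
        while queue:
            learner_logits = learner_update(learner_logits, queue.popleft())
        # Actors refresh their parameters only every few rounds, so
        # most trajectories are generated off-policy.
        if step % 5 == 0:
            actor_logits = [list(learner_logits) for _ in range(n_actors)]
    return learner_logits
```

Calling `train()` returns the learner's logits after all rounds. Because the actors synchronise only periodically in this sketch, the learner's policy generally differs from the one that generated each trajectory, which is the situation a correction factor is meant to handle.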
Public/Granted literature
- US11868894B2: Distributed training using actor-critic reinforcement learning with off-policy correction factors (Public/Granted day: 2024-01-09)