Invention Application
- Patent Title: TRAINING ACTION SELECTION NEURAL NETWORKS USING OFF-POLICY ACTOR CRITIC REINFORCEMENT LEARNING AND STOCHASTIC DUELING NEURAL NETWORKS
- Application No.: US18962266
- Application Date: 2024-11-27
- Publication No.: US20250094772A1
- Publication Date: 2025-03-20
- Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Main IPC: G06N3/045
- IPC: G06N3/045; G06N3/006; G06N3/047; G06N3/084; G06N3/088

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
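
The loop the abstract describes (store trajectories in a replay memory, sample one, adjust the policy parameters off-policy) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the one-state, two-action toy environment, the tabular softmax policy and tabular critic, the truncated importance weight, and all names (`ReplayMemory`, `collect_trajectory`, `train_on_trajectory`, `run`). It shows a generic importance-weighted actor-critic update, not the claimed method.

```python
import math
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of whole trajectories (hypothetical helper)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest trajectories are evicted first

    def add(self, trajectory):
        self.buffer.append(trajectory)

    def sample(self, rng):
        return rng.choice(list(self.buffer))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def collect_trajectory(theta, rng, length=5):
    """Assumed toy environment: a single state; action 1 pays 1.0, action 0 pays 0.0."""
    probs = softmax(theta)
    trajectory = []
    for _ in range(length):
        action = 0 if rng.random() < probs[0] else 1
        reward = float(action)
        # Keep the behaviour policy's probability so the learner can reweight later.
        trajectory.append((action, reward, probs[action]))
    return trajectory


def train_on_trajectory(theta, q, trajectory, lr=0.1, clip=10.0):
    """Apply one importance-weighted actor-critic update per stored transition."""
    for action, reward, behaviour_prob in trajectory:
        probs = softmax(theta)
        # Truncated importance weight: current policy vs. the policy that acted.
        rho = min(clip, probs[action] / behaviour_prob)
        value = sum(p * qa for p, qa in zip(probs, q))  # V = E_pi[Q]
        advantage = q[action] - value
        # Critic: move Q(a) toward the observed reward (no bootstrapping in this toy).
        q[action] += lr * (reward - q[action])
        # Actor: the gradient of log pi(a) w.r.t. the logits is one_hot(a) - probs.
        for a in range(len(theta)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * rho * advantage * grad
    return theta, q


def run(steps=500, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # policy parameters (logits)
    q = [0.0, 0.0]      # critic estimates, one per action
    memory = ReplayMemory(capacity=100)
    for _ in range(steps):
        memory.add(collect_trajectory(theta, rng))  # the agent interacts
        trajectory = memory.sample(rng)             # may come from an older policy
        train_on_trajectory(theta, q, trajectory)
    return softmax(theta), q


if __name__ == "__main__":
    probs, q = run()
    print("policy:", probs, "critic:", q)
```

Because sampled trajectories were usually generated by an earlier version of the policy, the stored behaviour probability is what makes the update off-policy: each transition's gradient is reweighted by how much more or less likely the current policy is to take the same action.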