Invention Publication
- Patent Title: LEARNING DIVERSE SKILLS FOR TASKS USING SEQUENTIAL LATENT VARIABLES FOR ENVIRONMENT DYNAMICS
- Application No.: US18285519
- Application Date: 2022-05-27
- Publication No.: US20240185083A1
- Publication Date: 2024-06-06
- Inventor: Steven Stenberg Hansen, Guillaume Desjardins
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- International Application: PCT/EP2022/064491 2022.05.27
- Date entered country: 2023-10-04
- Main IPC: G06N3/092
- IPC: G06N3/092

Abstract:
This specification relates to methods for controlling agents to perform actions according to a goal (or option) comprising a sequence of local goals (or local options), and to corresponding methods for training. As discussed herein, environment dynamics may be modelled sequentially by sampling latent variables, each latent variable relating to a local goal and being dependent on a previous latent variable. These latent variables are used to condition an action-selection policy neural network to select actions according to the local goal. This allows the agents to reach more diverse states than would be possible through a fixed latent variable or goal, thereby encouraging exploratory behavior. In addition, specific methods described herein model the sequence of latent variables through a simple linear and recurrent relationship that allows the system to be trained more efficiently. This avoids the need to learn a state-dependent higher-level policy for selecting the latent variables, which can be difficult to train in practice.
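
The sequential sampling described in the abstract can be illustrated with a minimal sketch. It is not the claimed method: the abstract only says that each latent variable depends on the previous one through a linear, recurrent relationship and that the latent conditions the action-selection policy. Everything else here is an assumption made for illustration, including the Gaussian noise, the fixed transition matrix, the linear softmax stand-in for the policy neural network, and the hypothetical names `next_latent`, `sample_latent_sequence` and `select_action`.

```python
import math
import random


def next_latent(prev_latent, transition, noise_scale, rng):
    """Assumed recurrence: z_t = A @ z_{t-1} + Gaussian noise (illustrative only)."""
    return [
        sum(a * z for a, z in zip(row, prev_latent)) + rng.gauss(0.0, noise_scale)
        for row in transition
    ]


def sample_latent_sequence(z0, transition, num_steps, noise_scale, rng):
    """Sample one latent per local goal, each depending on the previous latent."""
    latents = []
    z = list(z0)
    for _ in range(num_steps):
        z = next_latent(z, transition, noise_scale, rng)
        latents.append(z)
    return latents


def select_action(observation, latent, weights, rng):
    """Toy latent-conditioned policy: softmax over a linear map of [observation, latent].

    A linear map stands in for the policy neural network (assumption).
    """
    features = list(observation) + list(latent)
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    transition = [[0.9, 0.1], [-0.1, 0.9]]  # hypothetical 2x2 transition matrix
    latents = sample_latent_sequence([1.0, 0.0], transition, 5, 0.1, rng)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(3)]
    observation = [0.2, -0.4, 0.7]  # hypothetical 3-dimensional observation
    for step, latent in enumerate(latents):
        print(step, select_action(observation, latent, weights, rng))
```

Because the latent sequence comes from a fixed recurrence rather than from a learned, state-dependent selector, the sketch has no higher-level policy to train, which is the simplification the abstract points to.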