Invention Application
- Patent Title: TRAINING ACTION SELECTION NEURAL NETWORKS USING Q-LEARNING COMBINED WITH LOOK AHEAD SEARCH
- Application No.: US17763920
- Application Date: 2020-09-23
- Publication No.: US20220366247A1
- Publication Date: 2022-11-17
- Inventor: Jessica Blake Chandler Hamrick, Victor Constant Bapst, Alvaro Sanchez, Tobias Pfaff, Theophane Guillaume Weber, Lars Buesing, Peter William Battaglia
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- International Application: PCT/EP2020/076597 WO 2020-09-23
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/04

Abstract:
A reinforcement learning system and method for selecting actions to be performed by an agent interacting with an environment. The system combines reinforcement learning with a look ahead search: Q-values learned by reinforcement learning guide the look ahead search, and the results of the search are used in turn to improve the Q-values. The system thus learns from a combination of real experience and simulated, model-based experience.
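
The loop described in the abstract can be illustrated with a small tabular sketch. This is not the claimed method: the abstract does not specify the search algorithm, the network, or the losses. The sketch assumes a Q-table in place of the action selection neural network, a depth-limited exhaustive look ahead in place of whatever search is actually used, and a toy deterministic chain environment that doubles as the search model. All names (`step`, `lookahead`, `train`, the constants) are hypothetical.

```python
import random

N_STATES = 5      # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GAMMA = 0.9       # discount factor (assumed value)


def step(state, action):
    """Toy deterministic environment; also used as the model during search."""
    if action == 0:
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else 0.0
    return nxt, reward, done


def lookahead(q, state, action, depth):
    """Depth-limited search value of (state, action).

    Leaves are bootstrapped from the current Q-values, so the
    Q-values guide the search.
    """
    nxt, reward, done = step(state, action)
    if done:
        return reward
    if depth <= 1:
        return reward + GAMMA * max(q[nxt])
    return reward + GAMMA * max(
        lookahead(q, nxt, a, depth - 1) for a in ACTIONS
    )


def train(episodes=200, depth=2, alpha=0.5, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on episode length
            # Simulated, model-based experience: search from the current state.
            searched = [lookahead(q, state, a, depth) for a in ACTIONS]
            # The search results improve the Q-values.
            for a in ACTIONS:
                q[state][a] += alpha * (searched[a] - q[state][a])
            # Epsilon-greedy action selection over the search values.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: searched[a])
            # Real experience: ordinary one-step Q-learning update.
            nxt, reward, done = step(state, action)
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


if __name__ == "__main__":
    for s, row in enumerate(train()):
        print(s, [round(v, 3) for v in row])
```

In this sketch each visited state receives two kinds of update: one toward the search values computed with the model, and one toward the one-step target from the transition actually taken. That mirrors, in a very reduced form, the combination of simulated and real experience mentioned in the abstract.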