- Patent Title: Training action selection neural networks using look-ahead search
- Application No.: US16617478
- Application Date: 2018-05-28
- Publication No.: US11449750B2
- Publication Date: 2022-09-20
- Inventor: Karen Simonyan, David Silver, Julian Schrittwieser
- Applicant: DEEPMIND TECHNOLOGIES LIMITED
- Applicant Address: GB London
- Assignee: DEEPMIND TECHNOLOGIES LIMITED
- Current Assignee: DEEPMIND TECHNOLOGIES LIMITED
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- International Application: PCT/EP2018/063869 (WO, 2018-05-28)
- International Announcement: WO2018/215665 (WO, 2018-11-29)
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N7/00

Abstract:
Methods, systems and apparatus, including computer programs encoded on computer storage media, for training an action selection neural network. One of the methods includes receiving an observation characterizing a current state of the environment; determining a target network output for the observation by performing a look ahead search of possible future states of the environment starting from the current state until the environment reaches a possible future state that satisfies one or more termination criteria, wherein the look ahead search is guided by the neural network in accordance with current values of the network parameters; selecting an action to be performed by the agent in response to the observation using the target network output generated by performing the look ahead search; and storing, in an exploration history data store, the target network output in association with the observation for use in updating the current values of the network parameters.
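The steps named in the abstract (receive an observation, run a network-guided look-ahead search to a terminating state, select an action from the search output, store the output with the observation) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy counting environment, the `network_policy` stand-in for the neural network, the rollout-based search, and the choice to form the target output from mean simulated returns per first action.

```python
import random
from collections import defaultdict

TARGET = 5          # hypothetical toy goal: land exactly on this total
ACTIONS = (1, 2)    # hypothetical action set: add 1 or add 2 to the state


def is_terminal(state):
    """Termination criterion for the toy environment (assumed)."""
    return state >= TARGET


def network_policy(state, params):
    """Stand-in for the action selection network: action probabilities
    computed from the current parameter values (a plain dict here)."""
    weights = [params.get((state, a), 1.0) for a in ACTIONS]
    total = sum(weights)
    return [w / total for w in weights]


def look_ahead_search(state, params, rng, num_simulations=200):
    """Simulate possible futures from `state` until a terminal state,
    sampling each simulated action from the network's current policy."""
    visits = defaultdict(int)
    returns = defaultdict(float)
    for _ in range(num_simulations):
        s, first_action = state, None
        while not is_terminal(s):
            a = rng.choices(ACTIONS, weights=network_policy(s, params))[0]
            if first_action is None:
                first_action = a
            s += a
        visits[first_action] += 1
        returns[first_action] += 1.0 if s == TARGET else 0.0
    # Assumed target output: mean return per first action, normalized.
    means = [returns[a] / visits[a] if visits[a] else 0.0 for a in ACTIONS]
    total = sum(means)
    if total == 0.0:
        return [1.0 / len(ACTIONS)] * len(ACTIONS)
    return [m / total for m in means]


def act_and_record(observation, params, history, rng):
    """Search, pick the action the target output favors, and store the
    (observation, target) pair for later parameter updates."""
    target = look_ahead_search(observation, params, rng)
    best = max(range(len(ACTIONS)), key=lambda i: target[i])
    history.append((observation, target))
    return ACTIONS[best]
```

In this sketch `history` plays the role of the exploration history data store: a training step, not shown, would later read the stored pairs and adjust `params` so that `network_policy` moves toward the stored targets.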