Distributed training of reinforcement learning systems

Invention Grant

US10445641B2 Distributed training of reinforcement learning systems (In force)

Patent Title: Distributed training of reinforcement learning systems
Application No.: US15016173

Application Date: 2016-02-04
Publication No.: US10445641B2

Publication Date: 2019-10-15
Inventor: Praveen Deepak Srinivasan, Rory Fearon, Cagdas Alcicek, Arun Sarath Nair, Samuel Blackwell, Vedavyas Panneershelvam, Alessandro De Maria, Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Mustafa Suleyman
Applicant: DeepMind Technologies Limited
Applicant Address: GB London
Assignee: DeepMind Technologies Limited
Current Assignee: DeepMind Technologies Limited
Current Assignee Address: GB London
Agency: Fish & Richardson P.C.
Main IPC: G06N3/08
IPC: G06N3/08; G06N3/04

Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributed training of reinforcement learning systems. One of the methods includes receiving, by a learner, current values of the parameters of the Q network from a parameter server, wherein each learner maintains a respective learner Q network replica and a respective target Q network replica; updating, by the learner, the parameters of the learner Q network replica maintained by the learner using the current values; selecting, by the learner, an experience tuple from a respective replay memory; computing, by the learner, a gradient from the experience tuple using the learner Q network replica maintained by the learner and the target Q network replica maintained by the learner; and providing, by the learner, the computed gradient to the parameter server.
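
The learner steps listed in the abstract can be illustrated with a minimal single-process sketch. Everything below is an assumption made for illustration and is not taken from the patent: the class and method names (`ParameterServer`, `Learner`, `get_parameters`, `apply_gradient`, `sync_target`), the linear Q function, the squared temporal-difference loss, the plain gradient-descent update on the server, and the discount factor. The abstract does not say when the target Q network replica is refreshed, so `sync_target` is left as a separate, hypothetical call.

```python
import random


def q_values(weights, state):
    """Linear Q function (an assumption): one weight row per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


class ParameterServer:
    """Hypothetical in-process stand-in for the parameter server."""

    def __init__(self, n_features, n_actions, lr=0.1):
        self.weights = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def get_parameters(self):
        # Return a copy so learners cannot mutate the server state directly.
        return [row[:] for row in self.weights]

    def apply_gradient(self, grad):
        # Plain gradient descent; the update rule is an assumption.
        for a, row in enumerate(grad):
            for i, g in enumerate(row):
                self.weights[a][i] -= self.lr * g


class Learner:
    """Holds a learner Q network replica and a target Q network replica."""

    def __init__(self, server, replay_memory, gamma=0.9):
        self.server = server
        self.replay_memory = replay_memory  # list of (s, a, r, s') tuples
        self.gamma = gamma
        self.q_weights = server.get_parameters()
        self.target_weights = server.get_parameters()

    def sync_target(self):
        # Hypothetical: refresh the target replica from the server.
        self.target_weights = self.server.get_parameters()

    def step(self):
        # Receive current parameter values and update the learner replica.
        self.q_weights = self.server.get_parameters()
        # Select an experience tuple from the replay memory.
        state, action, reward, next_state = random.choice(self.replay_memory)
        # Compute the gradient of 0.5 * (target - Q(s, a))^2 with respect to
        # the learner weights, using the target replica for the bootstrap.
        target = reward + self.gamma * max(
            q_values(self.target_weights, next_state)
        )
        td_error = target - q_values(self.q_weights, state)[action]
        grad = [[0.0] * len(state) for _ in self.q_weights]
        grad[action] = [-td_error * s for s in state]
        # Provide the computed gradient to the parameter server.
        self.server.apply_gradient(grad)
        return td_error


server = ParameterServer(n_features=2, n_actions=2, lr=0.5)
memory = [([1.0, 0.0], 0, 1.0, [0.0, 1.0])]
learner = Learner(server, memory)
first_error = learner.step()
second_error = learner.step()
```

In this toy run the replay memory holds a single tuple, so the temporal-difference error shrinks from one step to the next as the server's weights move toward the target. A distributed setup would run several such learners against one server, which this sketch does not attempt to model.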

Public/Granted literature

US20160232445A1 DISTRIBUTED TRAINING OF REINFORCEMENT LEARNING SYSTEMS Public/Granted day: 2016-08-11

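IPC classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.Using neural network models
G06N3/08	..Learning methods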