-
Publication number: US20240330701A1
Publication date: 2024-10-03
Application number: US18577484
Application date: 2022-07-27
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki
IPC: G06N3/092 , G06N3/0985
CPC classification number: G06N3/092 , G06N3/0985
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an agent neural network for use in controlling an agent to perform a plurality of tasks. One of the methods includes maintaining population data specifying a population of one or more candidate agent neural networks; and training each candidate agent neural network on a respective set of one or more tasks to update the parameter values of the parameters of the candidate agent neural networks in the population data, the training comprising, for each candidate agent neural network: obtaining data identifying a candidate task; obtaining data specifying a control policy for the candidate task; determining whether to train the candidate agent neural network on the candidate task; and in response to determining to train the candidate agent neural network on the candidate task, training the candidate agent neural network on the candidate task.
-
Publication number: US12020164B2
Publication date: 2024-06-25
Application number: US17048023
Application date: 2019-04-18
Applicant: DEEPMIND TECHNOLOGIES LIMITED
Inventor: Jonathan Schwarz , Razvan Pascanu , Raia Thais Hadsell , Wojciech Czarnecki , Yee Whye Teh , Jelena Luketina
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scalable continual learning using neural networks. One of the methods includes receiving new training data for a new machine learning task; training an active subnetwork on the new training data to determine trained values of the active network parameters from initial values of the active network parameters while holding current values of the knowledge parameters fixed; and training a knowledge subnetwork on the new training data to determine updated values of the knowledge parameters from the current values of the knowledge parameters by training the knowledge subnetwork to generate knowledge outputs for the new training inputs that match active outputs generated by the trained active subnetwork for the new training inputs.
-
Publication number: US11842281B2
Publication date: 2023-12-12
Application number: US17183618
Application date: 2021-02-24
Applicant: DeepMind Technologies Limited
Inventor: Volodymyr Mnih , Wojciech Czarnecki , Maxwell Elliot Jaderberg , Tom Schaul , David Silver , Koray Kavukcuoglu
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. The method includes: training an action selection policy neural network, and during the training of the action selection neural network, training one or more auxiliary control neural networks and a reward prediction neural network. Each of the auxiliary control neural networks is configured to receive a respective intermediate output generated by the action selection policy neural network and generate a policy output for a corresponding auxiliary control task. The reward prediction neural network is configured to receive one or more intermediate outputs generated by the action selection policy neural network and generate a corresponding predicted reward. Training each of the auxiliary control neural networks and the reward prediction neural network comprises adjusting values of the respective auxiliary control parameters, reward prediction parameters, and the action selection policy network parameters.
-
Publication number: US20230281445A1
Publication date: 2023-09-07
Application number: US18120715
Application date: 2023-03-13
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki , Timothy Frederick Goldie Green , Valentin Clement Dalibard
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes: training a neural network having a plurality of network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having a plurality of hyperparameters, the method comprising: maintaining a plurality of candidate neural networks and, for each of the candidate neural networks, data specifying: (i) respective values of the network parameters for the candidate neural network, (ii) respective values of the hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and for each of the plurality of candidate neural networks, repeatedly performing additional training operations.
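The population scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say what the "additional training operations" are, so the copy-and-perturb step in `exploit_and_explore` is an assumption, as are the toy objective, the `Candidate` class, and every other name and constant.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    params: float                   # stand-in for the network parameters
    lr: float                       # stand-in for a single hyperparameter
    quality: float = float("-inf")  # quality measure on the task


def train_step(c: Candidate) -> None:
    """One gradient step on a toy objective, minimising (params - 3)^2."""
    grad = 2.0 * (c.params - 3.0)
    c.params -= c.lr * grad
    c.quality = -(c.params - 3.0) ** 2


def exploit_and_explore(c: Candidate, population, rng) -> None:
    """ASSUMED operation: copy the best candidate, then perturb its lr."""
    best = max(population, key=lambda m: m.quality)
    if best is not c and best.quality > c.quality:
        c.params = best.params
        # Perturb the copied hyperparameter; the 0.4 cap keeps the toy stable.
        c.lr = min(0.4, best.lr * rng.choice([0.8, 1.2]))
        c.quality = best.quality


def run_pbt(pop_size=4, rounds=20, steps_per_round=5, seed=0) -> Candidate:
    rng = random.Random(seed)
    population = [
        Candidate(params=rng.uniform(-5, 5), lr=rng.choice([0.001, 0.01, 0.1]))
        for _ in range(pop_size)
    ]
    for _ in range(rounds):
        for c in population:
            for _ in range(steps_per_round):
                train_step(c)
        for c in population:
            exploit_and_explore(c, population, rng)
    return max(population, key=lambda m: m.quality)


if __name__ == "__main__":
    print(run_pbt())
```

Each candidate carries the three items the abstract lists: parameter values, hyperparameter values, and a quality measure. The greedy copy rule is chosen for brevity only.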
-
Publication number: US11715009B2
Publication date: 2023-08-01
Application number: US16303595
Application date: 2017-05-19
Applicant: DEEPMIND TECHNOLOGIES LIMITED
Inventor: Oriol Vinyals , Alexander Benjamin Graves , Wojciech Czarnecki , Koray Kavukcuoglu , Simon Osindero , Maxwell Elliot Jaderberg
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network including a first subnetwork followed by a second subnetwork on training inputs by optimizing an objective function. In one aspect, a method includes processing a training input using the neural network to generate a training model output, including processing a subnetwork input for the training input using the first subnetwork to generate a subnetwork activation for the training input in accordance with current values of parameters of the first subnetwork, and providing the subnetwork activation as input to the second subnetwork; determining a synthetic gradient of the objective function for the first subnetwork by processing the subnetwork activation using a synthetic gradient model in accordance with current values of parameters of the synthetic gradient model; and updating the current values of the parameters of the first subnetwork using the synthetic gradient.
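The update described in this abstract can be illustrated with a small NumPy sketch. It assumes a linear first subnetwork `h = W1 @ x` and a linear synthetic gradient model `g_hat = M @ h`; these forms, the function name, and the learning rate are illustrative assumptions rather than anything the abstract specifies. How the synthetic gradient model itself is trained is outside the abstract and is not shown.

```python
import numpy as np


def synthetic_gradient_step(W1, M, x, lr=0.1):
    """Update the first subnetwork using a synthetic gradient.

    W1 : parameters of the (assumed linear) first subnetwork
    M  : parameters of the (assumed linear) synthetic gradient model
    x  : subnetwork input for one training input
    """
    h = W1 @ x      # subnetwork activation, also passed to the second subnetwork
    g_hat = M @ h   # synthetic estimate of d(objective)/dh
    # Chain rule for a linear layer: d(objective)/dW1 = outer(g_hat, x).
    W1_new = W1 - lr * np.outer(g_hat, x)
    return h, g_hat, W1_new


if __name__ == "__main__":
    W1 = np.eye(2)
    M = np.eye(2)
    x = np.array([1.0, 2.0])
    h, g_hat, W1_new = synthetic_gradient_step(W1, M, x)
    print(W1_new)
```

The first subnetwork is updated without waiting for a backward pass through the second subnetwork, because the gradient with respect to the activation comes from the synthetic gradient model.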
-
Publication number: US20210004676A1
Publication date: 2021-01-07
Application number: US16766631
Application date: 2018-11-22
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki , Timothy Frederick Goldie Green , Valentin Clement Dalibard
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes: training a neural network having a plurality of network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having a plurality of hyperparameters, the method comprising: maintaining a plurality of candidate neural networks and, for each of the candidate neural networks, data specifying: (i) respective values of the network parameters for the candidate neural network, (ii) respective values of the hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and for each of the plurality of candidate neural networks, repeatedly performing additional training operations.
-
Publication number: US20240346310A1
Publication date: 2024-10-17
Application number: US18612917
Application date: 2024-03-21
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki , Timothy Frederick Goldie Green , Valentin Clement Dalibard
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes: training a neural network having a plurality of network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having a plurality of hyperparameters, the method comprising: maintaining a plurality of candidate neural networks and, for each of the candidate neural networks, data specifying: (i) respective values of the network parameters for the candidate neural network, (ii) respective values of the hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and for each of the plurality of candidate neural networks, repeatedly performing additional training operations.
-
Publication number: US11941527B2
Publication date: 2024-03-26
Application number: US18120715
Application date: 2023-03-13
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki , Timothy Frederick Goldie Green , Valentin Clement Dalibard
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes: training a neural network having a plurality of network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having a plurality of hyperparameters, the method comprising: maintaining a plurality of candidate neural networks and, for each of the candidate neural networks, data specifying: (i) respective values of the network parameters for the candidate neural network, (ii) respective values of the hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and for each of the plurality of candidate neural networks, repeatedly performing additional training operations.
-
Publication number: US11604985B2
Publication date: 2023-03-14
Application number: US16766631
Application date: 2018-11-22
Applicant: DeepMind Technologies Limited
Inventor: Maxwell Elliot Jaderberg , Wojciech Czarnecki , Timothy Frederick Goldie Green , Valentin Clement Dalibard
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. A method includes: training a neural network having multiple network parameters to perform a particular neural network task and to determine trained values of the network parameters using an iterative training process having multiple hyperparameters, the method includes: maintaining multiple candidate neural networks and, for each of the multiple candidate neural networks, data specifying: (i) respective values of network parameters for the candidate neural network, (ii) respective values of hyperparameters for the candidate neural network, and (iii) a quality measure that measures a performance of the candidate neural network on the particular neural network task; and for each of the multiple candidate neural networks, repeatedly performing additional training operations.
-
Publication number: US20200090048A1
Publication date: 2020-03-19
Application number: US16689020
Application date: 2019-11-19
Applicant: DeepMind Technologies Limited
Inventor: Razvan Pascanu , Raia Thais Hadsell , Victor Constant Bapst , Wojciech Czarnecki , James Kirkpatrick , Yee Whye Teh , Nicolas Manfred Otto Heess
Abstract: A method is proposed for training a multitask computer system, such as a multitask neural network system. The system comprises a set of trainable workers and a shared module. The trainable workers and shared module are trained on a plurality of different tasks, such that each worker learns to perform a corresponding one of the tasks according to a respective task policy, and the shared module learns a multitask policy which represents common behavior for the tasks. The coordinated training is performed by optimizing an objective function comprising, for each task: a reward term indicative of an expected reward earned by a worker in performing the corresponding task according to the task policy; and at least one entropy term which regularizes the distribution of the task policy towards the distribution of the multitask policy.
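One possible form of such an objective is written out below as an illustration. The abstract gives no formula, so all symbols are assumptions: $\pi_i$ is the task policy for task $i$, $\pi_0$ the multitask policy, $R_i$ the reward for task $i$, $\gamma$ a discount factor, and $c > 0$ a regularization weight. The regularizer is shown here as a KL divergence.

```latex
J(\pi_0, \{\pi_i\}) = \sum_{i} \mathbb{E}_{\pi_i}\!\left[ \sum_{t \ge 0} \gamma^{t} R_i(s_t, a_t) \right]
  - c \sum_{i} \mathbb{E}_{\pi_i}\!\left[ \sum_{t \ge 0} \gamma^{t}\,
    \mathrm{KL}\!\left( \pi_i(\cdot \mid s_t) \,\|\, \pi_0(\cdot \mid s_t) \right) \right]
```

The first sum is the per-task reward term. The second pulls each task policy towards the multitask policy, and maximizing $J$ over $\pi_0$ in turn moves the multitask policy towards behavior common to the tasks.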