-
Publication number: WO2022175337A1
Publication date: 2022-08-25
Application number: PCT/EP2022/053834
Filing date: 2022-02-16
Applicant: DEEPMIND TECHNOLOGIES LIMITED
Inventor: RAVURI, Suman; LENC, Karel; MIROWSKI, Piotr Wojciech; LAM, Remi Roger Alain Paul; WILLSON, Matthew James; BROCK, Andrew
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for precipitation nowcasting using generative neural networks. One of the methods includes obtaining a context temporal sequence of a plurality of context radar fields characterizing a real-world location, each context radar field characterizing the weather in the real-world location at a corresponding preceding time point; sampling a set of one or more latent inputs by sampling values from a specified distribution; and for each sampled latent input, processing the context temporal sequence of radar fields and the sampled latent input using a generative neural network that has been configured through training to process the temporal sequence of radar fields to generate as output a predicted temporal sequence comprising a plurality of predicted radar fields, each predicted radar field in the predicted temporal sequence characterizing the predicted weather in the real-world location at a corresponding future time point.
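Illustrative sketch (not part of the published record): the sampling loop described in the abstract can be outlined roughly as follows. The function names, the standard normal latent distribution, and `toy_generator` are assumptions made for illustration only; `toy_generator` is a hypothetical stand-in that extrapolates the last observed change and is not the trained generative neural network of the application.

```python
import random


def sample_latent(rng, size):
    """Draw one latent input; a standard normal is assumed here."""
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def toy_generator(context, latent, num_future):
    """Hypothetical stand-in for the trained generative network.

    Repeats the last per-cell change of the context sequence and perturbs
    it with the latent input, clamping intensities at zero.
    """
    last, prev = context[-1], context[-2]
    predicted = []
    current = last
    for _ in range(num_future):
        next_field = []
        for i, row in enumerate(current):
            new_row = []
            for j, value in enumerate(row):
                trend = last[i][j] - prev[i][j]
                noise = 0.1 * latent[(i * len(row) + j) % len(latent)]
                new_row.append(max(0.0, value + trend + noise))
            next_field.append(new_row)
        predicted.append(next_field)
        current = next_field
    return predicted


def nowcast_ensemble(context, num_samples, num_future, latent_size=4, seed=0):
    """Return one predicted sequence of radar fields per sampled latent."""
    rng = random.Random(seed)
    return [
        toy_generator(context, sample_latent(rng, latent_size), num_future)
        for _ in range(num_samples)
    ]


# Two 2x2 context radar fields at preceding time points (made-up values).
context = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
ensemble = nowcast_ensemble(context, num_samples=3, num_future=4)
```

Each element of `ensemble` is one predicted temporal sequence. Because every sequence is conditioned on a different sampled latent input, the set can be read as a rough ensemble of plausible futures for the same context.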
-
Publication number: WO2018083672A1
Publication date: 2018-05-11
Application number: PCT/IB2017/056907
Filing date: 2017-11-04
Applicant: DEEPMIND TECHNOLOGIES LIMITED
Inventor: VIOLA, Fabio; MIROWSKI, Piotr Wojciech; BANINO, Andrea; PASCANU, Razvan; SOYER, Hubert Josef; BALLARD, Andrew James; KUMARAN, Sudarshan; HADSELL, Raia Thais; SIFRE, Laurent; GOROSHIN, Rostislav; KAVUKCUOGLU, Koray; DENIL, Misha Man Ray
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a reinforcement learning system. In one aspect, a method of training an action selection policy neural network for use in selecting actions to be performed by an agent navigating through an environment to accomplish one or more goals comprises: receiving an observation image characterizing a current state of the environment; processing, using the action selection policy neural network, an input comprising the observation image to generate an action selection output; processing, using a geometry-prediction neural network, an intermediate output generated by the action selection policy neural network to predict a value of a feature of a geometry of the environment when in the current state; and backpropagating a gradient of a geometry-based auxiliary loss into the action selection policy neural network to determine a geometry-based auxiliary update for current values of the network parameters.
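Illustrative sketch (not part of the published record): the geometry-based auxiliary update described in the abstract can be outlined with a deliberately tiny model. Everything here is an assumption for illustration: the shared parameters are a single weight vector, the "intermediate output" is a scalar, the geometry feature is taken to be a depth value, the auxiliary loss is a squared error, and the names `forward` and `geometry_aux_update` are hypothetical.

```python
def forward(x, w_shared):
    """Intermediate output of the (toy) policy network: a weighted sum."""
    return sum(wi * xi for wi, xi in zip(w_shared, x))


def geometry_aux_update(x, w_shared, g, depth_target, lr=0.1):
    """Backpropagate a squared-error geometry loss into the shared weights.

    g is the weight of a one-parameter geometry-prediction head (assumed).
    Returns the auxiliary loss and the update for each shared weight.
    """
    h = forward(x, w_shared)
    depth_pred = g * h
    loss = (depth_pred - depth_target) ** 2
    # Chain rule: d(loss)/d(h), then d(h)/d(w_i) = x_i.
    dloss_dh = 2.0 * (depth_pred - depth_target) * g
    update = [-lr * dloss_dh * xi for xi in x]
    return loss, update


# Made-up observation features and parameters.
x = [1.0, 2.0]
w = [0.5, 0.5]
loss_before, update = geometry_aux_update(x, w, g=1.0, depth_target=1.0)
w_new = [wi + ui for wi, ui in zip(w, update)]
loss_after, _ = geometry_aux_update(x, w_new, g=1.0, depth_target=1.0)
```

In a full system this auxiliary update would be combined with the update from the main reinforcement learning objective; the sketch only shows that the gradient flows from the geometry head back into parameters the policy network shares.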
-