Publication No.: US20210383222A1
Publication Date: 2021-12-09
Application No.: US17337820
Filing Date: 2021-06-03
Applicant: DeepMind Technologies Limited
Inventor: David William Saxton, Eshaan Nichani
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network by estimating the objective function curvature based on current and previous gradients. In one aspect, a method comprises: sampling a batch of training data; and for each neural network parameter: determining, based on the current batch of training data, a respective current gradient of the objective function at the current iteration with respect to the current neural network parameter; estimating an objective function curvature with respect to the current neural network parameter based on (i) the current gradient of the objective function at the current iteration, and (ii) a respective previous gradient of the objective function at each of a plurality of previous iterations; and updating a current value of the neural network parameter based on the estimate of the curvature of the objective function.
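The per-parameter loop described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the curvature is estimated or how the estimate enters the update, so everything specific here is an assumption made for illustration: the least-squares slope of gradient against parameter value over a short history window, the Newton-like step `g / |h|`, the plain gradient-step fallback, and the names `curvature_estimate`, `train`, `history`, `lr`, and `floor`. The callable `grad_fn` is a hypothetical stand-in for the gradient computed on a sampled batch, and the sketch handles a single scalar parameter.

```python
from collections import deque


def curvature_estimate(thetas, grads, min_spread=1e-16):
    """Assumed estimator: least-squares slope of gradient vs. parameter value.

    Uses the current and previous (parameter, gradient) pairs for one scalar
    parameter. Returns None when the parameter values are too close together
    to carry curvature information.
    """
    n = len(thetas)
    t_mean = sum(thetas) / n
    g_mean = sum(grads) / n
    spread = sum((t - t_mean) ** 2 for t in thetas)
    if spread < min_spread:
        return None
    cov = sum((t - t_mean) * (g - g_mean) for t, g in zip(thetas, grads))
    return cov / spread


def train(grad_fn, theta0, steps=20, history=4, lr=0.1, floor=1e-3):
    """Illustrative training loop for a single scalar parameter."""
    thetas = deque(maxlen=history)  # current + previous parameter values
    grads = deque(maxlen=history)   # current + previous gradients
    theta = theta0
    for _ in range(steps):
        g = grad_fn(theta)  # stand-in for the gradient on a sampled batch
        thetas.append(theta)
        grads.append(g)
        h = None
        if len(thetas) >= 2:
            h = curvature_estimate(list(thetas), list(grads))
        if h is None:
            # No usable curvature estimate yet: plain gradient step.
            theta = theta - lr * g
        else:
            # Assumed update rule: scale the step by the inverse curvature
            # magnitude, floored to avoid dividing by a near-zero estimate.
            theta = theta - g / max(abs(h), floor)
    return theta


if __name__ == "__main__":
    # Toy objective 1.5 * (theta - 2)^2, whose gradient is 3 * (theta - 2).
    print(train(lambda t: 3.0 * (t - 2.0), theta0=0.0))
```

On a toy quadratic objective the slope estimate recovers the true curvature from two iterates, so the second update lands at the minimum. With noisy batch gradients the estimate would itself be noisy, which is one reason a practical method might smooth it over more than two previous iterations.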