-
Publication number: US20210295201A1
Publication date: 2021-09-23
Application number: US16821509
Filing date: 2020-03-17
Applicant: Google LLC
Inventors: Seungyeon Kim, Jingzhao Zhang, Andreas Veit, Sanjiv Kumar, Sashank Reddi, Praneeth Karimireddy
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive per-coordinate clipping threshold to clip a current first moment of the coordinate to obtain a current update value that enables faster convergence for the machine-learned model when the noise in the stochastic gradients is heavy-tailed.
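As an illustration of the clipping step described in the abstract, the following is a minimal sketch of an adaptive per-coordinate clipping update, assuming the first moment is an exponential moving average of the stochastic gradient and the per-coordinate threshold is an exponential moving average of gradient magnitudes; the names used here (clipped_adaptive_step, beta1, beta2, tau) are illustrative and are not drawn from the patent claims.

import numpy as np

def clipped_adaptive_step(params, grad, state, lr=1e-3, beta1=0.9, beta2=0.99, eps=1e-8):
    # Illustrative sketch only, not the patented implementation.
    m, tau = state
    # Current first moment: exponential moving average of the stochastic gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    # Adaptive per-coordinate clipping threshold: EMA of gradient magnitudes.
    tau = beta2 * tau + (1.0 - beta2) * np.abs(grad)
    # Clip each coordinate of the first moment to its threshold to form the update value.
    update = m * np.minimum(1.0, tau / (np.abs(m) + eps))
    new_params = params - lr * update
    return new_params, (m, tau)

# Example usage with zero-initialized optimizer state:
# state = (np.zeros_like(params), np.zeros_like(params))
# params, state = clipped_adaptive_step(params, grad, state)

Coordinate-wise clipping of this kind bounds the influence of any single noisy gradient component, which is the property the abstract associates with faster convergence under heavy-tailed gradient noise.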
-
Publication number: US12001509B2
Publication date: 2024-06-04
Application number: US16821509
Filing date: 2020-03-17
Applicant: Google LLC
Inventors: Seungyeon Kim, Jingzhao Zhang, Andreas Veit, Sanjiv Kumar, Sashank Reddi, Praneeth Karimireddy
CPC classification: G06F17/18, G06F18/217, G06N20/00, G06N3/084
Abstract: Generally, the present disclosure is directed to systems and methods that perform adaptive optimization with improved convergence properties. The adaptive optimization techniques described herein are useful in various optimization scenarios, including, for example, training a machine-learned model such as a neural network. In particular, according to one aspect of the present disclosure, a system implementing the adaptive optimization technique can, over a plurality of iterations, employ an adaptive per-coordinate clipping threshold to clip a current first moment of the coordinate to obtain a current update value that enables faster convergence for the machine-learned model when the noise in the stochastic gradients is heavy-tailed.
-