Publication No.: US20210089887A1
Publication Date: 2021-03-25
Application No.: US16832934
Filing Date: 2020-03-27
Applicant: Apple Inc.
Inventor: Tyler B. Johnson, Carlos E. Guestrin, Pulkit Agrawal, Haijie Gu
IPC: G06N3/08
Abstract: A method includes determining a training scale for training a machine-learning model, defining a group of worker nodes having a number of worker nodes that is selected according to the training scale, and determining an average gradient of a loss function during a training iteration using the group of worker nodes. The method also includes determining a variance value for the average gradient of the loss function, determining a gain ratio based on the variance value for the average gradient of the loss function, and determining a learning rate parameter based on a learning rate schedule and the gain ratio. The method also includes determining updated parameters for the machine-learning model using the learning rate parameter and the average gradient of the loss function.
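The steps in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption: the gain ratio is taken to be (S * v + m) / (v + m), where S is the training scale (number of worker nodes), v is the estimated variance of the average gradient, and m is an estimate of the squared norm of the true gradient. The function names (`mean_vector`, `gain_ratio`, `training_step`) and the `progress` counter that indexes the learning rate schedule are hypothetical, not taken from the application.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Coordinate-wise mean of per-worker gradients (the average gradient)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def gain_ratio(worker_grads: Sequence[Sequence[float]]) -> float:
    """Assumed gain ratio (S * v + m) / (v + m); not a formula from the abstract."""
    scale = len(worker_grads)  # training scale S = number of worker nodes
    if scale < 2:
        return 1.0  # variance cannot be estimated from a single worker
    avg = mean_vector(worker_grads)
    # Sample variance of one worker's gradient, summed over coordinates.
    per_worker_var = sum(
        sum((g_i - a_i) ** 2 for g_i, a_i in zip(grad, avg))
        for grad in worker_grads
    ) / (scale - 1)
    # Variance of the average gradient across S workers.
    var_of_avg = per_worker_var / scale
    # Estimate of the squared norm of the true gradient, clipped at zero.
    mean_sq = max(sum(a * a for a in avg) - var_of_avg, 0.0)
    denom = var_of_avg + mean_sq
    if denom == 0.0:
        return 1.0  # all gradients are zero; no gain to apply
    return (scale * var_of_avg + mean_sq) / denom


def training_step(
    params: Sequence[float],
    worker_grads: Sequence[Sequence[float]],
    lr_schedule: Callable[[float], float],
    progress: float,
) -> Tuple[Vector, float]:
    """One update: learning rate = gain ratio * scheduled rate (an assumption)."""
    avg = mean_vector(worker_grads)
    gain = gain_ratio(worker_grads)
    lr = gain * lr_schedule(progress)
    new_params = [p - lr * g for p, g in zip(params, avg)]
    # Assumption: the schedule advances by the gain rather than by one step.
    return new_params, progress + gain
```

Under this assumed formula the gain ratio lies between 1 and S. It is near 1 when the workers agree, so extra workers add little, and it approaches S when the per-worker gradients are dominated by noise, so averaging over S workers helps the most.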