-
1.
Publication No.: US20240386318A1
Publication Date: 2024-11-21
Application No.: US18386431
Filing Date: 2023-11-02
Applicant: GOOGLE LLC
Inventor: Yuxin Ding , Lillian Zhou , Mingqing Chen , Rajiv Mathews , Andrew Hard , Sean Augenstein
IPC: G06N20/00
Abstract: Implementations described herein are directed to techniques for mitigating and/or eliminating catastrophic forgetting of a global machine learning (ML) model during decentralized learning thereof. Remote processor(s) of a remote system can initially train a global ML model based on server data that is accessible by the remote system. In subsequent decentralized learning of the global ML model, the remote processor(s) can utilize various checkpoint averaging techniques. As described herein, these checkpoint averaging techniques can include, but are not limited to, a static checkpoint averaging technique, a dynamic checkpoint averaging technique, and/or a mixed centralized and decentralized training technique.
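The static and dynamic checkpoint averaging techniques can be pictured as weight-space interpolation between the server-trained checkpoint and the checkpoint produced by decentralized learning. Below is a minimal sketch of that idea in Python; the function names, the fixed coefficient, and the linear annealing schedule are illustrative assumptions, not the patented method.

```python
# Minimal sketch of checkpoint averaging; names and the annealing
# schedule are illustrative assumptions, not the patented method.
import numpy as np


def static_checkpoint_average(server_ckpt, federated_ckpt, alpha=0.5):
    """Blend the server-trained and decentrally trained checkpoints with a
    fixed coefficient so server knowledge is never fully overwritten."""
    return {k: alpha * server_ckpt[k] + (1.0 - alpha) * federated_ckpt[k]
            for k in server_ckpt}


def dynamic_checkpoint_average(server_ckpt, federated_ckpt, round_idx, total_rounds):
    """Anneal the coefficient across rounds: early rounds lean on the server
    checkpoint, later rounds trust the decentrally trained checkpoint more."""
    alpha = 1.0 - (round_idx / total_rounds)  # one possible schedule
    return static_checkpoint_average(server_ckpt, federated_ckpt, alpha)


# Toy usage with a single two-element weight tensor.
server = {"w": np.array([1.0, 1.0])}
federated = {"w": np.array([0.0, 2.0])}
print(static_checkpoint_average(server, federated)["w"])          # [0.5 1.5]
print(dynamic_checkpoint_average(server, federated, 9, 10)["w"])  # [0.1 1.9]
```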
-
2.
Publication No.: US20230359907A1
Publication Date: 2023-11-09
Application No.: US17848947
Filing Date: 2022-07-01
Applicant: GOOGLE LLC
Inventor: Sean Augenstein , Andrew Hard , Kurt Partridge , Rajiv Mathews , Lin Ning , Karan Singhal
IPC: G06N5/02
CPC classification number: G06N5/022
Abstract: Implementations disclosed herein are directed to various techniques for mitigating and/or preventing catastrophic forgetting in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that is initially trained at a remote server based on a server data set, determine server-based data for global weight(s) of the global ML model, and transmit the global ML model and the server-based data to a plurality of client devices. The server-based data may include, for example, EWC loss term(s), client augmenting gradients, server augmenting gradients, and/or other server-based data. Further, each of the plurality of client devices may generate a corresponding client gradient, based on processing corresponding predicted output using the global ML model and based on the server-based data, and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
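As an illustration of how server-based data can shape client updates, the sketch below shows a client combining its task gradient with an EWC-style penalty gradient built from server-provided Fisher information, and the server averaging the resulting client gradients. The function names and the FedSGD-style aggregation are assumptions for illustration, not the claimed method.

```python
# Hypothetical sketch: client gradients regularized by server-based data
# (here, an EWC-style Fisher term), then averaged at the server.

def client_gradient(task_grad, client_w, global_w, fisher, lam=0.1):
    """Client task gradient plus an EWC-style penalty gradient that pulls
    the client's weights back toward the server-trained global weights."""
    return {k: task_grad[k] + lam * fisher[k] * (client_w[k] - global_w[k])
            for k in task_grad}


def aggregate_client_gradients(global_w, client_grads, lr=0.01):
    """FedSGD-style server step: average the client gradients and apply a
    single gradient-descent update to the global weights."""
    n = len(client_grads)
    return {k: global_w[k] - lr * sum(g[k] for g in client_grads) / n
            for k in global_w}
```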
-
3.
Publication No.: US20230351246A1
Publication Date: 2023-11-02
Application No.: US17734766
Filing Date: 2022-05-02
Applicant: GOOGLE LLC
Inventor: Andrew Hard , Kurt Partridge , Rajiv Mathews , Sean Augenstein
Abstract: Implementations disclosed herein are directed to utilizing elastic weight consolidation (EWC) loss term(s) in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that is initially trained at a remote server based on a server data set, determine the EWC loss term(s) for global weight(s) of the global ML model, and transmit the global ML model and the EWC loss term(s) to a plurality of client devices. The EWC loss term(s) may be determined based on a Fisher information matrix for the server data set. Further, each of the plurality of client devices may generate a corresponding client gradient, based on processing corresponding predicted output using the global ML model and based on the EWC loss term(s), and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
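The EWC loss referenced above is conventionally the quadratic penalty L = L_task + (lambda/2) * sum_i F_i * (theta_i - theta*_i)^2, where F is a diagonal Fisher information estimate computed on the server data set and theta* are the server-trained weights. Below is a minimal sketch with a squared-gradient Fisher estimate; the names and the estimator choice are assumptions for illustration.

```python
# Hypothetical sketch of a diagonal Fisher estimate and the EWC penalty.
import numpy as np


def diagonal_fisher(per_example_grads):
    """Diagonal Fisher estimate on the server data set: the mean of
    squared per-example gradients for each weight tensor."""
    return {k: np.mean(np.stack([g[k] ** 2 for g in per_example_grads]), axis=0)
            for k in per_example_grads[0]}


def ewc_loss(task_loss, weights, server_weights, fisher, lam=0.1):
    """Task loss plus the quadratic EWC penalty weighted by the Fisher
    diagonal, anchoring training to the server-trained weights."""
    penalty = sum(np.sum(fisher[k] * (weights[k] - server_weights[k]) ** 2)
                  for k in weights)
    return task_loss + 0.5 * lam * penalty
```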
-
4.
Publication No.: US20250045627A1
Publication Date: 2025-02-06
Application No.: US18365487
Filing Date: 2023-08-04
Applicant: GOOGLE LLC
Inventor: Andrew Hard , Kurt Partridge , Sean Augenstein , Rajiv Mathews
IPC: G06N20/00
Abstract: Processor(s) of a client device can receive global weights of a global ML model from a remote system, obtain a client data set, determine a Fisher information matrix for the client data set, and transmit the Fisher information matrix for the client data set to the remote system. Further, processor(s) of the remote system can determine a corresponding elastic weight consolidation (EWC) loss term for each of the global weights based on at least the Fisher information matrix, generate a server update for the global ML model based on (i) processing server data remotely at the remote system using the global ML model and (ii) the corresponding EWC loss term for each of the global weights, and update the global weights of the global ML model based on the server update.
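Here the roles are reversed relative to the preceding entries: the Fisher information is computed on client data, and the EWC term regularizes the server-side update so it does not drift away from what was learned on client data. A minimal sketch of that server step follows, with hypothetical names and a plain SGD update assumed for illustration.

```python
# Hypothetical sketch: server-side update regularized by client Fisher data.

def server_update(global_w, server_grad, client_fisher, anchor_w, lam=0.1, lr=0.01):
    """One server-side gradient step on server data, with an EWC penalty
    gradient built from the client-provided Fisher information pulling the
    global weights toward the anchor weights learned on client data."""
    return {k: global_w[k] - lr * (server_grad[k]
                                   + lam * client_fisher[k] * (global_w[k] - anchor_w[k]))
            for k in global_w}
```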
-
5.
Publication No.: US20240095582A1
Publication Date: 2024-03-21
Application No.: US18075757
Filing Date: 2022-12-06
Applicant: GOOGLE LLC
Inventor: Andrew Hard , Sean Augenstein , Rohan Anil , Rajiv Mathews , Lara McConnaughey , Ehsan Amid , Antonious Girgis
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: During a round of decentralized learning for updating a global machine learning (ML) model, remote processor(s) of a remote system may transmit, to a population of computing devices, primary weights for a primary version of the global ML model, and cause each of the computing devices to generate a corresponding update for the primary version of the global ML model. Further, the remote processor(s) may cause the primary version of the global ML model to be updated based on the corresponding updates that are received during the round of decentralized learning. However, the remote processor(s) may receive other corresponding updates subsequent to the round of decentralized learning. Accordingly, various techniques described herein (e.g., FARe-DUST, FeAST on MSG, and/or other techniques) enable the other corresponding updates to be utilized in achieving a final version of the global ML model.
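FARe-DUST and FeAST on MSG are named but not specified in this abstract. As an illustrative stand-in only, and not the patented algorithms, the sketch below folds straggler updates that arrive after a round closes into the final model with a simple moving average rather than discarding them; the name and the blending coefficient are assumptions.

```python
# Illustrative stand-in only: blending late (straggler) updates into the
# final model, rather than discarding them.

def finalize_with_stragglers(primary_w, straggler_updates, beta=0.1):
    """Blend late-arriving straggler model updates into the primary weights
    with an exponential moving average to obtain a final model version."""
    final = dict(primary_w)
    for update in straggler_updates:
        for k in final:
            final[k] = (1.0 - beta) * final[k] + beta * update[k]
    return final
```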
-