-
Publication No.: US20240221772A1
Publication date: 2024-07-04
Application No.: US18609362
Filing date: 2024-03-19
Applicant: Google LLC
Inventor: Ehsan Amid, Om Dipakbhai Thakkar, Rajiv Mathews, Francoise Beaufays
IPC: G10L21/0332, G10L15/06, G10L15/08, G10L21/10
CPC classification number: G10L21/0332, G10L15/063, G10L15/08, G10L21/10
Abstract: A method of phrase extraction for ASR models includes obtaining audio data characterizing an utterance and a corresponding ground-truth transcription of the utterance and modifying the audio data to obfuscate a particular phrase recited in the utterance. The method also includes processing, using a trained ASR model, the modified audio data to generate a predicted transcription of the utterance, and determining whether the predicted transcription includes the particular phrase by comparing the predicted transcription of the utterance to the ground-truth transcription of the utterance. When the predicted transcription includes the particular phrase, the method includes generating an output indicating that the trained ASR model leaked the particular phrase from a training data set used to train the ASR model.
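The leak check described in this abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how the audio is obfuscated, so replacing the phrase's sample range with silence is an assumption, and `mask_span`, `phrase_leaked`, and the `transcribe` callable (standing in for the trained ASR model) are hypothetical names.

```python
from typing import Callable, List, Sequence


def mask_span(samples: Sequence[float], start: int, end: int) -> List[float]:
    """Return a copy of samples with [start, end) replaced by silence.

    Assumption: silence is used as the obfuscation; the abstract only
    says the audio is modified to obfuscate the phrase.
    """
    out = list(samples)
    for i in range(max(0, start), min(len(out), end)):
        out[i] = 0.0
    return out


def _contains(tokens: List[str], phrase_tokens: List[str]) -> bool:
    """True if phrase_tokens occurs as a contiguous run inside tokens."""
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def phrase_leaked(samples: Sequence[float], start: int, end: int,
                  phrase: str, ground_truth: str,
                  transcribe: Callable[[List[float]], str]) -> bool:
    """Report whether the model outputs a phrase it could not have heard."""
    phrase_tokens = phrase.lower().split()
    if not phrase_tokens:
        raise ValueError("phrase must not be empty")
    if not _contains(ground_truth.lower().split(), phrase_tokens):
        raise ValueError("phrase must appear in the ground-truth transcription")
    # The model only sees audio in which the phrase has been masked out.
    predicted = transcribe(mask_span(samples, start, end))
    return _contains(predicted.lower().split(), phrase_tokens)
```

A model that returns the masked phrase anyway is flagged as having leaked it from its training data; a model that transcribes only what remains audible is not.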
-
Publication No.: US20240095582A1
Publication date: 2024-03-21
Application No.: US18075757
Filing date: 2022-12-06
Applicant: GOOGLE LLC
Inventor: Andrew Hard, Sean Augenstein, Rohan Anil, Rajiv Mathews, Lara McConnaughey, Ehsan Amid, Antonious Girgis
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: During a round of decentralized learning for updating of a global machine learning (ML) model, remote processor(s) of a remote system may transmit, to a population of computing devices, primary weights for a primary version of the global ML model, and cause each of the computing devices to generate a corresponding update for the primary version of the global ML model. Further, the remote processor(s) may cause the primary version of the global ML model to be updated based on the corresponding updates that are received during the round of decentralized learning. However, the remote processor(s) may receive other corresponding updates subsequent to the round of decentralized learning. Accordingly, various techniques described herein (e.g., FARe-DUST, FeAST on MSG, and/or other techniques) enable the other corresponding updates to be utilized in achieving a final version of the global ML model.
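The setting in this abstract, where some updates arrive within the round and others arrive afterwards, can be sketched as below. The sketch only separates on-time from late updates and applies the on-time ones to the primary weights. It does not implement FARe-DUST or FeAST on MSG, whose details the abstract does not give. The `Update` type, the deadline-based split, and the mean-of-deltas rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Update:
    client_id: str
    arrival_time: float   # hypothetical: seconds since the round started
    delta: List[float]    # weight delta proposed by the client


def split_by_deadline(updates: Sequence[Update],
                      deadline: float) -> Tuple[List[Update], List[Update]]:
    """Separate updates received during the round from those received later."""
    on_time = [u for u in updates if u.arrival_time <= deadline]
    late = [u for u in updates if u.arrival_time > deadline]
    return on_time, late


def apply_mean_delta(weights: Sequence[float],
                     updates: Sequence[Update]) -> List[float]:
    """Add the element-wise mean of the deltas to the weights (assumed rule)."""
    if not updates:
        return list(weights)
    n = len(updates)
    return [w + sum(u.delta[i] for u in updates) / n
            for i, w in enumerate(weights)]
```

The `late` list is what the techniques named in the abstract would go on to consume; how they do so is outside this sketch.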
-
Publication No.: US20240070530A1
Publication date: 2024-02-29
Application No.: US18074729
Filing date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Ehsan Amid, Rajiv Mathews, Rohan Anil, Shankar Kumar, Jared Lichtarge
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Implementations disclosed herein are directed to a hybrid federated learning (FL) technique that utilizes both federated averaging (FA) and federated distillation (FD) during a given round of FL of a given global machine learning (ML) model. Implementations may identify a population of client devices to participate in the given round of FL, determine a corresponding quantity of instances of client data available at each of the client devices that may be utilized during the given round of FL, and select different subsets of the client devices based on the corresponding quantity of instances of client data. Further, implementations may cause a first subset of the client devices to generate a corresponding FA update and a second subset of client devices to generate a corresponding FD update. Moreover, implementations may subsequently update the given global ML model based on the corresponding FA updates and the corresponding FD updates.
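The subset selection in this abstract can be sketched as a simple partition on the amount of client data. The abstract does not say which subset receives which update type or what criterion is used, so the rule below (clients with at least `threshold` instances produce an FA update, clients with fewer produce an FD update) is an assumption, and `assign_update_types` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def assign_update_types(client_counts: Dict[str, int],
                        threshold: int) -> Tuple[List[str], List[str]]:
    """Split clients into (FA subset, FD subset) by available data.

    Assumed rule: data-rich clients do federated averaging, data-poor
    clients do federated distillation. Clients with no data are skipped.
    """
    fa_clients = sorted(c for c, n in client_counts.items() if n >= threshold)
    fd_clients = sorted(c for c, n in client_counts.items() if 0 < n < threshold)
    return fa_clients, fd_clients
```

For example, with counts `{"a": 100, "b": 5, "c": 0, "d": 50}` and a threshold of 50, clients `a` and `d` land in the FA subset, `b` in the FD subset, and `c` sits out the round.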
-
Publication No.: US11749261B2
Publication date: 2023-09-05
Application No.: US17197954
Filing date: 2021-03-10
Applicant: Google LLC
Inventor: Françoise Beaufays, Andrew Hard, Swaroop Indra Ramaswamy, Om Dipakbhai Thakkar, Rajiv Mathews
IPC: G10L15/065, G10L13/04, G10L15/26, G10L15/30
CPC classification number: G10L15/065, G10L13/04, G10L15/26, G10L15/30
Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, and generate corresponding remote gradients based on the additional corresponding predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof. The updated global ML model(s) and/or the updated weights thereof can be transmitted back to the corresponding client devices.
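The final step of this abstract, updating the global weights from both client and remote gradients, can be sketched as a single gradient-descent step. The abstract does not specify how the two sets of gradients are combined, so pooling them with equal weight and the learning rate `lr` are assumptions, and `update_global_weights` is a hypothetical name.

```python
from typing import List, Sequence


def update_global_weights(weights: Sequence[float],
                          client_grads: Sequence[Sequence[float]],
                          remote_grads: Sequence[Sequence[float]],
                          lr: float = 0.1) -> List[float]:
    """Take one SGD step using the mean of client and remote gradients.

    Assumption: all gradients are pooled and weighted equally.
    """
    grads = list(client_grads) + list(remote_grads)
    if not grads:
        return list(weights)
    n = len(grads)
    return [w - lr * sum(g[i] for g in grads) / n
            for i, w in enumerate(weights)]
```

The returned weights are what the remote system would then send back to the client devices.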
-
Publication No.: US20230178094A1
Publication date: 2023-06-08
Application No.: US17643848
Filing date: 2021-12-13
Applicant: Google LLC
Inventor: Ehsan Amid, Om Thakkar, Rajiv Mathews, Francoise Beaufays
IPC: G10L21/0332, G10L21/10, G10L15/06, G10L15/08
CPC classification number: G10L21/0332, G10L21/10, G10L15/063, G10L15/08
Abstract: A method of phrase extraction for ASR models includes obtaining audio data characterizing an utterance and a corresponding ground-truth transcription of the utterance and modifying the audio data to obfuscate a particular phrase recited in the utterance. The method also includes processing, using a trained ASR model, the modified audio data to generate a predicted transcription of the utterance, and determining whether the predicted transcription includes the particular phrase by comparing the predicted transcription of the utterance to the ground-truth transcription of the utterance. When the predicted transcription includes the particular phrase, the method includes generating an output indicating that the trained ASR model leaked the particular phrase from a training data set used to train the ASR model.