-
1.
Publication No.: US20240070530A1
Publication Date: 2024-02-29
Application No.: US18074729
Filing Date: 2022-12-05
Applicant: GOOGLE LLC
Inventor: Ehsan Amid , Rajiv Mathews , Rohan Anil , Shankar Kumar , Jared Lichtarge
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: Implementations disclosed herein are directed to a hybrid federated learning (FL) technique that utilizes both federated averaging (FA) and federated distillation (FD) during a given round of FL of a given global machine learning (ML) model. Implementations may identify a population of client devices to participate in the given round of FL, determine a corresponding quantity of instances of client data available at each of the client devices that may be utilized during the given round of FL, and select different subsets of the client devices based on the corresponding quantity of instances of client data. Further, implementations may cause a first subset of the client devices to generate a corresponding FA update and a second subset of client devices to generate a corresponding FD update. Moreover, implementations may subsequently update the given global ML model based on the corresponding FA updates and the corresponding FD updates.
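The round described in the abstract can be sketched as follows. This is a minimal illustration, assuming a simple threshold on local data quantity to split the population: data-rich clients contribute federated-averaging (FA) weight deltas, data-poor clients contribute federated-distillation (FD) soft predictions. All names, the threshold, and the list-of-floats weight representation are illustrative assumptions, not details from the patent.

```python
# Hypothetical sketch of one hybrid FL round: FA clients send weight
# deltas, FD clients send logits on a shared proxy set.

def hybrid_fl_round(global_weights, clients, fa_threshold=100):
    # Partition the sampled population by local data quantity.
    fa_clients = [c for c in clients if c["num_examples"] >= fa_threshold]
    fd_clients = [c for c in clients if c["num_examples"] < fa_threshold]

    # FA update: example-count-weighted average of client weight deltas.
    if fa_clients:
        total = sum(c["num_examples"] for c in fa_clients)
        fa_delta = [
            sum(c["delta"][i] * c["num_examples"] for c in fa_clients) / total
            for i in range(len(global_weights))
        ]
    else:
        fa_delta = [0.0] * len(global_weights)

    # FD update: average the clients' soft predictions; a real server
    # would then run a distillation optimization step against these.
    fd_logits = None
    if fd_clients:
        n = len(fd_clients[0]["logits"])
        fd_logits = [
            sum(c["logits"][i] for c in fd_clients) / len(fd_clients)
            for i in range(n)
        ]

    # Apply the averaged FA delta to the global model.
    new_weights = [w + d for w, d in zip(global_weights, fa_delta)]
    return new_weights, fd_logits
```

The point of the split is that clients with too little data to compute a useful gradient-style update can still contribute their predictions for distillation.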
-
2.
Publication No.: US20240194192A1
Publication Date: 2024-06-13
Application No.: US18078782
Filing Date: 2022-12-09
Applicant: GOOGLE LLC
Inventor: Ehsan Amid , Rajiv Mathews , Shankar Kumar , Jared Lichtarge , Mingqing Chen , Tien-Ju Yang , Yuxin Ding
CPC classification number: G10L15/16 , G10L15/063
Abstract: Information can be distilled from a global automatic speech recognition (ASR) model to a client ASR model. Many implementations include using an RNN-T model as the ASR model, where the global ASR model includes a global encoder, a joint network, a prediction network, and where the client ASR model includes a client encoder, the joint network, and the prediction network. Various implementations include using principal component analysis (PCA) while training the global ASR model to learn a mean vector and a set of principal components corresponding to the global ASR model. Additional or alternative implementations include training the client ASR model to generate one or more predicted coefficients of the global ASR model.
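The PCA idea in the abstract can be sketched in a few lines: represent flattened global-model weight vectors as a learned mean vector plus a small set of principal components, so a client model only needs to produce the low-dimensional coefficients. The snapshot-matrix setup and function names below are assumptions for illustration, not the patent's formulation.

```python
import numpy as np

def fit_pca(weight_snapshots, k):
    # weight_snapshots: (n_snapshots, dim) array of flattened weights.
    mean = weight_snapshots.mean(axis=0)
    centered = weight_snapshots - mean
    # SVD of the centered matrix: rows of vt are principal components.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]  # mean vector, top-k components

def reconstruct(mean, components, coeffs):
    # A client that predicts `coeffs` recovers an approximate
    # global weight vector from the shared mean and components.
    return mean + coeffs @ components
```

Predicting k coefficients instead of the full weight vector is what makes the client-side model cheap relative to the global one.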
-
3.
Publication No.: US20240233707A9
Publication Date: 2024-07-11
Application No.: US18488578
Filing Date: 2023-10-17
Applicant: Google LLC
Inventor: Tien-Ju Yang , You-Chi Cheng , Shankar Kumar , Jared Lichtarge , Ehsan Amid , Yuxin Ding , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06 , G10L15/197 , G10L15/30
CPC classification number: G10L15/063 , G10L15/197 , G10L15/30 , G10L2015/0635
Abstract: A method includes receiving distillation data including a plurality of out-of-domain training utterances. For each particular out-of-domain training utterance of the distillation data, the method includes generating a corresponding augmented out-of-domain training utterance, and generating, using a teacher ASR model trained on training data corresponding to a target domain, a pseudo-label corresponding to the corresponding augmented out-of-domain training utterance. The method also includes distilling a student ASR model from the teacher ASR model by training the student ASR model using the corresponding augmented out-of-domain training utterances paired with the corresponding pseudo-labels generated by the teacher ASR model.
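The distillation loop in the abstract reduces to: augment each out-of-domain utterance, pseudo-label the augmented audio with the in-domain teacher, then train the student on the resulting pairs. In this sketch the teacher, student training step, and augmentation are stand-in callables (assumptions, not the patent's models).

```python
# Hedged sketch of augment -> pseudo-label -> distill. The callables
# are placeholders for the teacher ASR model, a student training
# step, and an audio augmentation (e.g. noise or SpecAugment).

def distill(teacher, student_train_step, augment, out_of_domain_utterances):
    pairs = []
    for utterance in out_of_domain_utterances:
        augmented = augment(utterance)      # augmented training utterance
        pseudo_label = teacher(augmented)   # teacher's transcription
        pairs.append((augmented, pseudo_label))
    for audio, label in pairs:
        student_train_step(audio, label)    # supervised student update
    return pairs
```

Because the teacher was trained on the target domain, its pseudo-labels carry in-domain knowledge into the student even though the distillation audio itself is out-of-domain.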
-
4.
Publication No.: US20240135918A1
Publication Date: 2024-04-25
Application No.: US18488578
Filing Date: 2023-10-16
Applicant: Google LLC
Inventor: Tien-Ju Yang , You-Chi Cheng , Shankar Kumar , Jared Lichtarge , Ehsan Amid , Yuxin Ding , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06 , G10L15/197 , G10L15/30
CPC classification number: G10L15/063 , G10L15/197 , G10L15/30 , G10L2015/0635
Abstract: A method includes receiving distillation data including a plurality of out-of-domain training utterances. For each particular out-of-domain training utterance of the distillation data, the method includes generating a corresponding augmented out-of-domain training utterance, and generating, using a teacher ASR model trained on training data corresponding to a target domain, a pseudo-label corresponding to the corresponding augmented out-of-domain training utterance. The method also includes distilling a student ASR model from the teacher ASR model by training the student ASR model using the corresponding augmented out-of-domain training utterances paired with the corresponding pseudo-labels generated by the teacher ASR model.
-