-
1.
Publication Number: US20240112673A1
Publication Date: 2024-04-04
Application Number: US17958887
Filing Date: 2022-10-03
Applicant: GOOGLE LLC
Inventor: Rajiv Mathews , Rohit Prabhavalkar , Giovanni Motta , Mingqing Chen , Lillian Zhou , Dhruv Guliani , Harry Zhang , Trevor Strohman , Françoise Beaufays
IPC: G10L15/197 , G10L15/06 , G10L15/22 , G10L15/30
CPC classification number: G10L15/197 , G10L15/063 , G10L15/22 , G10L15/30 , G10L2015/0635
Abstract: Implementations described herein identify and correct automatic speech recognition (ASR) misrecognitions. For example, on-device processor(s) of a client device may generate a predicted textual segment that is predicted to correspond to a spoken utterance of a user of the client device, and may receive further input that modifies the predicted textual segment to an alternate textual segment. Further, the on-device processor(s) may store these textual segments in on-device storage as a candidate correction pair, and transmit the candidate correction pair to a remote system. Moreover, remote processor(s) of the remote system may determine that the candidate correction pair is an actual correction pair, and may cause client devices to generate updates for a global ASR model based on the correction pair. Additionally, the remote processor(s) may distribute the updated global ASR model to the client devices and/or additional client devices.
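As a rough illustration of the candidate-pair flow the abstract describes, the sketch below collects (predicted, alternate) pairs on a server and promotes pairs reported by enough distinct clients to actual corrections. The class names, the frequency threshold, and the promotion rule are illustrative assumptions; the patent does not specify how the remote system validates candidate pairs.

```python
# Hypothetical sketch: promoting candidate correction pairs to actual
# correction pairs by counting independent client reports.
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePair:
    predicted: str  # ASR hypothesis shown to the user
    alternate: str  # textual segment after the user's correction


class CorrectionServer:
    """Remote system that screens candidate correction pairs (assumed logic)."""

    def __init__(self, min_clients: int = 50):
        self.counts = Counter()
        self.min_clients = min_clients

    def receive(self, pair: CandidatePair) -> None:
        self.counts[pair] += 1  # one report from one client device

    def actual_corrections(self) -> list:
        # A pair reported independently by many clients is unlikely to be a
        # one-off edit, so treat it as an actual ASR misrecognition.
        return [p for p, n in self.counts.items() if n >= self.min_clients]


server = CorrectionServer(min_clients=2)
server.receive(CandidatePair("wreck a nice beach", "recognize speech"))
server.receive(CandidatePair("wreck a nice beach", "recognize speech"))
print(server.actual_corrections())
```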
-
2.
Publication Number: US20240371362A1
Publication Date: 2024-11-07
Application Number: US18652587
Filing Date: 2024-05-01
Applicant: GOOGLE LLC
Inventor: Tien-Ju Yang , Yonghui Xiao , Giovanni Motta , Françoise Beaufays , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06
Abstract: Implementations are directed to efficient federated learning of machine learning (ML) model(s) through on-the-fly decompression and compression of model parameters when facilitating forward propagation and/or back propagation at client device(s). For example, implementations can transmit, from a remote system to a client device, a compressed on-device ML model that includes some compressed parameters. Further, in performing forward propagation and/or back propagation using the on-device ML model, the client device can decompress those compressed parameters on the fly as they are needed, and the propagation utilizes the freshly decompressed parameters. After the decompressed parameters are utilized, they can be deallocated from memory (while their compressed counterparts optionally remain in memory) to free memory for further decompressed parameters that will be needed next and/or for other ongoing process(es).
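A minimal sketch of the on-the-fly scheme, assuming simple int8 quantization as the compression (the abstract does not name one): only compressed weights stay resident, each layer is decompressed just before use, and the decompressed copy is freed immediately afterward. The same pattern would apply during back propagation.

```python
# Hypothetical sketch: per-layer on-the-fly decompression during forward
# propagation, with int8 quantization standing in for the compression scheme.
import numpy as np


def compress(w: np.ndarray) -> tuple:
    scale = float(np.abs(w).max()) / 127.0
    return (w / scale).round().astype(np.int8), scale


def decompress(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
# The on-device model keeps only the compressed parameters in memory.
compressed_layers = [compress(rng.standard_normal((64, 64), dtype=np.float32))
                     for _ in range(4)]

x = rng.standard_normal(64, dtype=np.float32)
for q, scale in compressed_layers:
    w = decompress(q, scale)    # decompressed just as this layer needs it
    x = np.maximum(w @ x, 0.0)  # propagation uses the decompressed weights
    del w                       # deallocate; the compressed copy remains
print(x.shape)
```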
-
3.
Publication Number: US20240233707A9
Publication Date: 2024-07-11
Application Number: US18488578
Filing Date: 2023-10-17
Applicant: Google LLC
Inventor: Tien-Ju Yang , You-Chi Cheng , Shankar Kumar , Jared Lichtarge , Ehsan Amid , Yuxin Ding , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06 , G10L15/197 , G10L15/30
CPC classification number: G10L15/063 , G10L15/197 , G10L15/30 , G10L2015/0635
Abstract: A method includes receiving distillation data including a plurality of out-of-domain training utterances. For each particular out-of-domain training utterance of the distillation data, the method includes generating a corresponding augmented out-of-domain training utterance, and generating, using a teacher ASR model trained on training data corresponding to a target domain, a pseudo-label corresponding to the corresponding augmented out-of-domain training utterance. The method also includes distilling a student ASR model from the teacher ASR model by training the student ASR model using the corresponding augmented out-of-domain training utterances paired with the corresponding pseudo-labels generated by the teacher ASR model.
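A sketch of the pseudo-labeling loop the abstract walks through: each out-of-domain utterance is augmented, then the in-domain teacher transcribes the augmented audio to produce its pseudo-label. The additive-noise augmentation and the teacher stub are illustrative stand-ins.

```python
# Hypothetical sketch: building (augmented utterance, pseudo-label) pairs
# for distillation; the augmentation and teacher are stand-ins.
import numpy as np

rng = np.random.default_rng(0)


def augment(utterance: np.ndarray) -> np.ndarray:
    # Additive noise as a stand-in; SpecAugment-style masking would also fit.
    return utterance + 0.05 * rng.standard_normal(utterance.shape)


def teacher_asr(audio: np.ndarray) -> str:
    # Placeholder for the teacher ASR model trained on target-domain data.
    return "<teacher transcript>"


out_of_domain = [rng.standard_normal(16000) for _ in range(3)]  # ~1 s @ 16 kHz
pairs = []
for utterance in out_of_domain:
    augmented = augment(utterance)
    pairs.append((augmented, teacher_asr(augmented)))
print(len(pairs))
```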
-
4.
Publication Number: US20240135918A1
Publication Date: 2024-04-25
Application Number: US18488578
Filing Date: 2023-10-16
Applicant: Google LLC
Inventor: Tien-Ju Yang , You-Chi Cheng , Shankar Kumar , Jared Lichtarge , Ehsan Amid , Yuxin Ding , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06 , G10L15/197 , G10L15/30
CPC classification number: G10L15/063 , G10L15/197 , G10L15/30 , G10L2015/0635
Abstract: A method includes receiving distillation data including a plurality of out-of-domain training utterances. For each particular out-of-domain training utterance of the distillation data, the method includes generating a corresponding augmented out-of-domain training utterance, and generating, using a teacher ASR model trained on training data corresponding to a target domain, a pseudo-label corresponding to the corresponding augmented out-of-domain training utterance. The method also includes distilling a student ASR model from the teacher ASR model by training the student ASR model using the corresponding augmented out-of-domain training utterances paired with the corresponding pseudo-labels generated by the teacher ASR model.
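This entry republishes the same application as the previous one, so rather than repeat the pseudo-labeling sketch, here is the complementary distillation step: the student is trained on the augmented utterances paired with the teacher's pseudo-labels as if they were ground truth. A linear model and squared error stand in for the student ASR model and its sequence loss.

```python
# Hypothetical sketch: training the student on teacher pseudo-labels;
# a linear model and squared error stand in for an actual ASR model/loss.
import numpy as np

rng = np.random.default_rng(1)
features = rng.standard_normal((32, 8))       # augmented-utterance features
pseudo_labels = rng.standard_normal((32, 4))  # teacher outputs as targets

w = np.zeros((8, 4))                          # student parameters
for _ in range(200):
    residual = features @ w - pseudo_labels
    w -= 0.1 * features.T @ residual / len(features)  # gradient step

print(float(np.mean((features @ w - pseudo_labels) ** 2)))  # distillation loss
```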
-
5.
Publication Number: US20250118293A1
Publication Date: 2025-04-10
Application Number: US18891615
Filing Date: 2024-09-20
Applicant: Google LLC
Inventor: Mingqing Chen , Rajiv Mathews , Andrew Hard , Swaroop Ramaswamy , Kilol Gupta
Abstract: A method includes receiving a conversational training dataset including a plurality of conversational training samples, each training sample associated with a corresponding conversation and including: corresponding audio data characterizing a corresponding current utterance spoken by a user during a current turn in the corresponding conversation; a corresponding context for the corresponding current utterance including a transcript of a previous turn in the corresponding conversation that precedes the current turn; a corresponding ground-truth transcription of the corresponding current utterance; and a chain-of-thought (CoT) annotation representing a corresponding logical relationship between the corresponding current utterance and the previous turn. The method also includes, for each corresponding conversational training sample in the conversational training dataset, training a speech model on the corresponding conversational training sample to teach the speech model to learn how to predict the corresponding logical relationship from the corresponding audio data and the corresponding context.
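The training sample the abstract enumerates maps naturally onto a record type; the sketch below shows one such sample with illustrative field names (the CoT annotation is modeled as free text describing the logical relationship between turns).

```python
# Hypothetical sketch of one conversational training sample; field names
# are assumptions, not the patent's actual schema.
from dataclasses import dataclass


@dataclass
class ConversationalSample:
    audio: bytes         # current utterance spoken during the current turn
    context: str         # transcript of the preceding turn
    transcription: str   # ground-truth transcript of the current utterance
    cot_annotation: str  # logical relationship between utterance and context


sample = ConversationalSample(
    audio=b"...",  # raw audio placeholder
    context="Do you want the 9 am or the 11 am flight?",
    transcription="The earlier one, please.",
    cot_annotation="Answers the either/or question posed in the previous turn.",
)
print(sample.cot_annotation)
```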
-
6.
Publication Number: US20240386318A1
Publication Date: 2024-11-21
Application Number: US18386431
Filing Date: 2023-11-02
Applicant: GOOGLE LLC
Inventor: Yuxin Ding , Lillian Zhou , Mingqing Chen , Rajiv Mathews , Andrew Hard , Sean Augenstein
IPC: G06N20/00
Abstract: Implementations described herein are directed to techniques for mitigating and/or eliminating catastrophic forgetting of a global machine learning (ML) model during decentralized learning thereof. Remote processor(s) of a remote system can initially train a global ML model based on server data that is accessible by the remote system. In subsequent decentralized learning of the global ML model, the remote processor(s) can utilize various checkpoint averaging techniques. As described herein, these checkpoint averaging techniques can include, but are not limited to, a static checkpoint averaging technique, a dynamic checkpoint averaging technique, and/or a mixed centralized and decentralized training technique.
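As a concrete instance of one named technique, the sketch below implements static checkpoint averaging: the server blends the post-federated-round parameters with the initially server-trained checkpoint so that decentralized updates cannot fully overwrite it. The fixed mixing weight alpha is an illustrative assumption.

```python
# Hypothetical sketch: static checkpoint averaging to mitigate catastrophic
# forgetting; the mixing weight is an assumption, not the patent's value.
import numpy as np


def static_checkpoint_average(server_ckpt: dict, federated_ckpt: dict,
                              alpha: float = 0.5) -> dict:
    # alpha -> 1 favors the server checkpoint (less forgetting, less adaptation).
    return {name: alpha * server_ckpt[name] + (1 - alpha) * federated_ckpt[name]
            for name in server_ckpt}


server_ckpt = {"w": np.ones(3)}      # from initial training on server data
federated_ckpt = {"w": np.zeros(3)}  # after a round of decentralized learning
print(static_checkpoint_average(server_ckpt, federated_ckpt, alpha=0.25))
```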
-
7.
Publication Number: US20240265269A1
Publication Date: 2024-08-08
Application Number: US18125613
Filing Date: 2023-03-23
Applicant: GOOGLE LLC
Inventor: Mingqing Chen , Lara McConnaughey , Kaan Ege Özgün , Rajiv Mathews , Françoise Beaufays
Abstract: Implementations disclosed herein are directed to techniques for enabling decentralized learning of global language models (LMs). Remote processor(s) of a remote system can obtain a global LM that includes a global embedding matrix, generate a global embedding mask for the global embedding matrix using a masking technique, apply the global embedding mask to the global embedding matrix to generate a sparsified global LM that includes a masked version of the global embedding matrix, transmit the sparsified global LM to computing device(s) that are participating in a given round of decentralized learning for the global LM, receive corresponding updates from the computing device(s), and cause the global LM to be updated based on the corresponding updates. By generating the global embedding mask and applying it to the global embedding matrix, the transferable size of the global LM is reduced, thereby enabling decentralized learning thereof.
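A sketch of the embedding-masking step, with a top-k-by-row-norm rule standing in for the unspecified masking technique: most rows of the global embedding matrix are zeroed, so only the surviving rows (plus the mask itself) need to be transmitted to participating devices.

```python
# Hypothetical sketch: sparsifying the global embedding matrix with a mask
# so less of the LM has to be transmitted each round.
import numpy as np

rng = np.random.default_rng(0)
global_embeddings = rng.standard_normal((10000, 128))  # vocab x embedding dim


def embedding_mask(E: np.ndarray, keep: int) -> np.ndarray:
    norms = np.linalg.norm(E, axis=1)
    mask = np.zeros(len(E), dtype=bool)
    mask[np.argsort(norms)[-keep:]] = True  # keep the highest-norm rows
    return mask


mask = embedding_mask(global_embeddings, keep=2000)
masked = global_embeddings * mask[:, None]  # masked global embedding matrix
kept_rows = global_embeddings[mask]         # all that must actually be sent
print(int(mask.sum()), kept_rows.nbytes, global_embeddings.nbytes)
```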
-
8.
Publication Number: US20240194192A1
Publication Date: 2024-06-13
Application Number: US18078782
Filing Date: 2022-12-09
Applicant: GOOGLE LLC
Inventor: Ehsan Amid , Rajiv Mathews , Shankar Kumar , Jared Lichtarge , Mingqing Chen , Tien-Ju Yang , Yuxin Ding
CPC classification number: G10L15/16 , G10L15/063
Abstract: Information can be distilled from a global automatic speech recognition (ASR) model to a client ASR model. Many implementations include using an RNN-T model as the ASR model, where the global ASR model includes a global encoder, a joint network, a prediction network, and where the client ASR model includes a client encoder, the joint network, and the prediction network. Various implementations include using principal component analysis (PCA) while training the global ASR model to learn a mean vector and a set of principal components corresponding to the global ASR model. Additional or alternative implementations include training the client ASR model to generate one or more predicted coefficients of the global ASR model.
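A sketch of the PCA piece of the abstract: the global model's representations are summarized by a mean vector and a set of principal components, after which a client model only needs to produce the low-dimensional coefficients. Here the coefficients are computed by projection rather than predicted by a trained client encoder, and all dimensions are illustrative.

```python
# Hypothetical sketch: mean + principal components learned from global ASR
# model activations, with reconstruction from a few coefficients.
import numpy as np

rng = np.random.default_rng(0)
global_outputs = rng.standard_normal((1000, 64))  # global encoder activations

mean = global_outputs.mean(axis=0)
_, _, vt = np.linalg.svd(global_outputs - mean, full_matrices=False)
components = vt[:8]                    # top-8 principal components

target = global_outputs[0]
coeffs = components @ (target - mean)  # what the client model would predict
reconstruction = mean + coeffs @ components
print(float(np.linalg.norm(target - reconstruction)))  # out-of-subspace residual
```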
-
9.
Publication Number: US20230214642A1
Publication Date: 2023-07-06
Application Number: US17568933
Filing Date: 2022-01-05
Applicant: Google LLC
Inventor: Hakim Sidahmed , Zheng Xu , Mingqing Chen , Yuan Cao , Ankush Garg
IPC: G06N3/08
CPC classification number: G06N3/08
Abstract: Example aspects of the present disclosure provide a novel, resource-efficient approach for federated machine learning with partially trainable networks (PTNs). The system can determine a first set of training parameters from a plurality of parameters of the global model. Additionally, the system can generate a random seed, using a random number generator, based on a set of frozen parameters. Moreover, the system can transmit, respectively to a plurality of client computing devices, the first set of training parameters and the random seed. Furthermore, the system can receive, respectively from the plurality of client computing devices, updates to one or more parameters in the first set of training parameters. Subsequently, the system can aggregate the updates that are respectively received from the plurality of client computing devices. The system can modify one or more global parameters of the global model based on the aggregation.
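A sketch of the seed mechanism described above: frozen parameters are never transmitted; the server sends a random seed, each client regenerates the identical frozen values locally, trains only the first set of parameters, and the server averages the returned updates. The shapes, the plain averaging, and the simulated client updates are illustrative assumptions.

```python
# Hypothetical sketch: federated round with frozen parameters recovered
# from a shared seed instead of being transmitted.
import numpy as np

SEED = 1234  # produced by the server's random number generator


def frozen_params(seed: int) -> np.ndarray:
    # Every client with the seed reproduces identical frozen parameters.
    return np.random.default_rng(seed).standard_normal(100)


trainable = np.zeros(10)  # the first set of training parameters

# Each client combines frozen_params(SEED) with the trainable set, trains
# locally, and returns an update; local training is simulated with noise.
client_updates = [trainable + np.random.default_rng(i).normal(0.0, 0.1, 10)
                  for i in range(3)]

trainable = np.mean(np.stack(client_updates), axis=0)  # server aggregation
print(trainable[:3])
```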
-