-
11.
Publication Number: US20240386318A1
Publication Date: 2024-11-21
Application Number: US18386431
Filing Date: 2023-11-02
Applicant: GOOGLE LLC
Inventor: Yuxin Ding , Lillian Zhou , Mingqing Chen , Rajiv Mathews , Andrew Hard , Sean Augenstein
IPC: G06N20/00
Abstract: Implementations described herein are directed to techniques for mitigating and/or eliminating catastrophic forgetting of a global machine learning (ML) model during decentralized learning thereof. Remote processor(s) of a remote system can initially train a global ML model based on server data that is accessible by the remote system. In subsequent decentralized learning of the global ML model, the remote processor(s) can utilize various checkpoint averaging techniques. As described herein, these checkpoint averaging techniques can include, but are not limited to, a static checkpoint averaging technique, a dynamic checkpoint averaging technique, and/or a mixed centralized and decentralized training technique.
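A minimal sketch of the two averaging variants, assuming checkpoints are represented as dicts of numpy arrays; the fixed coefficient alpha and the round-based decay schedule are illustrative assumptions, not details from the abstract.

import numpy as np

def static_checkpoint_average(server_ckpt, federated_ckpt, alpha=0.5):
    # Blend the initial server-trained checkpoint with the current
    # federated checkpoint using a fixed mixing coefficient.
    return {name: alpha * server_ckpt[name] + (1.0 - alpha) * federated_ckpt[name]
            for name in server_ckpt}

def dynamic_checkpoint_average(server_ckpt, federated_ckpt, round_idx, total_rounds):
    # Decay the server checkpoint's influence as decentralized training
    # progresses, so later rounds lean more on the federated weights.
    alpha = max(0.0, 1.0 - round_idx / total_rounds)
    return static_checkpoint_average(server_ckpt, federated_ckpt, alpha)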
-
12.
Publication Number: US20240330767A1
Publication Date: 2024-10-03
Application Number: US18611628
Filing Date: 2024-03-20
Applicant: Google LLC
Inventor: Andrew Hard , Rajiv Mathews
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: A method includes training a client machine learning (ML) model on client training data at a client device. While training the client ML model, the method also includes obtaining, from a server, server model weights of a server ML model trained on server training data, the server training data being different from the client training data. While training the client ML model, the method also includes: transmitting, to the server, client model weights of the client ML model; updating the client ML model using the server model weights; obtaining, from the server, updated server model weights of the server ML model, the updated server model weights updated based on the transmitted client model weights; and further updating the client ML model using the updated server model weights.
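A minimal sketch of the described exchange loop, assuming weights are dicts of numpy-style arrays; the transport stubs (send_to_server, fetch_from_server) and the interpolation coefficient beta are hypothetical, since the abstract does not specify how the client folds the server weights in.

def train_with_server_exchange(client_weights, local_step, send_to_server,
                               fetch_from_server, rounds=10, beta=0.5):
    # Keep training locally while periodically exchanging weights with a
    # server that trains on its own, different, data.
    for _ in range(rounds):
        client_weights = local_step(client_weights)  # one pass over client data
        send_to_server(client_weights)               # share client model weights
        server_weights = fetch_from_server()         # may reflect our last upload
        client_weights = {
            name: (1.0 - beta) * client_weights[name] + beta * server_weights[name]
            for name in client_weights
        }  # fold the server model's knowledge back into the client model
    return client_weights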
-
13.
Publication Number: US20240265269A1
Publication Date: 2024-08-08
Application Number: US18125613
Filing Date: 2023-03-23
Applicant: GOOGLE LLC
Inventor: Mingqing Chen , Lara McConnaughey , Kaan Ege Özgün , Rajiv Mathews , Françoise Beaufays
Abstract: Implementations disclosed herein are directed to techniques for enabling decentralized learning of global language models (LMs). Remote processor(s) of a remote system can obtain a global LM that includes a global embedding matrix, generate a global embedding mask for the global embedding matrix using a masking technique, apply the global embedding mask to the global embedding matrix to generate a sparsified global LM that includes a masked global embedding matrix that is a masked version of the global embedding matrix, transmit the sparsified global LM to computing device(s) that are participating in a given round of decentralized learning for the global language model, receive corresponding updates from the computing device(s), and cause the global LM to be updated based on the corresponding updates. By generating the global embedding mask and applying it to the global embedding matrix, the transferable size of the global LM is reduced, thereby enabling decentralized learning thereof.
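A minimal sketch of generating and applying an embedding mask, assuming a frequency-based masking technique; the abstract covers masking generically, so the keep_fraction heuristic here is an assumption.

import numpy as np

def make_embedding_mask(token_frequencies, keep_fraction=0.25):
    # Keep only the most frequent vocabulary rows.
    keep_count = int(len(token_frequencies) * keep_fraction)
    keep_rows = np.argsort(token_frequencies)[::-1][:keep_count]
    mask = np.zeros(len(token_frequencies), dtype=bool)
    mask[keep_rows] = True
    return mask

def apply_embedding_mask(embedding_matrix, mask):
    # Zero out masked rows; the sparsified matrix compresses well, which is
    # what reduces the transferable size of the global LM.
    masked = embedding_matrix.copy()
    masked[~mask] = 0.0
    return masked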
-
14.
Publication Number: US20240194192A1
Publication Date: 2024-06-13
Application Number: US18078782
Filing Date: 2022-12-09
Applicant: GOOGLE LLC
Inventor: Ehsan Amid , Rajiv Mathews , Shankar Kumar , Jared Lichtarge , Mingqing Chen , Tien-Ju Yang , Yuxin Ding
CPC classification number: G10L15/16 , G10L15/063
Abstract: Information can be distilled from a global automatic speech recognition (ASR) model to a client ASR model. Many implementations include using an RNN-T model as the ASR model, where the global ASR model includes a global encoder, a joint network, and a prediction network, and where the client ASR model includes a client encoder, the joint network, and the prediction network. Various implementations include using principal component analysis (PCA) while training the global ASR model to learn a mean vector and a set of principal components corresponding to the global ASR model. Additional or alternative implementations include training the client ASR model to generate one or more predicted coefficients of the global ASR model.
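A minimal sketch of the PCA step, assuming the mean vector and principal components are fit on activations of the global encoder; the shapes and the SVD-based fit are illustrative assumptions.

import numpy as np

def fit_global_pca(global_features, num_components):
    # Learn a mean vector and principal components from global encoder
    # activations; global_features has shape (num_examples, feature_dim).
    mean = global_features.mean(axis=0)
    _, _, vt = np.linalg.svd(global_features - mean, full_matrices=False)
    return mean, vt[:num_components]  # shapes (d,) and (k, d)

def reconstruct_from_coefficients(coefficients, mean, components):
    # Map coefficients predicted by the client encoder back into the global
    # encoder's feature space: x ≈ mean + c @ V.
    return mean + coefficients @ components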
-
15.
Publication Number: US20230359907A1
Publication Date: 2023-11-09
Application Number: US17848947
Filing Date: 2022-07-01
Applicant: GOOGLE LLC
Inventor: Sean Augenstein , Andrew Hard , Kurt Partridge , Rajiv Mathews , Lin Ning , Karan Singhal
IPC: G06N5/02
CPC classification number: G06N5/022
Abstract: Implementations disclosed herein are directed to various techniques for mitigating and/or preventing catastrophic forgetting in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that is initially trained at a remote server based on a server data set, determine server-based data for global weight(s) of the global ML model, and transmit the global ML model and the server-based data to a plurality of client devices. The server-based data may include, for example, EWC loss term(s), client augmenting gradients, and/or server augmenting gradients. Further, each of the plurality of client devices may generate a corresponding client gradient, based on corresponding predicted output generated using the global ML model and based on the server-based data, and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
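A minimal sketch of one of the listed mechanisms, combining a client's locally computed gradient with a server augmenting gradient before upload; the additive combination and the scale factor are assumptions, as the abstract does not specify the combination rule.

def augment_client_gradient(client_grad, server_aug_grad, scale=1.0):
    # Mix the locally computed gradient with a server augmenting gradient so
    # the uploaded update also reflects the server data distribution.
    return {name: client_grad[name] + scale * server_aug_grad[name]
            for name in client_grad}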
-
16.
Publication Number: US20230351246A1
Publication Date: 2023-11-02
Application Number: US17734766
Filing Date: 2022-05-02
Applicant: GOOGLE LLC
Inventor: Andrew Hard , Kurt Partridge , Rajiv Mathews , Sean Augenstein
Abstract: Implementations disclosed herein are directed to utilizing elastic weight consolidation (EWC) loss term(s) in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that was initially trained at a remote server based on a server data set, determine the EWC loss term(s) for global weight(s) of the global ML model, and transmit the global ML model and the EWC loss term(s) to a plurality of client devices. The EWC loss term(s) may be determined based on a Fisher information matrix for the server data set. Further, each of the plurality of client devices may generate a corresponding client gradient, based on corresponding predicted output generated using the global ML model and based on the EWC loss term(s), and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
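A minimal sketch of the EWC loss term, assuming weights are dicts of numpy arrays and a diagonal Fisher approximation computed on the server data set; the coefficient lam is an illustrative assumption.

import numpy as np

def ewc_penalty(client_weights, server_weights, fisher_diagonal, lam=0.4):
    # Quadratic penalty anchoring each client weight to its server-trained
    # value, scaled by the Fisher information for the server data set:
    #   L_ewc = (lam / 2) * sum_i F_i * (w_i - w*_i)^2
    return 0.5 * lam * sum(
        np.sum(fisher_diagonal[name]
               * (client_weights[name] - server_weights[name]) ** 2)
        for name in client_weights)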
-
17.
Publication Number: US20230335126A1
Publication Date: 2023-10-19
Application Number: US18303296
Filing Date: 2023-04-19
Applicant: Google LLC
Inventor: Ronny Huang , Steve Chien , Om Thakkar , Rajiv Mathews
IPC: G10L15/197 , G10L13/02 , G10L15/01 , G10L15/06 , G10L15/16
CPC classification number: G10L15/197 , G10L13/02 , G10L15/01 , G10L15/063 , G10L15/16
Abstract: A method includes inserting a set of canary text samples into a corpus of training text samples and training an external language model on the corpus of training text samples and the set of canary text samples inserted into the corpus of training text samples. For each canary text sample, the method also includes generating a corresponding synthetic speech utterance and generating an initial transcription for the corresponding synthetic speech utterance. The method also includes rescoring the initial transcription generated for each corresponding synthetic speech utterance using the external language model. The method also includes determining a word error rate (WER) of the external language model based on the rescored initial transcriptions and the canary text samples and detecting memorization of the canary text samples by the external language model based on the WER of the external language model.
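A minimal sketch of the WER computation over rescored canary transcriptions; the decision rule for flagging memorization (e.g., how low the WER must fall) is a threshold the abstract does not specify.

def word_error_rate(reference, hypothesis):
    # Levenshtein distance over word tokens, normalized by reference length.
    ref, hyp = reference.split(), hypothesis.split()
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            dist[i][j] = min(dist[i - 1][j] + 1,  # deletion
                             dist[i][j - 1] + 1,  # insertion
                             dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]))
    return dist[len(ref)][len(hyp)] / max(1, len(ref))

def canary_wer(canary_texts, rescored_transcriptions):
    # Average WER of the rescored transcriptions against the inserted
    # canaries; an unusually low value suggests the LM memorized them.
    pairs = zip(canary_texts, rescored_transcriptions)
    return sum(word_error_rate(c, t) for c, t in pairs) / len(canary_texts)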
-
18.
Publication Number: US20220293093A1
Publication Date: 2022-09-15
Application Number: US17197954
Filing Date: 2021-03-10
Applicant: Google LLC
Inventor: Françoise Beaufays , Andrew Hard , Swaroop Indra Ramaswamy , Om Dipakbhai Thakkar , Rajiv Mathews
IPC: G10L15/065 , G10L15/30 , G10L15/26 , G10L13/04
Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, and generate corresponding remote gradients based on the additional corresponding predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof. The updated global ML model(s) and/or the updated weights thereof can be transmitted back to the corresponding client devices.
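A minimal sketch of the combined update, assuming gradients are dicts of arrays and that the client gradients and the remote gradient are weighted equally; the learning rate and the weighting are illustrative assumptions.

def update_global_model(global_weights, client_grads, remote_grad, lr=0.1):
    # Average the client gradients with the server-side (remote) gradient,
    # then take one SGD step on the global weights.
    total = len(client_grads) + 1
    return {
        name: global_weights[name] - lr * (
            (sum(g[name] for g in client_grads) + remote_grad[name]) / total)
        for name in global_weights
    }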
-
19.
Publication Number: US12223952B2
Publication Date: 2025-02-11
Application Number: US17959637
Filing Date: 2022-10-04
Applicant: GOOGLE LLC
Inventor: Rajiv Mathews , Dragan Zivkovic , Khe Chai Sim
Abstract: On-device processor(s) of a client device may store, in on-device storage and in association with a time to live (TTL) in the on-device storage, a correction directed to ASR processing of audio data. The correction may include a portion of a given speech hypothesis that was modified to an alternate speech hypothesis. Further, the on-device processor(s) may cause an on-device ASR model to be personalized based on the correction. Moreover, and based on additional ASR processing of additional audio data, the on-device processor(s) may store, in the on-device storage and in association with an additional TTL in the on-device storage, a pseudo-correction directed to the additional ASR processing. Accordingly, the on-device processor(s) may cause the on-device ASR model to be personalized based on the pseudo-correction to prevent forgetting by the on-device ASR model.
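A minimal sketch of TTL-scoped storage for corrections and pseudo-corrections; the class and field names are hypothetical, and the ASR personalization step itself is elided.

import time
from dataclasses import dataclass

@dataclass
class StoredCorrection:
    hypothesis: str        # portion of the original speech hypothesis
    target: str            # the alternate (corrected or pseudo-corrected) text
    is_pseudo: bool        # True for pseudo-corrections that prevent forgetting
    expires_at: float = 0.0

class CorrectionStore:
    # On-device store whose entries expire after their TTL, bounding how
    # long correction data persists in on-device storage.
    def __init__(self):
        self._items = []

    def add(self, correction, ttl_seconds):
        correction.expires_at = time.time() + ttl_seconds
        self._items.append(correction)

    def live_corrections(self):
        # Drop expired entries, then return the rest for personalization.
        now = time.time()
        self._items = [c for c in self._items if c.expires_at > now]
        return list(self._items)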
-
20.
Publication Number: US20240371362A1
Publication Date: 2024-11-07
Application Number: US18652587
Filing Date: 2024-05-01
Applicant: GOOGLE LLC
Inventor: Tien-Ju Yang , Yonghui Xiao , Giovanni Motta , Françoise Beaufays , Rajiv Mathews , Mingqing Chen
IPC: G10L15/06
Abstract: Implementations are directed to efficient federated learning of machine learning (ML) model(s) through on-the-fly decompression and compression of model parameters, of the ML model(s), when facilitating forward propagation and/or back propagation at client device(s). For example, implementations can transmit, from a remote system to a client device, a compressed on-device ML model that includes some compressed parameters. Further, the client device can, in performing forward propagation and/or back propagation using the on-device ML model, decompress those compressed parameters on the fly as they are needed, and the propagation utilizes the decompressed parameters. After the decompressed parameters are utilized, they can be deallocated from memory (while their compressed counterparts optionally remain in memory) to enable allocation of memory for further parameters that will be needed next and/or for other ongoing process(es).
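A minimal sketch of on-the-fly decompression during a forward pass, using zlib as a stand-in codec (the abstract does not name one); the dense-layer math is purely illustrative.

import zlib
import numpy as np

def decompress_parameter(blob, shape, dtype=np.float32):
    # Inflate one compressed parameter tensor just before it is needed.
    return np.frombuffer(zlib.decompress(blob), dtype=dtype).reshape(shape)

def forward_pass(x, compressed_layers):
    # Decompress each layer's weights on the fly, use them, then drop the
    # decompressed copy so peak memory stays near one layer's footprint;
    # the compressed blobs stay resident for reuse.
    for blob, shape in compressed_layers:
        weights = decompress_parameter(blob, shape)
        x = np.maximum(x @ weights, 0.0)  # illustrative dense layer + ReLU
        del weights                       # free the decompressed copy
    return x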
-