Using corrections, of automated assistant functions, for training of on-device machine learning models

    Publication number: US12014739B2

    Publication date: 2024-06-18

    Application number: US18218818

    Filing date: 2023-07-06

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/22 G10L15/065 G10L15/10 G10L15/30

    Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; make a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determine that the decision was incorrect; and, in response to determining that the decision was incorrect, generate a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
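
    A minimal sketch of the correction-driven on-device update described in this abstract, assuming a PyTorch-style binary trigger classifier over sensor features. The model architecture, feature size, learning rate, and the on_correction entry point are all illustrative assumptions, not details from the patent.

```python
import torch
import torch.nn as nn

# Hypothetical on-device trigger model over a 64-dim sensor feature vector.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

def on_correction(sensor_features: torch.Tensor, ground_truth: float):
    """Called when a later signal shows the trigger decision was wrong.

    sensor_features: shape (1, 64); ground_truth: 1.0 if the assistant
    should have been triggered, 0.0 if it should not have been.
    """
    logits = model(sensor_features)                      # re-run the prediction
    loss = loss_fn(logits, torch.tensor([[ground_truth]]))
    optimizer.zero_grad()
    loss.backward()                                      # gradient vs. ground truth
    optimizer.step()                                     # update on-device weights
    # Additionally or alternatively, the raw gradients could be shipped
    # to a remote system for updating a global model:
    return [p.grad.detach().clone() for p in model.parameters()]
```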

    LEVERAGING INTERMEDIATE CHECKPOINTS TO IMPROVE THE PERFORMANCE OF TRAINED DIFFERENTIALLY PRIVATE MODELS

    Publication number: US20240095594A1

    Publication date: 2024-03-21

    Application number: US18459354

    Filing date: 2023-08-31

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: A method includes training a first differentially private (DP) model using a private training set, the private training set including a plurality of training samples, the first DP model satisfying a differential privacy budget, the differential privacy budget defining an amount of information about individual training samples of the private training set that may be revealed by the first DP model. The method also includes, while training the first DP model, generating a plurality of intermediate checkpoints, each intermediate checkpoint of the plurality of intermediate checkpoints representing a different intermediate state of the first DP model, each of the intermediate checkpoints satisfying the same differential privacy budget. The method further includes determining an aggregate of the first DP model and the plurality of intermediate checkpoints, and determining, using the aggregate, a second DP model, the second DP model satisfying the same differential privacy budget.
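
    A sketch of the checkpoint-aggregation step. The abstract leaves the aggregation function abstract; a parameter-wise mean over checkpoint weights is one natural choice and is what is shown here. Because differential privacy is preserved under post-processing, an aggregate computed only from DP checkpoints satisfies the same privacy budget they do.

```python
from typing import Dict, List
import torch

def aggregate_checkpoints(
    checkpoints: List[Dict[str, torch.Tensor]]
) -> Dict[str, torch.Tensor]:
    """Average each parameter across all intermediate checkpoints.

    Each checkpoint is a state dict mapping parameter names to tensors
    of identical shapes (e.g., snapshots taken during DP-SGD training).
    """
    aggregate = {}
    for name in checkpoints[0]:
        stacked = torch.stack([ckpt[name] for ckpt in checkpoints])
        aggregate[name] = stacked.mean(dim=0)
    return aggregate

# Hypothetical usage: build the second DP model from the aggregate.
# second_dp_model.load_state_dict(aggregate_checkpoints(all_checkpoints))
```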

    Using corrections, of automated assistant functions, for training of on-device machine learning models

    Publication number: US11741953B2

    Publication date: 2023-08-29

    Application number: US16973572

    Filing date: 2019-11-08

    Applicant: Google LLC

    CPC classification number: G10L15/22 G10L15/065 G10L15/10 G10L15/30

    Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; make a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determine that the decision was incorrect; and, in response to determining that the decision was incorrect, generate a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
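
    This entry shares its abstract with the related US12014739B2 entry above, so rather than repeat the on-device sketch, the following illustrates the complementary remote path: a server applying client-uploaded gradients to the global model's weights. The federated-SGD-style averaging is an assumption; the abstract says only that the gradients are used in "remote updating of global weights".

```python
from typing import List
import torch

def apply_client_gradients(global_weights: List[torch.Tensor],
                           client_gradients: List[List[torch.Tensor]],
                           lr: float = 1e-3) -> None:
    """Average gradients across reporting clients, then take one SGD step.

    client_gradients[c][i] is client c's gradient for the i-th global
    weight tensor, as uploaded by the on-device routine sketched earlier.
    """
    with torch.no_grad():
        for i, weight in enumerate(global_weights):
            avg_grad = torch.stack(
                [grads[i] for grads in client_gradients]
            ).mean(dim=0)
            weight -= lr * avg_grad  # in-place update of the global tensor
```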

    UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN TRAINING MACHINE LEARNING MODEL(S)

    Publication number: US20250045627A1

    Publication date: 2025-02-06

    Application number: US18365487

    Filing date: 2023-08-04

    Applicant: GOOGLE LLC

    Abstract: Processor(s) of a client device can receive global weights of a global ML model from a remote system, obtain a client device data set, determine a Fisher information matrix for the client data set, and transmit the Fisher information matrix for the client data set to the remote system. Further, processor(s) of the remote system can determine a corresponding elastic weight consolidation (EWC) loss term for each of the global weights based on at least the Fisher information matrix, generate a server update for the global ML model based on (i) processing server data remotely at the remote system using the global ML model and (ii) the corresponding EWC loss term for each of the global weights, and update the global weights of the global ML model based on the server update.
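
    A sketch of the per-weight EWC penalty, assuming the standard diagonal-Fisher form from Kirkpatrick et al. (2017): the server penalizes movement of each global weight away from its anchor value, scaled by the client-reported Fisher information. The lambda_ewc coefficient and all variable names are illustrative; the patent only says an EWC loss term is determined per global weight from the Fisher matrix.

```python
from typing import List
import torch

def ewc_loss(weights: List[torch.Tensor],
             anchor_weights: List[torch.Tensor],
             fisher_diagonals: List[torch.Tensor],
             lambda_ewc: float = 0.1) -> torch.Tensor:
    """Compute sum over weights of (lambda/2) * F_i * (w_i - w*_i)^2.

    anchor_weights are the global weights the clients trained against;
    fisher_diagonals are the client-reported Fisher information values,
    one tensor per weight tensor, same shapes as the weights.
    """
    penalty = torch.tensor(0.0)
    for w, w_star, fisher in zip(weights, anchor_weights, fisher_diagonals):
        penalty = penalty + (fisher * (w - w_star) ** 2).sum()
    return 0.5 * lambda_ewc * penalty

# The server update would then minimize:
#   task_loss(server_data) + ewc_loss(weights, anchors, fishers)
# so that learning on server data does not catastrophically overwrite
# what the global weights encode about client data.
```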

    Mixed client-server federated learning of machine learning model(s)

    Publication number: US12205575B2

    Publication date: 2025-01-21

    Application number: US18218319

    Filing date: 2023-07-05

    Applicant: GOOGLE LLC

    Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, and generate corresponding remote gradients based on those additional predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof. The updated global ML model(s) and/or the updated weights thereof can be transmitted back to the corresponding client devices.
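
    A sketch of one mixed update round as described above: gradients from client devices and gradients computed at the server over remote data are combined into a single step on the global weights. The equal weighting of the two gradient sources is an assumption, not something the abstract specifies.

```python
from typing import List
import torch

def mixed_round(global_weights: List[torch.Tensor],
                client_grads: List[List[torch.Tensor]],
                remote_grads: List[torch.Tensor],
                lr: float = 1e-3) -> None:
    """Apply one update mixing client-side and server-side gradients.

    client_grads[c][i]: client c's gradient for weight tensor i;
    remote_grads[i]: the server's gradient for weight tensor i,
    computed over remote (server-side) data with the global model.
    """
    with torch.no_grad():
        for i, w in enumerate(global_weights):
            client_avg = torch.stack(
                [g[i] for g in client_grads]
            ).mean(dim=0)
            combined = 0.5 * client_avg + 0.5 * remote_grads[i]  # assumed 50/50 mix
            w -= lr * combined
    # The updated global weights would then be broadcast back to clients.
```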

    FILTERING FOR MIXING SERVER-BASED AND FEDERATED LEARNING

    Publication number: US20240330766A1

    Publication date: 2024-10-03

    Application number: US18609704

    Filing date: 2024-03-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: A method includes receiving, from a client device, a client machine learning (ML) model and obtaining a set of training data including a plurality of training samples. The client ML model is trained locally on the client device. For each respective training sample in the plurality of training samples, the method also includes determining, using the respective training sample, a first loss of the client ML model; determining, using the respective training sample, a second loss of a server machine learning (ML) model; and determining a respective score based on the first loss and the second loss. The method also includes selecting, based on each respective score of each respective training sample in the plurality of training samples, a subset of training samples from the plurality of training samples and training the server ML model using the subset of training samples.
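
    A sketch of the loss-based filtering step. The abstract does not define the scoring function; the difference "server loss minus client loss" used here (keep samples on which the locally trained client model outperforms the server model, i.e., samples the server model has the most to learn from) is one plausible instantiation. The top_k selection and the loss-callable signatures are assumptions.

```python
from typing import Callable, List, Tuple
import torch

Sample = Tuple[torch.Tensor, torch.Tensor]  # (input, label)

def select_training_subset(samples: List[Sample],
                           client_loss: Callable[[torch.Tensor, torch.Tensor], float],
                           server_loss: Callable[[torch.Tensor, torch.Tensor], float],
                           top_k: int) -> List[Sample]:
    """Score each sample from the two losses and keep the top_k."""
    scored = []
    for x, y in samples:
        # Higher score: the client model (trained on real device data)
        # does better here than the current server model.
        score = server_loss(x, y) - client_loss(x, y)
        scored.append((score, (x, y)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [sample for _, sample in scored[:top_k]]

# The server ML model is then trained only on the selected subset.
```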

    Heterogeneous Federated Learning Via Multi-Directional Knowledge Distillation

    Publication number: US20240249193A1

    Publication date: 2024-07-25

    Application number: US18417947

    Filing date: 2024-01-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Generally, the present disclosure is directed to enhanced federated learning (FL) that employs a set of clients with varying amounts of computational resources (e.g., system memory, storage, and processing bandwidth). To overcome the limitations that such resource heterogeneity poses for conventional FL methods, the embodiments run multi-directional knowledge distillation between the server models produced by each federated averaging (FedAvg) pool, using unlabeled server data as the distillation dataset. By co-distilling the two (or more) models frequently over the course of FedAvg rounds, information is shared between the pools without sharing model parameters. This leads to increased performance and faster convergence (in fewer federated rounds).
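
    A sketch of one multi-directional distillation step between the server models of two FedAvg pools, using a batch of unlabeled server data. The KL-divergence objective, temperature, and model/optimizer interfaces are assumptions; the key property the abstract describes is that only predictions are exchanged between pools, never model parameters.

```python
import torch
import torch.nn.functional as F

def co_distill_step(model_a, model_b, opt_a, opt_b,
                    unlabeled_batch: torch.Tensor,
                    temperature: float = 2.0) -> None:
    """One round of mutual distillation: each pool's server model
    learns from the other's softened predictions on unlabeled data."""
    for student, teacher, opt in ((model_a, model_b, opt_a),
                                  (model_b, model_a, opt_b)):
        with torch.no_grad():
            teacher_probs = F.softmax(
                teacher(unlabeled_batch) / temperature, dim=-1)
        student_log_probs = F.log_softmax(
            student(unlabeled_batch) / temperature, dim=-1)
        loss = F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()

# Run between FedAvg rounds so information flows across pools while
# each pool's model parameters stay within that pool.
```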
