Using corrections, of automated assistant functions, for training of on-device machine learning models

    Publication number: US12014739B2

    Publication date: 2024-06-18

    Application number: US18218818

    Filing date: 2023-07-06

    Applicant: GOOGLE LLC

    CPC classification number: G10L15/22 G10L15/065 G10L15/10 G10L15/30

    Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; make a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determine that the decision was incorrect; and, in response to determining that the decision was incorrect, generate a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
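
    A minimal sketch of the correction-driven on-device update described in this abstract, assuming a PyTorch-style binary trigger classifier over sensor features. The model architecture, feature size, learning rate, and the on_correction entry point are all illustrative assumptions, not details from the patent.

```python
import torch
import torch.nn as nn

# Hypothetical on-device trigger model over a 64-dim sensor feature vector.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

def on_correction(sensor_features: torch.Tensor, ground_truth: float):
    """Called when a later signal shows the trigger decision was wrong.

    sensor_features: shape (1, 64); ground_truth: 1.0 if the assistant
    should have been triggered, 0.0 if it should not have been.
    """
    logits = model(sensor_features)                      # re-run the prediction
    loss = loss_fn(logits, torch.tensor([[ground_truth]]))
    optimizer.zero_grad()
    loss.backward()                                      # gradient vs. ground truth
    optimizer.step()                                     # update on-device weights
    # Additionally or alternatively, the raw gradients could be shipped
    # to a remote system for updating a global model:
    return [p.grad.detach().clone() for p in model.parameters()]
```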

    LEVERAGING INTERMEDIATE CHECKPOINTS TO IMPROVE THE PERFORMANCE OF TRAINED DIFFERENTIALLY PRIVATE MODELS

    Publication number: US20240095594A1

    Publication date: 2024-03-21

    Application number: US18459354

    Filing date: 2023-08-31

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: A method includes training a first differentially private (DP) model using a private training set, the private training set including a plurality of training samples, the first DP model satisfying a differential privacy budget, the differential privacy budget defining an amount of information about individual training samples of the private training set that may be revealed by the first DP model. The method also includes, while training the first DP model, generating a plurality of intermediate checkpoints, each intermediate checkpoint of the plurality of intermediate checkpoints representing a different intermediate state of the first DP model, each of the intermediate checkpoints satisfying the same differential privacy budget. The method further includes determining an aggregate of the first DP model and the plurality of intermediate checkpoints, and determining, using the aggregate, a second DP model, the second DP model satisfying the same differential privacy budget.
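
    A sketch of the checkpoint-aggregation step. The abstract leaves the aggregation function abstract; a parameter-wise mean over checkpoint weights is one natural choice and is what is shown here. Because differential privacy is preserved under post-processing, an aggregate computed only from DP checkpoints satisfies the same privacy budget they do.

```python
from typing import Dict, List
import torch

def aggregate_checkpoints(
    checkpoints: List[Dict[str, torch.Tensor]]
) -> Dict[str, torch.Tensor]:
    """Average each parameter across all intermediate checkpoints.

    Each checkpoint is a state dict mapping parameter names to tensors
    of identical shapes (e.g., snapshots taken during DP-SGD training).
    """
    aggregate = {}
    for name in checkpoints[0]:
        stacked = torch.stack([ckpt[name] for ckpt in checkpoints])
        aggregate[name] = stacked.mean(dim=0)
    return aggregate

# Hypothetical usage: build the second DP model from the aggregate.
# second_dp_model.load_state_dict(aggregate_checkpoints(all_checkpoints))
```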

    Using corrections, of automated assistant functions, for training of on-device machine learning models

    Publication number: US11741953B2

    Publication date: 2023-08-29

    Application number: US16973572

    Filing date: 2019-11-08

    Applicant: Google LLC

    CPC classification number: G10L15/22 G10L15/065 G10L15/10 G10L15/30

    Abstract: Processor(s) of a client device can: receive sensor data that captures environmental attributes of an environment of the client device; process the sensor data using a machine learning model to generate a predicted output that dictates whether one or more currently dormant automated assistant functions are activated; make a decision as to whether to trigger the one or more currently dormant automated assistant functions; subsequent to making the decision, determine that the decision was incorrect; and, in response to determining that the decision was incorrect, generate a gradient based on comparing the predicted output to ground truth output. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
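
    This entry shares its abstract with the related US12014739B2 entry above, so rather than repeat the on-device sketch, the following illustrates the complementary remote path: a server applying client-uploaded gradients to the global model's weights. The federated-SGD-style averaging is an assumption; the abstract says only that the gradients are used in "remote updating of global weights".

```python
from typing import List
import torch

def apply_client_gradients(global_weights: List[torch.Tensor],
                           client_gradients: List[List[torch.Tensor]],
                           lr: float = 1e-3) -> None:
    """Average gradients across reporting clients, then take one SGD step.

    client_gradients[c][i] is client c's gradient for the i-th global
    weight tensor, as uploaded by the on-device routine sketched earlier.
    """
    with torch.no_grad():
        for i, weight in enumerate(global_weights):
            avg_grad = torch.stack(
                [grads[i] for grads in client_gradients]
            ).mean(dim=0)
            weight -= lr * avg_grad  # in-place update of the global tensor
```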

    UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN TRAINING MACHINE LEARNING MODEL(S)

    Publication number: US20250045627A1

    Publication date: 2025-02-06

    Application number: US18365487

    Filing date: 2023-08-04

    Applicant: GOOGLE LLC

    Abstract: Processor(s) of a client device can receive global weights of a global ML model from a remote system, obtain a client device data set, determine a Fisher information matrix for the client data set, and transmit the Fisher information matrix for the client data set to the remote system. Further, processor(s) of the remote system can determine a corresponding elastic weight consolidation (EWC) loss term for each of the global weights based on at least the Fisher information matrix, generate a server update for the global ML model based on (i) processing server data remotely at the remote system using the global ML model and (ii) the corresponding EWC loss term for each of the global weights, and update the global weights of the global ML model based on the server update.
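
    A sketch of the per-weight EWC penalty, assuming the standard diagonal-Fisher form from Kirkpatrick et al. (2017): the server penalizes movement of each global weight away from its anchor value, scaled by the client-reported Fisher information. The lambda_ewc coefficient and all variable names are illustrative; the patent only says an EWC loss term is determined per global weight from the Fisher matrix.

```python
from typing import List
import torch

def ewc_loss(weights: List[torch.Tensor],
             anchor_weights: List[torch.Tensor],
             fisher_diagonals: List[torch.Tensor],
             lambda_ewc: float = 0.1) -> torch.Tensor:
    """Compute sum over weights of (lambda/2) * F_i * (w_i - w*_i)^2.

    anchor_weights are the global weights the clients trained against;
    fisher_diagonals are the client-reported Fisher information values,
    one tensor per weight tensor, same shapes as the weights.
    """
    penalty = torch.tensor(0.0)
    for w, w_star, fisher in zip(weights, anchor_weights, fisher_diagonals):
        penalty = penalty + (fisher * (w - w_star) ** 2).sum()
    return 0.5 * lambda_ewc * penalty

# The server update would then minimize:
#   task_loss(server_data) + ewc_loss(weights, anchors, fishers)
# so that learning on server data does not catastrophically overwrite
# what the global weights encode about client data.
```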

    Mixed client-server federated learning of machine learning model(s)

    Publication number: US12205575B2

    Publication date: 2025-01-21

    Application number: US18218319

    Filing date: 2023-07-05

    Applicant: GOOGLE LLC

    Abstract: Implementations disclosed herein are directed to federated learning of machine learning (“ML”) model(s) based on gradient(s) generated at corresponding client devices and a remote system. Processor(s) of the corresponding client devices can process client data generated locally at the corresponding client devices using corresponding on-device ML model(s) to generate corresponding predicted outputs, generate corresponding client gradients based on the corresponding predicted outputs, and transmit the corresponding client gradients to the remote system. Processor(s) of the remote system can process remote data obtained from remote database(s) using global ML model(s) to generate additional corresponding predicted outputs, and generate corresponding remote gradients based on those additional predicted outputs. Further, the remote system can utilize the corresponding client gradients and the corresponding remote gradients to update the global ML model(s) or weights thereof. The updated global ML model(s) and/or the updated weights thereof can be transmitted back to the corresponding client devices.
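
    A sketch of one mixed update round as described above: gradients from client devices and gradients computed at the server over remote data are combined into a single step on the global weights. The equal weighting of the two gradient sources is an assumption, not something the abstract specifies.

```python
from typing import List
import torch

def mixed_round(global_weights: List[torch.Tensor],
                client_grads: List[List[torch.Tensor]],
                remote_grads: List[torch.Tensor],
                lr: float = 1e-3) -> None:
    """Apply one update mixing client-side and server-side gradients.

    client_grads[c][i]: client c's gradient for weight tensor i;
    remote_grads[i]: the server's gradient for weight tensor i,
    computed over remote (server-side) data with the global model.
    """
    with torch.no_grad():
        for i, w in enumerate(global_weights):
            client_avg = torch.stack(
                [g[i] for g in client_grads]
            ).mean(dim=0)
            combined = 0.5 * client_avg + 0.5 * remote_grads[i]  # assumed 50/50 mix
            w -= lr * combined
    # The updated global weights would then be broadcast back to clients.
```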

    FILTERING FOR MIXING SERVER-BASED AND FEDERATED LEARNING

    Publication number: US20240330766A1

    Publication date: 2024-10-03

    Application number: US18609704

    Filing date: 2024-03-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: A method includes receiving, from a client device, a client machine learning (ML) model and obtaining a set of training data including a plurality of training samples. The client ML model is trained locally on the client device. For each respective training sample in the plurality of training samples, the method also includes determining, using the respective training sample, a first loss of the client ML model; determining, using the respective training sample, a second loss of a server machine learning (ML) model; and determining a respective score based on the first loss and the second loss. The method also includes selecting, based on each respective score of each respective training sample in the plurality of training samples, a subset of training samples from the plurality of training samples and training the server ML model using the subset of training samples.
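
    A sketch of the loss-based filtering step. The abstract does not define the scoring function; the difference "server loss minus client loss" used here (keep samples on which the locally trained client model outperforms the server model, i.e., samples the server model has the most to learn from) is one plausible instantiation. The top_k selection and the loss-callable signatures are assumptions.

```python
from typing import Callable, List, Tuple
import torch

Sample = Tuple[torch.Tensor, torch.Tensor]  # (input, label)

def select_training_subset(samples: List[Sample],
                           client_loss: Callable[[torch.Tensor, torch.Tensor], float],
                           server_loss: Callable[[torch.Tensor, torch.Tensor], float],
                           top_k: int) -> List[Sample]:
    """Score each sample from the two losses and keep the top_k."""
    scored = []
    for x, y in samples:
        # Higher score: the client model (trained on real device data)
        # does better here than the current server model.
        score = server_loss(x, y) - client_loss(x, y)
        scored.append((score, (x, y)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [sample for _, sample in scored[:top_k]]

# The server ML model is then trained only on the selected subset.
```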

    Heterogeneous Federated Learning Via Multi-Directional Knowledge Distillation

    Publication number: US20240249193A1

    Publication date: 2024-07-25

    Application number: US18417947

    Filing date: 2024-01-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Generally, the present disclosure is directed to enhanced federated learning (FL) that employs a set of clients with varying amounts of computational resources (e.g., system memory, storage, and processing bandwidth). To overcome the limitations that such resource heterogeneity poses for conventional FL methods, the embodiments run multi-directional knowledge distillation between the server models produced by each federated averaging (FedAvg) pool, using unlabeled server data as the distillation dataset. By co-distilling the two (or more) models frequently over the course of FedAvg rounds, information is shared between the pools without sharing model parameters. This leads to increased performance and faster convergence (in fewer federated rounds).
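
    A sketch of one multi-directional distillation step between the server models of two FedAvg pools, using a batch of unlabeled server data. The KL-divergence objective, temperature, and model/optimizer interfaces are assumptions; the key property the abstract describes is that only predictions are exchanged between pools, never model parameters.

```python
import torch
import torch.nn.functional as F

def co_distill_step(model_a, model_b, opt_a, opt_b,
                    unlabeled_batch: torch.Tensor,
                    temperature: float = 2.0) -> None:
    """One round of mutual distillation: each pool's server model
    learns from the other's softened predictions on unlabeled data."""
    for student, teacher, opt in ((model_a, model_b, opt_a),
                                  (model_b, model_a, opt_b)):
        with torch.no_grad():
            teacher_probs = F.softmax(
                teacher(unlabeled_batch) / temperature, dim=-1)
        student_log_probs = F.log_softmax(
            student(unlabeled_batch) / temperature, dim=-1)
        loss = F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()

# Run between FedAvg rounds so information flows across pools while
# each pool's model parameters stay within that pool.
```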
