Lookup-Table Recurrent Language Model

    Publication Number: US20220310067A1

    Publication Date: 2022-09-29

    Application Number: US17650566

    Application Date: 2022-02-10

    Applicant: Google LLC

    Abstract: A computer-implemented method includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device. The method also includes processing the audio data to determine a candidate transcription that includes a sequence of tokens for the spoken utterance. For each token in the sequence of tokens, the method includes determining a token embedding for the corresponding token, determining an n-gram token embedding for a previous sequence of n-gram tokens, and concatenating the token embedding and the n-gram token embedding to generate a concatenated output for the corresponding token. The method also includes rescoring the candidate transcription for the spoken utterance by processing the concatenated output generated for each corresponding token in the sequence of tokens.
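
    A minimal sketch in Python of the per-token concatenation and rescoring step described above, assuming PyTorch; the class name, the hashing of the previous n-gram into a lookup table, and all dimensions are illustrative assumptions rather than the patented implementation:

        import torch
        import torch.nn as nn

        class LookupTableRescorer(nn.Module):
            def __init__(self, vocab_size, ngram_buckets, dim, n=3):
                super().__init__()
                self.n = n
                self.ngram_buckets = ngram_buckets
                self.token_emb = nn.Embedding(vocab_size, dim)      # embedding for the corresponding token
                self.ngram_emb = nn.Embedding(ngram_buckets, dim)   # lookup table for previous n-gram contexts
                self.scorer = nn.Linear(2 * dim, vocab_size)        # maps a concatenated output to next-token logits

            def rescore(self, tokens):
                # tokens: 1-D LongTensor holding one candidate transcription
                concatenated = []
                for t in range(len(tokens) - 1):
                    tok_e = self.token_emb(tokens[t])
                    prev = tuple(tokens[max(0, t - self.n + 1):t + 1].tolist())
                    bucket = hash(prev) % self.ngram_buckets         # hash the previous n-gram into the table
                    ng_e = self.ngram_emb(torch.tensor(bucket))
                    concatenated.append(torch.cat([tok_e, ng_e], dim=-1))
                log_probs = torch.log_softmax(self.scorer(torch.stack(concatenated)), dim=-1)
                # the rescored value is the total log-probability the model assigns to the following tokens
                return log_probs.gather(1, tokens[1:].unsqueeze(1)).sum()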

    Modeling Ambiguity in Neural Machine Translation

    Publication Number: US20230351125A1

    Publication Date: 2023-11-02

    Application Number: US18089684

    Application Date: 2022-12-28

    Applicant: Google LLC

    CPC classification number: G06F40/58 G06F40/284

    Abstract: The technology addresses ambiguity in neural machine translation. An encoder module receives a given text exemplar and generates an encoded representation of it. A decoder module receives the encoded representation and a set of translation prefixes. The decoder module outputs an unbounded function corresponding to a set of tokens associated with each pair of the given text exemplar and translation prefix from the set of translation prefixes. Each token is assigned a probability between 0 and 1 in a vocabulary of the exemplar at each time step. A logits module generates, based on the unbounded function, a corresponding bounded conditional probability for each token, wherein the probabilities are not normalized over the vocabulary at each time step. A loss function module having a positive loss component and a scaled negative loss component identifies whether each target text of a set of target texts is a valid translation of the exemplar.
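
    A minimal sketch in Python (assuming PyTorch) of token probabilities that are bounded in (0, 1) but not normalized over the vocabulary, together with a loss that has a positive component and a scaled negative component; the sigmoid bounding function, the scale factor, and all names are illustrative assumptions:

        import torch

        def bounded_token_probs(logits):
            # logits: (time, vocab) unbounded decoder outputs for one
            # (text exemplar, translation prefix) pair
            return torch.sigmoid(logits)   # each entry lies in (0, 1); rows do not sum to 1

        def ambiguity_loss(probs, target_ids, is_valid, neg_scale=0.1):
            # probs: (time, vocab); target_ids: (time,) tokens of one candidate target text
            # is_valid: whether the target text is a valid translation of the exemplar
            p = probs.gather(1, target_ids.unsqueeze(1)).clamp_min(1e-8)
            if is_valid:
                return -p.log().mean()                                   # positive loss component
            return -neg_scale * (1.0 - p).clamp_min(1e-8).log().mean()   # scaled negative loss component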

    Heterogeneous Federated Learning Via Multi-Directional Knowledge Distillation

    Publication Number: US20240249193A1

    Publication Date: 2024-07-25

    Application Number: US18417947

    Application Date: 2024-01-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Generally, the present disclosure is directed to enhanced federated learning (FL) that employs a set of clients with varying amounts of computational resources (e.g., system memory, storage, and processing bandwidth). To overcome the limitations that conventional FL methods face when clients have varying amounts of computational resources, the embodiments run multi-directional knowledge distillation between the server models produced by each federated averaging (FedAvg) pool, using unlabeled server data as the distillation dataset. By co-distilling the two (or more) models frequently over the course of FedAvg rounds, information is shared between the pools without sharing model parameters. This leads to increased performance and faster convergence (in fewer federated rounds).
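
    A minimal sketch in Python (assuming PyTorch) of co-distilling two pool-level server models on unlabeled server data between FedAvg rounds; the model interfaces, optimizer, temperature, and data handling are illustrative assumptions:

        import torch
        import torch.nn.functional as F

        def co_distill(model_a, model_b, unlabeled_batches, temperature=2.0, lr=1e-3):
            # unlabeled_batches: any iterable of input batches drawn from the server-side
            # distillation dataset; model_a and model_b may have different architectures
            opt_a = torch.optim.SGD(model_a.parameters(), lr=lr)
            opt_b = torch.optim.SGD(model_b.parameters(), lr=lr)
            for x in unlabeled_batches:
                logits_a, logits_b = model_a(x), model_b(x)
                # soft targets are detached, so the pools exchange predictions, not parameters
                kl_a = F.kl_div(F.log_softmax(logits_a / temperature, dim=-1),
                                F.softmax(logits_b.detach() / temperature, dim=-1),
                                reduction="batchmean")
                kl_b = F.kl_div(F.log_softmax(logits_b / temperature, dim=-1),
                                F.softmax(logits_a.detach() / temperature, dim=-1),
                                reduction="batchmean")
                opt_a.zero_grad(); kl_a.backward(); opt_a.step()   # pool A learns from pool B
                opt_b.zero_grad(); kl_b.backward(); opt_b.step()   # pool B learns from pool A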

    HYBRID FEDERATED LEARNING OF MACHINE LEARNING MODEL(S)

    Publication Number: US20240070530A1

    Publication Date: 2024-02-29

    Application Number: US18074729

    Application Date: 2022-12-05

    Applicant: GOOGLE LLC

    CPC classification number: G06N20/00

    Abstract: Implementations disclosed herein are directed to a hybrid federated learning (FL) technique that utilizes both federated averaging (FA) and federated distillation (FD) during a given round of FL of a given global machine learning (ML) model. Implementations may identify a population of client devices to participate in the given round of FL, determine a corresponding quantity of instances of client data available at each of the client devices that may be utilized during the given round of FL, and select different subsets of the client devices based on the corresponding quantity of instances of client data. Further, implementations may cause a first subset of the client devices to generate a corresponding FA update and a second subset of client devices to generate a corresponding FD update. Moreover, implementations may subsequently update the given global ML model based on the corresponding FA updates and the corresponding FD updates.
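
    A minimal sketch in plain Python of one hybrid round in which clients holding more local data contribute federated-averaging (FA) updates and the remaining clients contribute federated-distillation (FD) updates; the client interface (num_examples, fit, predict), the global-model interface, the threshold, and the shared distillation inputs are all hypothetical:

        import numpy as np

        def hybrid_round(global_model, clients, shared_inputs, data_threshold=100):
            # split the sampled population by how many instances of client data each device holds
            fa_clients = [c for c in clients if c.num_examples >= data_threshold]
            fd_clients = [c for c in clients if c.num_examples < data_threshold]

            # FA subset: locally trained weights, aggregated by averaging
            # (uniform here; FedAvg typically weights by example count)
            fa_updates = [c.fit(global_model.get_weights()) for c in fa_clients]
            if fa_updates:
                global_model.set_weights([np.mean(layer, axis=0) for layer in zip(*fa_updates)])

            # FD subset: soft predictions on shared inputs, aggregated into distillation targets
            fd_updates = [c.predict(shared_inputs) for c in fd_clients]
            if fd_updates:
                global_model.distill(shared_inputs, np.mean(fd_updates, axis=0))
            return global_model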

    Semantic Segmentation With Language Models For Long-Form Automatic Speech Recognition

    Publication Number: US20240290320A1

    Publication Date: 2024-08-29

    Application Number: US18585020

    Application Date: 2024-02-22

    Applicant: Google LLC

    CPC classification number: G10L15/063 G06F40/30 G10L15/26

    Abstract: A joint segmenting and ASR model includes an encoder to receive a sequence of acoustic frames and generate, at each of a plurality of output steps, a higher order feature representation for a corresponding acoustic frame. The model also includes a decoder to generate, based on the higher order feature representation at each of the plurality of output steps, a probability distribution over possible speech recognition hypotheses and an indication of whether the corresponding output step corresponds to an end of segment (EOS). The model is trained on a set of training samples, each training sample including audio data characterizing multiple segments of long-form speech and a corresponding transcription of the long-form speech, the corresponding transcription annotated with ground-truth EOS labels obtained via distillation from a language model teacher that receives the corresponding transcription as input and injects the ground-truth EOS labels into the corresponding transcription between semantically complete segments.
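
    A minimal sketch in Python (assuming PyTorch) of a decoder head that, at each output step, emits both a distribution over speech recognition hypotheses and an end-of-segment (EOS) indication; the layer shapes and names are illustrative assumptions, not the disclosed model:

        import torch
        import torch.nn as nn

        class JointSegmentingDecoderHead(nn.Module):
            def __init__(self, feature_dim, vocab_size):
                super().__init__()
                self.token_head = nn.Linear(feature_dim, vocab_size)  # speech recognition hypotheses
                self.eos_head = nn.Linear(feature_dim, 1)             # end-of-segment indicator

            def forward(self, higher_order_features):
                # higher_order_features: (output_steps, feature_dim) produced by the encoder
                token_probs = torch.softmax(self.token_head(higher_order_features), dim=-1)
                eos_probs = torch.sigmoid(self.eos_head(higher_order_features)).squeeze(-1)
                return token_probs, eos_probs   # per-step distribution and EOS probability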

    ON-DEVICE GRAMMAR CHECKING
    Invention Publication

    Publication Number: US20230359818A1

    Publication Date: 2023-11-09

    Application Number: US18246326

    Application Date: 2020-12-18

    Applicant: Google LLC

    CPC classification number: G06F40/253

    Abstract: A computing device may receive inputted text and perform, using one or more neural networks, on-device grammar checking of a sequence of words in the inputted text, including determining, using the one or more neural networks, a grammatically correct version of the sequence of words and determining that the sequence of words does not match the grammatically correct version of the sequence of words. The computing device may, in response to determining that the sequence of words does not match the grammatically correct version of the sequence of words, output, for display at a display device, at least a portion of the grammatically correct version of the sequence of words as a suggested replacement for at least a portion of the sequence of words in the inputted text.
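
    A minimal sketch in plain Python of the suggestion flow described above; the correction model is represented by a hypothetical callable, and the span-diffing heuristic is an illustrative assumption:

        def suggest_grammar_fix(words, correct):
            # words: list of input words; correct: callable (e.g., an on-device neural
            # model) that maps a word sequence to its grammatically correct version
            corrected = correct(words)
            if corrected == words:
                return None                      # input already matches its correct version
            # trim the common prefix and suffix to isolate the span to replace
            start = 0
            while start < len(words) and start < len(corrected) and words[start] == corrected[start]:
                start += 1
            end_w, end_c = len(words), len(corrected)
            while end_w > start and end_c > start and words[end_w - 1] == corrected[end_c - 1]:
                end_w, end_c = end_w - 1, end_c - 1
            # suggested replacement for the differing portion of the sequence of words
            return {"replace": words[start:end_w], "with": corrected[start:end_c]}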
