Lookup-Table Recurrent Language Model

    Publication Number: US20220310067A1

    Publication Date: 2022-09-29

    Application Number: US17650566

    Application Date: 2022-02-10

    Applicant: Google LLC

    Abstract: A computer-implemented method includes receiving audio data that corresponds to an utterance spoken by a user and captured by a user device. The method also includes processing the audio data to determine a candidate transcription that includes a sequence of tokens for the spoken utterance. For each token in the sequence of tokens, the method includes determining a token embedding for the corresponding token, determining an n-gram token embedding for a previous sequence of n-gram tokens, and concatenating the token embedding and the n-gram token embedding to generate a concatenated output for the corresponding token. The method also includes rescoring the candidate transcription for the spoken utterance by processing the concatenated output generated for each corresponding token in the sequence of tokens.
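
    A minimal sketch in Python of the per-token concatenation and rescoring step described above, assuming PyTorch; the class name, the hashing of the previous n-gram into a lookup table, and all dimensions are illustrative assumptions rather than the patented implementation:

        import torch
        import torch.nn as nn

        class LookupTableRescorer(nn.Module):
            def __init__(self, vocab_size, ngram_buckets, dim, n=3):
                super().__init__()
                self.n = n
                self.ngram_buckets = ngram_buckets
                self.token_emb = nn.Embedding(vocab_size, dim)      # embedding for the corresponding token
                self.ngram_emb = nn.Embedding(ngram_buckets, dim)   # lookup table for previous n-gram contexts
                self.scorer = nn.Linear(2 * dim, vocab_size)        # maps a concatenated output to next-token logits

            def rescore(self, tokens):
                # tokens: 1-D LongTensor holding one candidate transcription
                concatenated = []
                for t in range(len(tokens) - 1):
                    tok_e = self.token_emb(tokens[t])
                    prev = tuple(tokens[max(0, t - self.n + 1):t + 1].tolist())
                    bucket = hash(prev) % self.ngram_buckets         # hash the previous n-gram into the table
                    ng_e = self.ngram_emb(torch.tensor(bucket))
                    concatenated.append(torch.cat([tok_e, ng_e], dim=-1))
                log_probs = torch.log_softmax(self.scorer(torch.stack(concatenated)), dim=-1)
                # the rescored value is the total log-probability the model assigns to the following tokens
                return log_probs.gather(1, tokens[1:].unsqueeze(1)).sum()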

    Modeling Ambiguity in Neural Machine Translation

    Publication Number: US20230351125A1

    Publication Date: 2023-11-02

    Application Number: US18089684

    Application Date: 2022-12-28

    Applicant: Google LLC

    CPC classification number: G06F40/58 G06F40/284

    Abstract: The technology addresses ambiguity in neural machine translation. An encoder module receives a given text exemplar and generates an encoded representation of it. A decoder module receives the encoded representation and a set of translation prefixes. The decoder module outputs an unbounded function corresponding to a set of tokens associated with each pair of the given text exemplar and translation prefix from the set of translation prefixes. Each token is assigned a probability between 0 and 1 in a vocabulary of the exemplar at each time step. A logits module generates, based on the unbounded function, a corresponding bounded conditional probability for each token, wherein the probabilities are not normalized over the vocabulary at each time step. A loss function module having a positive loss component and a scaled negative loss component identifies whether each target text of a set of target texts is a valid translation of the exemplar.
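
    A minimal sketch in Python (assuming PyTorch) of token probabilities that are bounded in (0, 1) but not normalized over the vocabulary, together with a loss that has a positive component and a scaled negative component; the sigmoid bounding function, the scale factor, and all names are illustrative assumptions:

        import torch

        def bounded_token_probs(logits):
            # logits: (time, vocab) unbounded decoder outputs for one
            # (text exemplar, translation prefix) pair
            return torch.sigmoid(logits)   # each entry lies in (0, 1); rows do not sum to 1

        def ambiguity_loss(probs, target_ids, is_valid, neg_scale=0.1):
            # probs: (time, vocab); target_ids: (time,) tokens of one candidate target text
            # is_valid: whether the target text is a valid translation of the exemplar
            p = probs.gather(1, target_ids.unsqueeze(1)).clamp_min(1e-8)
            if is_valid:
                return -p.log().mean()                                   # positive loss component
            return -neg_scale * (1.0 - p).clamp_min(1e-8).log().mean()   # scaled negative loss component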

    Heterogeneous Federated Learning Via Multi-Directional Knowledge Distillation

    Publication Number: US20240249193A1

    Publication Date: 2024-07-25

    Application Number: US18417947

    Application Date: 2024-01-19

    Applicant: Google LLC

    CPC classification number: G06N20/00

    Abstract: Generally, the present disclosure is directed to enhanced federated learning (FL) that employs a set of clients with varying amounts of computational resources (e.g., system memory, storage, and processing bandwidth). To overcome the limitations that conventional FL methods face when clients have varying amounts of computational resources, the embodiments run multi-directional knowledge distillation between the server models produced by each federated averaging (FedAvg) pool, using unlabeled server data as the distillation dataset. By co-distilling the two (or more) models frequently over the course of FedAvg rounds, information is shared between the pools without sharing model parameters. This leads to increased performance and faster convergence (in fewer federated rounds).
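
    A minimal sketch in Python (assuming PyTorch) of co-distilling two pool-level server models on unlabeled server data between FedAvg rounds; the model interfaces, optimizer, temperature, and data handling are illustrative assumptions:

        import torch
        import torch.nn.functional as F

        def co_distill(model_a, model_b, unlabeled_batches, temperature=2.0, lr=1e-3):
            # unlabeled_batches: any iterable of input batches drawn from the server-side
            # distillation dataset; model_a and model_b may have different architectures
            opt_a = torch.optim.SGD(model_a.parameters(), lr=lr)
            opt_b = torch.optim.SGD(model_b.parameters(), lr=lr)
            for x in unlabeled_batches:
                logits_a, logits_b = model_a(x), model_b(x)
                # soft targets are detached, so the pools exchange predictions, not parameters
                kl_a = F.kl_div(F.log_softmax(logits_a / temperature, dim=-1),
                                F.softmax(logits_b.detach() / temperature, dim=-1),
                                reduction="batchmean")
                kl_b = F.kl_div(F.log_softmax(logits_b / temperature, dim=-1),
                                F.softmax(logits_a.detach() / temperature, dim=-1),
                                reduction="batchmean")
                opt_a.zero_grad(); kl_a.backward(); opt_a.step()   # pool A learns from pool B
                opt_b.zero_grad(); kl_b.backward(); opt_b.step()   # pool B learns from pool A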

    HYBRID FEDERATED LEARNING OF MACHINE LEARNING MODEL(S)

    Publication Number: US20240070530A1

    Publication Date: 2024-02-29

    Application Number: US18074729

    Application Date: 2022-12-05

    Applicant: GOOGLE LLC

    CPC classification number: G06N20/00

    Abstract: Implementations disclosed herein are directed to a hybrid federated learning (FL) technique that utilizes both federated averaging (FA) and federated distillation (FD) during a given round of FL of a given global machine learning (ML) model. Implementations may identify a population of client devices to participate in the given round of FL, determine a corresponding quantity of instances of client data available at each of the client devices that may be utilized during the given round of FL, and select different subsets of the client devices based on the corresponding quantity of instances of client data. Further, implementations may cause a first subset of the client devices to generate a corresponding FA update and a second subset of client devices to generate a corresponding FD update. Moreover, implementations may subsequently update the given global ML model based on the corresponding FA updates and the corresponding FD updates.
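
    A minimal sketch in plain Python of one hybrid round in which clients holding more local data contribute federated-averaging (FA) updates and the remaining clients contribute federated-distillation (FD) updates; the client interface (num_examples, fit, predict), the global-model interface, the threshold, and the shared distillation inputs are all hypothetical:

        import numpy as np

        def hybrid_round(global_model, clients, shared_inputs, data_threshold=100):
            # split the sampled population by how many instances of client data each device holds
            fa_clients = [c for c in clients if c.num_examples >= data_threshold]
            fd_clients = [c for c in clients if c.num_examples < data_threshold]

            # FA subset: locally trained weights, aggregated by averaging
            # (uniform here; FedAvg typically weights by example count)
            fa_updates = [c.fit(global_model.get_weights()) for c in fa_clients]
            if fa_updates:
                global_model.set_weights([np.mean(layer, axis=0) for layer in zip(*fa_updates)])

            # FD subset: soft predictions on shared inputs, aggregated into distillation targets
            fd_updates = [c.predict(shared_inputs) for c in fd_clients]
            if fd_updates:
                global_model.distill(shared_inputs, np.mean(fd_updates, axis=0))
            return global_model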

    Semantic Segmentation With Language Models For Long-Form Automatic Speech Recognition

    Publication Number: US20240290320A1

    Publication Date: 2024-08-29

    Application Number: US18585020

    Application Date: 2024-02-22

    Applicant: Google LLC

    CPC classification number: G10L15/063 G06F40/30 G10L15/26

    Abstract: A joint segmenting and ASR model includes an encoder to receive a sequence of acoustic frames and generate, at each of a plurality of output steps, a higher order feature representation for a corresponding acoustic frame. The model also includes a decoder to generate, based on the higher order feature representation at each of the plurality of output steps, a probability distribution over possible speech recognition hypotheses and an indication of whether the corresponding output step corresponds to an end of segment (EOS). The model is trained on a set of training samples, each training sample including audio data characterizing multiple segments of long-form speech and a corresponding transcription of the long-form speech, the corresponding transcription annotated with ground-truth EOS labels obtained via distillation from a language model teacher that receives the corresponding transcription as input and injects the ground-truth EOS labels into the corresponding transcription between semantically complete segments.
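
    A minimal sketch in Python (assuming PyTorch) of a decoder head that, at each output step, emits both a distribution over speech recognition hypotheses and an end-of-segment (EOS) indication; the layer shapes and names are illustrative assumptions, not the disclosed model:

        import torch
        import torch.nn as nn

        class JointSegmentingDecoderHead(nn.Module):
            def __init__(self, feature_dim, vocab_size):
                super().__init__()
                self.token_head = nn.Linear(feature_dim, vocab_size)  # speech recognition hypotheses
                self.eos_head = nn.Linear(feature_dim, 1)             # end-of-segment indicator

            def forward(self, higher_order_features):
                # higher_order_features: (output_steps, feature_dim) produced by the encoder
                token_probs = torch.softmax(self.token_head(higher_order_features), dim=-1)
                eos_probs = torch.sigmoid(self.eos_head(higher_order_features)).squeeze(-1)
                return token_probs, eos_probs   # per-step distribution and EOS probability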

    ON-DEVICE GRAMMAR CHECKING
    Invention Publication

    Publication Number: US20230359818A1

    Publication Date: 2023-11-09

    Application Number: US18246326

    Application Date: 2020-12-18

    Applicant: Google LLC

    CPC classification number: G06F40/253

    Abstract: A computing device may receive inputted text and perform, using one or more neural networks, on-device grammar checking of a sequence of words in the inputted text, including determining, using the one or more neural networks, a grammatically correct version of the sequence of words and determining that the sequence of words does not match the grammatically correct version of the sequence of words. The computing device may, in response to determining that the sequence of words does not match the grammatically correct version of the sequence of words, output, for display at a display device, at least a portion of the grammatically correct version of the sequence of words as a suggested replacement for at least a portion of the sequence of words in the inputted text.
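
    A minimal sketch in plain Python of the suggestion flow described above; the correction model is represented by a hypothetical callable, and the span-diffing heuristic is an illustrative assumption:

        def suggest_grammar_fix(words, correct):
            # words: list of input words; correct: callable (e.g., an on-device neural
            # model) that maps a word sequence to its grammatically correct version
            corrected = correct(words)
            if corrected == words:
                return None                      # input already matches its correct version
            # trim the common prefix and suffix to isolate the span to replace
            start = 0
            while start < len(words) and start < len(corrected) and words[start] == corrected[start]:
                start += 1
            end_w, end_c = len(words), len(corrected)
            while end_w > start and end_c > start and words[end_w - 1] == corrected[end_c - 1]:
                end_w, end_c = end_w - 1, end_c - 1
            # suggested replacement for the differing portion of the sequence of words
            return {"replace": words[start:end_w], "with": corrected[start:end_c]}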
