Flickering Reduction with Partial Hypothesis Re-ranking for Streaming ASR

    Publication Number: US20240029718A1

    Publication Date: 2024-01-25

    Application Number: US18352211

    Application Date: 2023-07-13

    Applicant: Google LLC

    CPC classification number: G10L15/10 G10L15/26

    Abstract: A method includes processing, using a speech recognizer, a first portion of audio data to generate a first lattice, and generating a first partial transcription for an utterance based on the first lattice. The method includes processing, using the recognizer, a second portion of the data to generate, based on the first lattice, a second lattice representing a plurality of partial speech recognition hypotheses for the utterance and a plurality of corresponding speech recognition scores. For each particular partial speech recognition hypothesis, the method includes generating a corresponding re-ranked score based on the corresponding speech recognition score and whether the particular partial speech recognition hypothesis shares a prefix with the first partial transcription. The method includes generating a second partial transcription for the utterance by selecting the partial speech recognition hypothesis of the plurality of partial speech recognition hypotheses having the highest corresponding re-ranked score.
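
    The following is a minimal Python sketch of the kind of prefix-aware re-ranking the abstract describes, assuming a simple additive score bonus; the function names, the bonus value, and the example hypotheses are illustrative and not taken from the patent.

        # Minimal sketch of prefix-biased re-ranking of streaming partial hypotheses.
        # All names (rerank_partials, prefix_bonus) are illustrative, not from the patent.

        def shares_prefix(hypothesis: str, previous_partial: str) -> bool:
            """True if the hypothesis keeps the previously displayed partial as a prefix."""
            return hypothesis.startswith(previous_partial)

        def rerank_partials(hypotheses, previous_partial, prefix_bonus=2.0):
            """Re-score (text, asr_score) pairs, boosting those consistent with the
            previous partial transcription, and return the best text."""
            reranked = []
            for text, asr_score in hypotheses:
                bonus = prefix_bonus if shares_prefix(text, previous_partial) else 0.0
                reranked.append((asr_score + bonus, text))
            return max(reranked)[1]

        # Example: the raw top hypothesis would flip the already-displayed words,
        # but re-ranking keeps the stable prefix "send a" on screen.
        partials = [("send the message", 10.1), ("send a message", 9.8)]
        print(rerank_partials(partials, previous_partial="send a"))  # -> "send a message"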

    Optimizing Inference Performance for Conformer

    Publication Number: US20230130634A1

    Publication Date: 2023-04-27

    Application Number: US17936547

    Application Date: 2022-09-29

    Applicant: Google LLC

    Abstract: A computer-implemented method includes receiving a sequence of acoustic frames as input to an automatic speech recognition (ASR) model. Here, the ASR model includes a causal encoder and a decoder. The method also includes generating, by the causal encoder, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The method also includes generating, by the decoder, a first probability distribution over possible speech recognition hypotheses. Here, the causal encoder includes a stack of causal encoder layers each including a Recurrent Neural Network (RNN) Attention-Performer module that applies linear attention.
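
    To illustrate why linear attention matters for streaming inference, here is a Python/NumPy sketch of causal linear attention computed with running prefix sums; the elu-plus-one feature map is a simplification and is not the Performer's FAVOR+ random-feature map or the patented module.

        # Minimal sketch of causal linear attention, the kind of kernelized attention
        # a Performer-style module can apply so streaming cost stays linear in time.
        import numpy as np

        def causal_linear_attention(q, k, v):
            """q, k, v: (T, d). Returns (T, d) outputs using running prefix sums,
            so each step reuses state instead of re-attending to all past frames."""
            def phi(x):  # simple positive feature map (elu + 1), assumed for illustration
                return np.where(x > 0, x + 1.0, np.exp(x))
            q, k = phi(q), phi(k)
            d = q.shape[-1]
            kv_state = np.zeros((d, v.shape[-1]))   # running sum of phi(k_s) v_s^T
            k_state = np.zeros(d)                   # running sum of phi(k_s)
            outputs = []
            for t in range(q.shape[0]):
                kv_state += np.outer(k[t], v[t])
                k_state += k[t]
                outputs.append(q[t] @ kv_state / (q[t] @ k_state + 1e-6))
            return np.stack(outputs)

        frames = np.random.randn(5, 8)  # 5 acoustic frames, 8-dim features
        print(causal_linear_attention(frames, frames, frames).shape)  # (5, 8)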

    ENABLING NATURAL CONVERSATIONS WITH SOFT ENDPOINTING FOR AN AUTOMATED ASSISTANT

    Publication Number: US20230053341A1

    Publication Date: 2023-02-23

    Application Number: US17532819

    Application Date: 2021-11-22

    Applicant: GOOGLE LLC

    Abstract: As part of a dialog session between a user and an automated assistant, implementations can process, using a streaming ASR model, a stream of audio data that captures a portion of a spoken utterance to generate ASR output, process, using an NLU model, the ASR output to generate NLU output, and cause, based on the NLU output, a stream of fulfillment data to be generated. Further, implementations can determine, based on processing the stream of audio data, audio-based characteristics associated with the portion of the spoken utterance captured in the stream of audio data. Based on the audio-based characteristics and/or the stream of NLU output, implementations can determine whether the user has paused in providing the spoken utterance or has completed providing the spoken utterance. If the user has paused, implementations can cause natural conversation output to be provided for presentation to the user.
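
    A hedged Python sketch of the pause-versus-completion decision the abstract describes, combining audio-based characteristics with NLU output; the thresholds, field names, and filler response below are assumptions for illustration only.

        # Illustrative sketch of the pause-vs-complete decision; thresholds, field
        # names, and the filler response are assumptions, not values from the patent.
        from dataclasses import dataclass

        @dataclass
        class AudioCharacteristics:
            trailing_silence_ms: int   # silence after the last spoken word
            final_pitch_falling: bool  # falling intonation often marks completion

        @dataclass
        class NLUOutput:
            intent: str
            missing_slots: list        # required parameters not yet provided

        def user_finished(audio: AudioCharacteristics, nlu: NLUOutput) -> bool:
            """Treat the utterance as complete only when both the acoustics and the
            NLU output suggest there is nothing more to come."""
            acoustically_done = audio.trailing_silence_ms > 700 and audio.final_pitch_falling
            semantically_done = not nlu.missing_slots
            return acoustically_done and semantically_done

        audio = AudioCharacteristics(trailing_silence_ms=900, final_pitch_falling=False)
        nlu = NLUOutput(intent="set_timer", missing_slots=["duration"])
        if not user_finished(audio, nlu):
            # Natural conversation output keeps the mic open instead of cutting the user off.
            print("Mm-hmm, for how long?")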

    Emitting word timings with end-to-end models

    Publication Number: US11580956B2

    Publication Date: 2023-02-14

    Application Number: US17204852

    Application Date: 2021-03-17

    Applicant: Google LLC

    Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
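
    The following Python sketch illustrates only the data-preparation step the abstract describes: inserting a placeholder symbol before each word and pinning its beginning and ending word pieces to the ground-truth frame alignments. The toy tokenizer, placeholder token, and frame tolerance are assumptions, not details from the patent.

        # Data-preparation sketch: mark each word with a placeholder symbol and pin its
        # first and last word pieces to the ground-truth start/end frames.
        WORD_START = "<w>"   # placeholder symbol inserted before every word (illustrative)

        def tokenize(word):
            """Toy word-piece tokenizer standing in for a real one (e.g. SentencePiece)."""
            return [word[:2], word[2:]] if len(word) > 2 else [word]

        def constrained_alignments(words, ground_truth, tolerance=2):
            """words: list of str; ground_truth: list of (start_frame, end_frame).
            Returns (symbol sequence, {position: allowed frame window}) so a decoder's
            attention head can be constrained to emit word boundaries on time."""
            symbols, constraints = [], {}
            for word, (start, end) in zip(words, ground_truth):
                symbols.append(WORD_START)                   # placeholder before the word
                pieces = tokenize(word)
                first_pos = len(symbols)                     # index of beginning word piece
                symbols.extend(pieces)
                last_pos = len(symbols) - 1                  # index of ending word piece
                constraints[first_pos] = (start - tolerance, start + tolerance)
                constraints[last_pos] = (end - tolerance, end + tolerance)
            return symbols, constraints

        syms, cons = constrained_alignments(["hello", "world"], [(3, 12), (14, 25)])
        print(syms)   # ['<w>', 'he', 'llo', '<w>', 'wo', 'rld']
        print(cons)   # frame windows for each beginning/ending word piece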

    Disfluency Detection Models for Natural Conversational Voice Systems

    Publication Number: US20250140239A1

    Publication Date: 2025-05-01

    Application Number: US19010299

    Application Date: 2025-01-06

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of acoustic frames characterizing one or more utterances. At each of a plurality of output steps, the method also includes generating, by an encoder network of a speech recognition model, a higher order feature representation for a corresponding acoustic frame of the sequence of acoustic frames, generating, by a prediction network of the speech recognition model, a hidden representation for a corresponding sequence of non-blank symbols output by a final softmax layer of the speech recognition model, and generating, by a first joint network of the speech recognition model that receives the higher order feature representation generated by the encoder network and the hidden representation generated by the prediction network, a probability distribution indicating whether the corresponding output step corresponds to a pause or an end of speech.
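
    A structural Python/NumPy sketch of a joint network that fuses the encoder's higher order feature representation with the prediction network's hidden representation and outputs a pause/end-of-speech distribution; the dimensions, the extra "speech" class, and the random weights are assumptions for illustration, not the patented model.

        # Structural sketch of an RNN-T-style joint network emitting a distribution
        # over {speech, pause, end_of_speech}. Weights here are random placeholders.
        import numpy as np

        rng = np.random.default_rng(0)
        ENC_DIM, PRED_DIM, JOINT_DIM, CLASSES = 16, 8, 12, 3  # speech, pause, end_of_speech

        W_enc = rng.normal(size=(ENC_DIM, JOINT_DIM))
        W_pred = rng.normal(size=(PRED_DIM, JOINT_DIM))
        W_out = rng.normal(size=(JOINT_DIM, CLASSES))

        def softmax(x):
            e = np.exp(x - x.max())
            return e / e.sum()

        def pause_eos_joint(encoder_feature, prediction_hidden):
            """Combine the two representations (as in an RNN-T joint network) and
            return P(speech), P(pause), P(end_of_speech) for the current output step."""
            joint = np.tanh(encoder_feature @ W_enc + prediction_hidden @ W_pred)
            return softmax(joint @ W_out)

        probs = pause_eos_joint(rng.normal(size=ENC_DIM), rng.normal(size=PRED_DIM))
        print(dict(zip(["speech", "pause", "end_of_speech"], probs.round(3))))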

    Adapter Finetuning with Teacher Pseudo-Labeling for Tail Languages in Streaming Multilingual ASR

    Publication Number: US20250078830A1

    Publication Date: 2025-03-06

    Application Number: US18826743

    Application Date: 2024-09-06

    Applicant: Google LLC

    Abstract: A method includes receiving a sequence of acoustic frames characterizing a spoken utterance in a particular native language. The method also includes generating a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames by a causal encoder that includes an initial stack of multi-head attention layers. The method also includes generating a second higher order feature representation for a corresponding first higher order feature representation by a non-causal encoder that includes a final stack of multi-head attention layers. The method also includes receiving, as input at each corresponding language-dependent adapter (LDA) module, a language ID vector identifying the particular native language to activate corresponding language-dependent weights specific to the particular native language. The method also includes generating a first probability distribution over possible speech recognition hypotheses by a decoder.
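
    A minimal Python/NumPy sketch of a language-dependent adapter (LDA) module in which a one-hot language ID vector activates per-language weights applied to an encoder feature; the sizes, bottleneck form, and residual connection are assumptions rather than the patented design.

        # Sketch of a language-dependent adapter: a one-hot language ID selects
        # per-language bottleneck weights applied to an encoder layer output.
        import numpy as np

        rng = np.random.default_rng(1)
        MODEL_DIM, BOTTLENECK, NUM_LANGUAGES = 16, 4, 3

        # One small adapter (down-projection + up-projection) per supported language.
        down = rng.normal(size=(NUM_LANGUAGES, MODEL_DIM, BOTTLENECK)) * 0.1
        up = rng.normal(size=(NUM_LANGUAGES, BOTTLENECK, MODEL_DIM)) * 0.1

        def language_dependent_adapter(hidden, language_id_vector):
            """hidden: (MODEL_DIM,) encoder feature; language_id_vector: one-hot (NUM_LANGUAGES,).
            Activates only the weights of the identified language; a residual connection
            keeps the shared encoder representation intact."""
            lang = int(np.argmax(language_id_vector))
            adapted = np.maximum(hidden @ down[lang], 0.0) @ up[lang]  # ReLU bottleneck
            return hidden + adapted

        language_id = np.eye(NUM_LANGUAGES)[2]          # e.g. the third (tail) language
        frame_feature = rng.normal(size=MODEL_DIM)
        print(language_dependent_adapter(frame_feature, language_id).shape)  # (16,)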
