Patent search ap:("GOOGLE LLC") AND inv:"Ian C. McGraw" Page 1

1.

Granted invention patent
Voice shortcut detection with speaker verification (In force)

Publication number: US11568878B2

Publication date: 2023-01-31

Application number: US17233253

Application date: 2021-04-16

Applicant: Google LLC

Inventor: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw

IPC: G10L17/24, G10L17/06, G10L21/028

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.
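The flow this abstract describes (separate the speech, verify the speaker, transcribe, then match a keyphrase) can be sketched roughly as below. This is a minimal illustration, not the patented implementation: the class name `ShortcutDetector`, its fields, and every model callable are hypothetical stand-ins, and plain substring matching is an assumed way of checking whether a portion of the text captures a keyphrase.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence

Audio = Sequence[float]  # hypothetical representation of raw audio samples


@dataclass
class ShortcutDetector:
    """Hypothetical pipeline; each callable stands in for a trained model."""
    separate: Callable[[Audio], Audio]              # speaker separation model
    is_registered_speaker: Callable[[Audio], bool]  # text-independent speaker ID
    transcribe: Callable[[Audio], str]              # ASR model
    actions: Dict[str, Callable[[], str]]           # keyphrase -> action

    def process(self, audio: Audio) -> Optional[str]:
        # Isolate the spoken utterance from other sounds.
        separated = self.separate(audio)
        # Only a verified/registered user may trigger a shortcut.
        if not self.is_registered_speaker(separated):
            return None
        text = self.transcribe(separated).lower()
        # Assumed matching rule: the keyphrase appears somewhere in the text.
        for phrase, action in self.actions.items():
            if phrase in text:
                return action()
        return None
```

In this sketch, customizing the detector for a new keyphrase only means adding an entry to `actions`; none of the stand-in models is retrained, which mirrors the property the abstract highlights.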

2.

Granted invention patent
Key phrase spotting (In force)

Publication number: US11295739B2

Publication date: 2022-04-05

Application number: US16527487

Application date: 2019-07-31

Applicant: Google LLC

Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin

IPC: G10L15/06, G10L15/16, G10L15/22, G10L15/18, G10L19/00, G10L15/02, G10L15/08, G10L15/14

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
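As a rough illustration of the method in this abstract, the sketch below computes an attention output over the encoder outputs received so far and turns it into a key phrase probability after each new frame. The abstract does not say which attention mechanism is used, so dot-product attention with a fixed query is an assumption here, and the function names, the `query` vector, and the `out_weights` vector are all hypothetical.

```python
import math
from typing import Iterable, Iterator, List, Sequence

Vector = Sequence[float]


def attention_output(encodings: Sequence[Vector], query: Vector) -> List[float]:
    """Softmax-weighted sum of the encodings (assumed dot-product attention)."""
    scores = [sum(q * e for q, e in zip(query, enc)) for enc in encodings]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(encodings[0])
    return [
        sum(w * enc[i] for w, enc in zip(weights, encodings)) / total
        for i in range(dim)
    ]


def keyphrase_probability(context: Vector, out_weights: Vector, bias: float = 0.0) -> float:
    """Hypothetical output layer: sigmoid of a linear projection."""
    logit = sum(c * w for c, w in zip(context, out_weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def stream_scores(frames: Iterable[Vector], query: Vector, out_weights: Vector) -> Iterator[float]:
    """Yield one probability per encoder frame while audio keeps arriving."""
    seen: List[Vector] = []
    for enc in frames:
        seen.append(enc)
        yield keyphrase_probability(attention_output(seen, query), out_weights)
```

A caller would compare each yielded probability against a detection threshold of its choosing to decide whether the audio likely encodes the key phrase.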

3.

Invention patent application
KEY PHRASE SPOTTING (Under examination, published)

Publication number: US20200066271A1

Publication date: 2020-02-27

Application number: US16527487

Application date: 2019-07-31

Applicant: Google LLC

Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin

IPC: G10L15/22, G10L15/06, G10L15/18, G10L19/00, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

4.

Invention patent application
TWO-PASS END TO END SPEECH RECOGNITION (In force)

Publication number: US20220310072A1

Publication date: 2022-09-29

Application number: US17616129

Application date: 2020-06-03

Applicant: GOOGLE LLC

Inventor: Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian C. McGraw, Chung-Cheng Chiu

IPC: G10L15/16, G10L15/32, G10L15/05

Abstract: Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
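The two-pass structure can be sketched as follows. The abstract only says the second pass revises the first-pass candidates; treating that revision as rescoring with a linear interpolation of the two scores is an assumption of this sketch, and `encode`, `first_pass`, `second_pass_score`, and `weight` are hypothetical names standing in for the shared encoder, the RNN-T decoder, the LAS decoder, and a tuning parameter.

```python
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[str, float]  # (candidate text, first-pass log-probability)


def two_pass_recognize(
    audio_frames: Sequence[object],
    encode: Callable[[object], object],
    first_pass: Callable[[List[object]], List[Hypothesis]],
    second_pass_score: Callable[[List[object], str], float],
    weight: float = 0.5,
) -> str:
    """Return the candidate with the best combined first- and second-pass score."""
    # The shared encoder runs once; both passes consume its output.
    encoded = [encode(frame) for frame in audio_frames]
    # First pass: streaming candidate recognitions.
    candidates = first_pass(encoded)

    def combined(hyp: Hypothesis) -> float:
        text, first_score = hyp
        # Assumed revision rule: interpolate the two log-probabilities.
        return (1.0 - weight) * first_score + weight * second_pass_score(encoded, text)

    return max(candidates, key=combined)[0]
```

Setting `weight` to 0 reduces this sketch to the first pass alone, which makes it easy to see what the second pass changes.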

5.

Invention patent application
KEY PHRASE SPOTTING (In force)

Publication number: US20220199084A1

Publication date: 2022-06-23

Application number: US17654195

Application date: 2022-03-09

Applicant: Google LLC

Inventor: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin

IPC: G10L15/22, G10L15/06, G10L15/18, G10L19/00, G10L15/02

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

6.

Granted invention patent
Selecting alternates in speech recognition (In force)

Publication number: US10140978B2

Publication date: 2018-11-27

Application number: US15703033

Application date: 2017-09-13

Applicant: Google LLC

Inventor: Alexander H. Gruenstein, Dave Harwath, Ian C. McGraw

IPC: G10L15/00, G10L15/08, G10L15/04, G10L15/22

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting alternates in speech recognition. In some implementations, data is received that indicates multiple speech recognition hypotheses for an utterance. Based on the multiple speech recognition hypotheses, multiple alternates for a particular portion of a transcription of the utterance are identified. For each of the identified alternates, one or more feature scores are determined, the feature scores are input to a trained classifier, and an output is received from the classifier. A subset of the identified alternates is selected, based on the classifier outputs, to provide for display. Data indicating the selected subset of the alternates is provided for display.
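The selection step can be sketched as below: each alternate gets feature scores, a classifier maps those scores to an output, and a subset is chosen from the outputs. The abstract does not state the selection rule, so keeping alternates whose classifier output clears a threshold and capping the count is an assumption; `feature_scores`, `classifier`, `threshold`, and `max_shown` are hypothetical names.

```python
from typing import Callable, List, Sequence


def select_alternates(
    alternates: Sequence[str],
    feature_scores: Callable[[str], List[float]],
    classifier: Callable[[List[float]], float],
    threshold: float = 0.5,
    max_shown: int = 3,
) -> List[str]:
    """Return the alternates to display, best classifier output first."""
    # Score every identified alternate with the (stand-in) trained classifier.
    scored = [(classifier(feature_scores(alt)), alt) for alt in alternates]
    # Assumed rule: keep only alternates the classifier rates highly enough.
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [alt for _, alt in kept[:max_shown]]
```

The cap on the number of displayed alternates is a design choice of this sketch, motivated by limited screen space, and is not taken from the abstract.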

7.

Granted invention patent
Automatic hotword threshold tuning (In force)

Publication number: US12277928B2

Publication date: 2025-04-15

Application number: US18181895

Application date: 2023-03-10

Applicant: Google LLC

Inventor: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw

IPC: G10L15/065, G10L15/22

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.
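The tuning loop in this abstract can be sketched as a sliding-window counter. This is only an illustration under stated assumptions: the abstract says the threshold is adjusted but not in which direction or by how much, so raising it by a fixed `step` is assumed, and the class name `HotwordThresholdTuner` and all of its fields and default values are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque


@dataclass
class HotwordThresholdTuner:
    """Hypothetical tuner for a first-stage hotword detection threshold."""
    threshold: float = 0.5          # current first-stage threshold
    max_false_accepts: int = 3      # tolerated false accepts per window
    window_seconds: float = 3600.0  # false acceptance time period
    step: float = 0.05              # assumed adjustment size
    max_threshold: float = 0.95
    _false_accepts: Deque[float] = field(default_factory=deque)

    def report(self, timestamp: float, second_stage_detected: bool) -> float:
        """Record one first-stage trigger; return the possibly updated threshold."""
        # A first-stage trigger the second stage rejects is a false acceptance.
        if not second_stage_detected:
            self._false_accepts.append(timestamp)
        # Drop false acceptances that fell out of the time window.
        while self._false_accepts and timestamp - self._false_accepts[0] > self.window_seconds:
            self._false_accepts.popleft()
        # Too many false accepts in the window: make the first stage stricter.
        if len(self._false_accepts) > self.max_false_accepts:
            self.threshold = min(self.max_threshold, self.threshold + self.step)
            self._false_accepts.clear()
        return self.threshold
```

Clearing the window after an adjustment is a choice made here so that one burst of false accepts triggers a single change rather than several in a row.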

8.

Published invention application
VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION (Under examination, published)

Publication number: US20240363122A1

Publication date: 2024-10-31

Application number: US18765108

Application date: 2024-07-05

Applicant: GOOGLE LLC

Inventor: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw

IPC: G10L17/24, G10L15/26, G10L17/06, G10L21/028

CPC classification number: G10L17/24, G10L15/26, G10L17/06, G10L21/028

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.

9.

Granted invention patent
Voice shortcut detection with speaker verification (In force)

Publication number: US12033641B2

Publication date: 2024-07-09

Application number: US18103324

Application date: 2023-01-30

Applicant: Google LLC

Inventor: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw

IPC: G10L17/24, G10L15/26, G10L17/06, G10L21/028

CPC classification number: G10L17/24, G10L15/26, G10L17/06, G10L21/028

Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.

10.

Published invention application
Automatic Hotword Threshold Tuning (Under examination, published)

Publication number: US20230206908A1

Publication date: 2023-06-29

Application number: US18181895

Application date: 2023-03-10

Applicant: Google LLC

Inventor: Aishanee Shah, Alexander H. Gruenstein, Ian C. McGraw

IPC: G10L15/065, G10L15/22

CPC classification number: G10L15/065, G10L15/22, G10L2015/223

Abstract: A method for automatic hotword threshold tuning includes receiving, from a user device executing a first stage hotword detector configured to detect a hotword in streaming audio, audio data characterizing the detected hotword. The method includes processing, using a second stage hotword detector, the audio data to determine whether the hotword is detected by the second stage hotword detector. When the hotword is not detected, the method includes identifying a false acceptance instance at the first stage hotword detector indicating that the first stage hotword detector incorrectly detected the hotword. The method includes determining whether a false acceptance rate satisfies a false acceptance rate threshold based on a number of false acceptance instances within a false acceptance time period. When the false acceptance rate satisfies the false acceptance rate threshold, the method includes adjusting the hotword detection threshold of the first stage hotword detector.
