VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION

Invention Publication

US20230169984A1 VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION (under examination, published)

Patent Title: VOICE SHORTCUT DETECTION WITH SPEAKER VERIFICATION
Application No.: US18103324

Application Date: 2023-01-30
Publication No.: US20230169984A1

Publication Date: 2023-06-01
Inventor: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Main IPC: G10L17/24
IPC: G10L17/24; G10L17/06; G10L21/028

Abstract:

Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance. Additionally or alternatively, the text representation of the utterance can be processed to determine whether at least a portion of the text representation of the utterance captures a particular keyphrase. When the system determines the registered and/or verified user spoke the utterance and the system determines the text representation of the utterance captures the particular keyphrase, the system can cause a computing device to perform one or more actions corresponding to the particular keyphrase.
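
The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the implementation claimed in the application. The three models are passed in as plain callables (`separate`, `embed`, `transcribe`), all hypothetical stand-ins. Deciding verification by cosine similarity against an enrolled embedding, the `threshold` value, and the token-based keyphrase match are likewise assumptions made only for this example.

```python
import math
import re
from typing import Callable, Dict, Iterable, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_keyphrase(transcript: str, keyphrases: Iterable[str]) -> Optional[str]:
    """Return the first keyphrase whose tokens occur contiguously in the transcript."""
    tokens = tokenize(transcript)
    for phrase in keyphrases:
        wanted = tokenize(phrase)
        n = len(wanted)
        if n == 0:
            continue
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == wanted:
                return phrase
    return None


def detect_shortcut(
    audio: bytes,
    separate: Callable[[bytes], bytes],        # hypothetical speaker separation model
    embed: Callable[[bytes], Sequence[float]],  # hypothetical speaker identification model
    transcribe: Callable[[bytes], str],         # hypothetical speech recognition model
    enrolled_embedding: Sequence[float],
    actions: Dict[str, Callable[[], str]],
    threshold: float = 0.7,                     # assumed value, not from the source
) -> Optional[str]:
    """Run the action for a keyphrase only if the enrolled user appears to have spoken it."""
    # Step 1: isolate the spoken utterance from other sounds.
    separated = separate(audio)

    # Step 2: verify the speaker (assumption: cosine score against an enrolled embedding).
    if cosine_similarity(embed(separated), enrolled_embedding) < threshold:
        return None

    # Step 3: transcribe, then check whether part of the text captures a keyphrase.
    phrase = find_keyphrase(transcribe(separated), actions.keys())
    if phrase is None:
        return None

    # Step 4: perform the action that corresponds to the detected keyphrase.
    return actions[phrase]()
```

Because keyphrases are matched against the text representation rather than learned by a dedicated detector, adding a new shortcut in this sketch only means adding an entry to `actions`, which mirrors the abstract's point that no model needs retraining for a particular keyphrase.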

Published/Granted literature

US12033641B2 Voice shortcut detection with speaker verification Publication/Grant date: 2024-07-09

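IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification
G10L17/22	.Interactive procedures; man-machine interfaces
G10L17/24	..The user being prompted to utter a password or a predefined phrase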