TARGETED VOICE SEPARATION BY SPEAKER FOR SPEECH RECOGNITION

Invention Application

US20220301573A1 TARGETED VOICE SEPARATION BY SPEAKER FOR SPEECH RECOGNITION (In force)

Patent Title: TARGETED VOICE SEPARATION BY SPEAKER FOR SPEECH RECOGNITION
Application No.: US17619648

Application Date: 2019-10-10
Publication No.: US20220301573A1

Publication Date: 2022-09-22
Inventor: Quan Wang, Ignacio Lopez Moreno, Li Wan
Applicant: GOOGLE LLC
Applicant Address: US CA Mountain View
Assignee: GOOGLE LLC
Current Assignee: GOOGLE LLC
Current Assignee Address: US CA Mountain View
International Application: PCT/US2019/055539 WO 20191010
Main IPC: G10L21/028
IPC: G10L21/028; G10L17/04; G10L17/18; G10L17/02; G10L21/0232

Abstract:

Processing of acoustic features of audio data to generate one or more revised versions of the acoustic features, where each of the revised versions of the acoustic features isolates one or more utterances of a single respective human speaker. Various implementations generate the acoustic features by processing audio data using portion(s) of an automatic speech recognition system. Various implementations generate the revised acoustic features by processing the acoustic features using a mask generated by processing the acoustic features and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using the automatic speech recognition system to generate a predicted text representation of the utterance(s) of the single human speaker without reconstructing the audio data.
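
The masking step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function `toy_voice_filter`, the random `weights`, the array shapes, the concatenation of the speaker embedding onto each frame, and the element-wise multiplication used to apply the mask are all assumptions made for illustration. The abstract only states that a mask is generated from the acoustic features and a speaker embedding by a trained voice filter model, and that the revised features are produced using that mask.

```python
import numpy as np


def toy_voice_filter(features, speaker_embedding, weights):
    """Hypothetical stand-in for a trained voice filter model.

    Assumption: the embedding is tiled across frames, concatenated with
    the acoustic features, and mapped to a soft mask by one linear layer
    followed by a sigmoid. A real model would be a trained network.
    """
    num_frames = features.shape[0]
    tiled = np.tile(speaker_embedding, (num_frames, 1))
    model_input = np.concatenate([features, tiled], axis=1)
    logits = model_input @ weights
    return 1.0 / (1.0 + np.exp(-logits))  # mask values lie between 0 and 1


def apply_mask(features, mask):
    """Assumed masking rule: element-wise product of features and mask."""
    return features * mask


rng = np.random.default_rng(0)
num_frames, num_bins, embed_dim = 5, 8, 4  # illustrative sizes only

# Non-negative acoustic features (e.g. spectral magnitudes) of mixed audio.
features = rng.random((num_frames, num_bins))
# Embedding for the single target speaker.
speaker_embedding = rng.standard_normal(embed_dim)
# Random weights stand in for trained parameters.
weights = rng.standard_normal((num_bins + embed_dim, num_bins))

mask = toy_voice_filter(features, speaker_embedding, weights)
revised_features = apply_mask(features, mask)
```

In this sketch `revised_features` has the same shape as `features`, so it could be passed to the remaining stages of a speech recognizer in place of the original features, which matches the abstract's point that the audio itself is never reconstructed.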

Public/Granted literature

US12254891B2 Targeted voice separation by speaker for speech recognition Public/Granted day: 2025-03-18

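IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)
G10L21/02	.Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems, see H04B3/20; echo suppression in hands-free telephones, see H04M9/08)
G10L21/0272	..Voice signal separating
G10L21/028	...using properties of the sound source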