Invention Application
- Patent Title: TARGETED VOICE SEPARATION BY SPEAKER FOR SPEECH RECOGNITION
- Application No.: US17619648
- Application Date: 2019-10-10
- Publication No.: US20220301573A1
- Publication Date: 2022-09-22
- Inventor: Quan Wang, Ignacio Lopez Moreno, Li Wan
- Applicant: GOOGLE LLC
- Applicant Address: US CA Mountain View
- Assignee: GOOGLE LLC
- Current Assignee: GOOGLE LLC
- Current Assignee Address: US CA Mountain View
- International Application: PCT/US2019/055539 WO 20191010
- Main IPC: G10L21/028
- IPC: G10L21/028; G10L17/04; G10L17/18; G10L17/02; G10L21/0232

Abstract:
Processing of acoustic features of audio data to generate one or more revised versions of the acoustic features, where each of the revised versions of the acoustic features isolates one or more utterances of a single respective human speaker. Various implementations generate the acoustic features by processing audio data using portion(s) of an automatic speech recognition system. Various implementations generate the revised acoustic features by processing the acoustic features using a mask generated by processing the acoustic features and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using the automatic speech recognition system to generate a predicted text representation of the utterance(s) of the single human speaker without reconstructing the audio data.
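The masking step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: `toy_voice_filter` is a hypothetical stand-in for the trained voice filter model, the element-wise product in `apply_mask` is an assumed way of applying the mask, and the feature and embedding values are made up for illustration.

```python
import math
from typing import List, Sequence

# One row per audio frame, one column per acoustic feature bin.
Features = List[List[float]]


def toy_voice_filter(features: Sequence[Sequence[float]],
                     speaker_embedding: Sequence[float]) -> Features:
    """Hypothetical stand-in for the trained voice filter model.

    Scores each frame against the speaker embedding and returns a mask
    with the same shape as `features`, with values between 0 and 1.
    """
    mask = []
    for frame in features:
        score = sum(f * e for f, e in zip(frame, speaker_embedding))
        weight = 1.0 / (1.0 + math.exp(-score))  # sigmoid bounds the mask value
        mask.append([weight] * len(frame))
    return mask


def apply_mask(features: Sequence[Sequence[float]],
               mask: Sequence[Sequence[float]]) -> Features:
    """Assumed masking rule: element-wise product of features and mask."""
    return [[f * m for f, m in zip(frame, mask_row)]
            for frame, mask_row in zip(features, mask)]


# Made-up example: frame 0 resembles the target speaker, frame 1 does not.
features = [[1.0, 0.0], [0.0, 1.0]]
speaker_embedding = [10.0, -10.0]

mask = toy_voice_filter(features, speaker_embedding)
revised = apply_mask(features, mask)
```

In this sketch, `revised` plays the role of the revised acoustic features: it would be passed directly to the remaining portions of the speech recognizer, so no audio waveform is reconstructed.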
Public/Granted literature
- US12254891B2: Targeted voice separation by speaker for speech recognition. Public/Granted day: 2025-03-18