Invention Grant
- Patent Title: Audio-speech driven animated talking face generation using a cascaded generative adversarial network
- Application No.: US17199149
- Application Date: 2021-03-11
- Publication No.: US11551394B2
- Publication Date: 2023-01-10
- Inventor: Sandika Biswas, Dipanjan Das, Sanjana Sinha, Brojeshwar Bhowmick
- Applicant: Tata Consultancy Services Limited
- Applicant Address: IN Mumbai
- Assignee: Tata Consultancy Services Limited
- Current Assignee: Tata Consultancy Services Limited
- Current Assignee Address: IN Mumbai
- Agency: Finnegan, Henderson, Farabow, Garrett & Dunner LLP
- Priority: IN202021032794 20200730
- Main IPC: G06T13/20
- IPC: G06T13/20; G06V40/16; G06K9/62; G06N3/04; G06N3/08; G10L15/02

Abstract:
Conventional state-of-the-art methods are limited in their ability to generate realistic animation from audio for unknown faces and cannot be easily generalized to different facial characteristics and voice accents. Further, these methods fail to produce realistic facial animation for subjects whose facial characteristics differ substantially from the distribution the network has seen during training. Embodiments of the present disclosure provide systems and methods that generate an audio-speech driven animated talking face using a cascaded generative adversarial network (CGAN), wherein a first GAN is used to transfer lip motion from a canonical face to a person-specific face. A second GAN-based texture generator network is conditioned on the person-specific landmarks to generate a high-fidelity face corresponding to the motion. The texture generator GAN is made more flexible using meta-learning to adapt to an unknown subject's traits and face orientation during inference. Finally, eye blinks are induced in the final animated face being generated.
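The cascade described in the abstract can be pictured as a simple data flow. The sketch below is illustrative only and is not the patented implementation: every name in it (`CascadedPipeline`, `toy_motion_transfer`, `toy_texture_generator`, `toy_blink_inducer`, the landmark and frame representations) is hypothetical, each callable is a trivial stand-in for a trained network, and the meta-learning adaptation of the texture generator is not modeled. Treating blink induction as a pass over the finished frame sequence is also an assumption; the abstract only states that blinks are induced last.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Landmarks = List[Point]      # hypothetical: one face as a list of (x, y) points
Frame = Dict[str, object]    # hypothetical: stand-in for a rendered image


@dataclass
class CascadedPipeline:
    """Data flow only; each callable stands in for a trained network."""
    motion_transfer: Callable[[Landmarks, Point], Landmarks]
    texture_generator: Callable[[Landmarks], Frame]
    blink_inducer: Callable[[List[Frame]], List[Frame]]

    def run(self, canonical_motion: List[Landmarks],
            identity_offset: Point) -> List[Frame]:
        frames = []
        for canonical in canonical_motion:
            # Stage 1: canonical-face lip motion -> person-specific landmarks.
            person_specific = self.motion_transfer(canonical, identity_offset)
            # Stage 2: texture generation conditioned on those landmarks.
            frames.append(self.texture_generator(person_specific))
        # Final step: induce eye blinks across the whole sequence.
        return self.blink_inducer(frames)


def toy_motion_transfer(canonical: Landmarks, offset: Point) -> Landmarks:
    # Stand-in: shift every canonical point by a per-person offset.
    return [(x + offset[0], y + offset[1]) for x, y in canonical]


def toy_texture_generator(landmarks: Landmarks) -> Frame:
    # Stand-in: a "frame" is just a record of the landmarks it was built from.
    return {"landmarks": landmarks, "blink": False}


def toy_blink_inducer(frames: List[Frame], every: int = 3) -> List[Frame]:
    # Stand-in: mark every `every`-th frame as a blink frame.
    return [dict(f, blink=(i % every == every - 1)) for i, f in enumerate(frames)]


pipeline = CascadedPipeline(toy_motion_transfer, toy_texture_generator,
                            toy_blink_inducer)
# Four frames of canonical motion, two points each (toy data).
canonical_motion = [[(0.0, 0.0), (1.0, 0.5 * t)] for t in range(4)]
frames = pipeline.run(canonical_motion, identity_offset=(10.0, 20.0))
print(len(frames), [f["blink"] for f in frames])
```

Keeping the three stages as separate callables mirrors the cascade in the abstract: the motion stage never sees texture, and the texture stage sees only the person-specific landmarks it is conditioned on.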
Public/Granted literature
- US20220036617A1 AUDIO-SPEECH DRIVEN ANIMATED TALKING FACE GENERATION USING A CASCADED GENERATIVE ADVERSARIAL NETWORK, Public/Granted day: 2022-02-03