METHODS AND SYSTEMS FOR EMOTION-CONTROLLABLE GENERALIZED TALKING FACE GENERATION

Invention publication

EP4270391A1 METHODS AND SYSTEMS FOR EMOTION-CONTROLLABLE GENERALIZED TALKING FACE GENERATION Under examination (published)

Patent title: METHODS AND SYSTEMS FOR EMOTION-CONTROLLABLE GENERALIZED TALKING FACE GENERATION
Application number: EP23154135.0

Filing date: 2023-01-31
Publication number: EP4270391A1

Publication date: 2023-11-01
Inventors: SINHA, SANJANA; BISWAS, SANDIKA; BHOWMICK, BROJESHWAR
Applicant: Tata Consultancy Services Limited
Applicant address: IN Maharashtra Nirmal Building 9th Floor Nariman Point Mumbai 400 021
Representative: Goddar, Heinz J.
Priority: IN202221025055 20220428
Main classification: G10L21/00
IPC classification: G10L21/00; G06T13/40

Abstract:

This disclosure relates generally to methods and systems for emotion-controllable generalized talking face generation from an arbitrary face image. Most conventional techniques for realistic talking face generation may not control the emotion on the face efficiently and have limited scope for generalization to an arbitrary unknown target face. The present disclosure proposes a graph convolutional network that uses speech content features along with an independent emotion input to generate emotion- and speech-induced motion on a facial geometry-aware landmark representation. The facial geometry-aware landmark representation is then used by an optical flow-guided texture generation network to produce the texture. This two-branch network, with separate motion and texture branches, is designed to consider the motion and texture content independently. The optical flow-guided texture generation network then renders an emotional talking face animation from a single image of any arbitrary target face.
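
As a rough illustration of the first stage described in the abstract, the sketch below applies a single graph-convolution step to facial landmark nodes, with a speech feature vector and an independent emotion vector appended to every node. It is only a toy under stated assumptions and not the claimed method: the function names, the row-normalized (mean-aggregation) adjacency, the single tanh layer, and the feature sizes are all hypothetical, and the optical flow-guided texture stage is not shown.

```python
import math

def normalize_adjacency(edges, n):
    """Row-normalized adjacency with self-loops (assumed normalization)."""
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return [[v / sum(row) for v in row] for row in adj]

def graph_conv(adj, feats, weight):
    """One graph-convolution step: tanh(A_hat @ X @ W)."""
    n, d_in, d_out = len(feats), len(weight), len(weight[0])
    # Average each node's features with those of its neighbors.
    agg = [[sum(adj[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    # Project the aggregated features to the output size.
    return [[math.tanh(sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]

def predict_landmark_motion(landmarks, edges, speech_feat, emotion_vec, weight):
    """Displace 2D landmarks using speech and emotion inputs (hypothetical helper)."""
    adj = normalize_adjacency(edges, len(landmarks))
    # Node feature = (x, y) + shared speech features + shared emotion vector.
    feats = [list(p) + list(speech_feat) + list(emotion_vec) for p in landmarks]
    disp = graph_conv(adj, feats, weight)  # weight must map features to (dx, dy)
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(landmarks, disp)]
```

Because the emotion vector is a separate input, changing it while holding the speech features fixed changes the predicted landmark displacement, which is the sense in which the emotion is independently controllable in this toy version.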

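IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. a visual or tactile signal, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)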