Method and system for generating 2D animated lip images synchronizing to an audio signal
Abstract:
A method and system for generating 2D animated lip images synchronizing to an audio signal for an unseen subject. The system receives an audio signal and a target lip image of an unseen target subject as inputs from a user and processes these inputs to extract a plurality of high dimensional audio image features. The lip generator system is meta-trained with training dataset which consists of large variety of subjects' ethnicity and vocabulary. The meta-trained model generates realistic animation for previously unseen face and unseen audio when finetuned with only a few-shot samples for a predefined interval of time. Additionally, the method protects intrinsic features of the unseen target subject.
Information query
Patent Agency Ranking
0/0