AUDIO-DRIVEN FACIAL ANIMATION USING MACHINE LEARNING

    Publication Number: US20250061634A1

    Publication Date: 2025-02-20

    Application Number: US18457251

    Application Date: 2023-08-28

    Abstract: Systems and methods of the present disclosure include animating virtual avatars or agents according to input audio and one or more selected or determined emotions and/or styles. For example, a deep neural network can be trained to output motion or deformation information for a character that is representative of the character uttering speech contained in audio input. The character can have different facial components or regions (e.g., head, skin, eyes, tongue) modeled separately, such that the network can output motion or deformation information for each of these different facial components. During training, the network can use a transformer-based audio encoder with locked parameters to train an associated decoder using a weighted feature vector. The network output can be provided to a renderer to generate audio-driven facial animation that is emotion-accurate.
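The architecture described in the abstract (a locked audio encoder whose per-layer features are blended by a learned weighted feature vector, feeding a decoder that emits separate motion outputs for each facial component) can be sketched as follows. This is a minimal NumPy illustration of the general pattern only; the layer shapes, region names, output dimensions, and the simple linear/tanh stand-ins for transformer layers and decoder heads are all illustrative assumptions, not details taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

AUDIO_DIM, HIDDEN, N_LAYERS = 80, 64, 4
# Hypothetical per-region output sizes (e.g. blendshape or deformation dims).
REGIONS = {"head": 3, "skin": 128, "eyes": 6, "tongue": 12}

# "Locked" encoder: fixed (untrained) weights standing in for a frozen
# transformer-based audio encoder, one matrix per layer.
encoder_layers = [
    rng.standard_normal((AUDIO_DIM if i == 0 else HIDDEN, HIDDEN)) * 0.1
    for i in range(N_LAYERS)
]

def encode(audio_frames):
    """Run the frozen encoder; return each layer's features, shape (T, HIDDEN)."""
    feats, x = [], audio_frames
    for W in encoder_layers:
        x = np.tanh(x @ W)
        feats.append(x)
    return feats

# Learned scalar weights (softmax-normalized) blend the per-layer features
# into a single weighted feature vector per audio frame.
layer_logits = rng.standard_normal(N_LAYERS)

def weighted_features(feats):
    w = np.exp(layer_logits) / np.exp(layer_logits).sum()
    return sum(wi * f for wi, f in zip(w, feats))

# Decoder: one linear head per facial component, so each region's motion
# is modeled and emitted separately.
heads = {name: rng.standard_normal((HIDDEN, dim)) * 0.1
         for name, dim in REGIONS.items()}

def decode(audio_frames, style_vec):
    feats = weighted_features(encode(audio_frames))
    feats = feats + style_vec  # condition on a selected emotion/style vector
    return {name: feats @ W for name, W in heads.items()}

T = 10  # audio frames
motion = decode(rng.standard_normal((T, AUDIO_DIM)),
                rng.standard_normal(HIDDEN) * 0.01)
for name, out in motion.items():
    print(name, out.shape)
```

In practice the per-region outputs would be passed to a renderer to produce the final animation; here they are simply printed to show that each facial component receives its own motion tensor.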
