Text and audio-based real-time face reenactment

Invention Grant

US11741940B2 Text and audio-based real-time face reenactment (In force)

Patent Title: Text and audio-based real-time face reenactment
Application No.: US17355834

Application Date: 2021-06-23
Publication No.: US11741940B2

Publication Date: 2023-08-29
Inventor: Pavel Savchenkov, Maxim Lukin, Aleksandr Mashrabov
Applicant: Snap Inc.
Applicant Address: US CA Santa Monica
Assignee: Snap Inc.
Current Assignee: Snap Inc.
Current Assignee Address: US CA Santa Monica
Agent: Georgiy L. Khayet
Main IPC: G10L13/00
IPC: G10L13/00; G10L13/08; G06T13/40; G06V40/16; G06V10/764; G06V10/82

Abstract:

Provided are systems and methods for text and audio-based real-time face reenactment. An example method includes receiving an input text and a target image, the target image including a target face; generating, based on the input text, a sequence of sets of acoustic features representing the input text; generating, based on the sequence of sets of acoustic features, a sequence of sets of mouth key points; generating, based on the sequence of sets of mouth key points, a sequence of sets of facial key points; generating, by the computing device and based on the sequence of sets of the facial key points and the target image, a sequence of frames; and generating, based on the sequence of frames, an output video. Each of the frames includes the target face modified based on at least one set of mouth key points of the sequence of sets of mouth key points.
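
The data flow described in the abstract can be sketched as a chain of four stages. The following is only an illustrative outline, not the patented implementation: every function name, the feature and key point formats, and the placeholder computations are assumptions made for this sketch, standing in for the trained models that a real system would use.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def text_to_acoustic_features(text: str) -> List[List[float]]:
    # Hypothetical stand-in: one set of acoustic features per character.
    return [[(ord(ch) % 32) / 32.0] for ch in text]


def acoustic_to_mouth_key_points(features: List[List[float]]) -> List[List[Point]]:
    # Hypothetical stand-in: mouth opening follows the first feature value.
    return [[(0.0, -f[0]), (0.0, f[0])] for f in features]


def mouth_to_facial_key_points(mouth_sets: List[List[Point]]) -> List[List[Point]]:
    # Hypothetical stand-in: fixed eye points plus the mouth points.
    eyes: List[Point] = [(-1.0, 1.0), (1.0, 1.0)]
    return [eyes + mouth for mouth in mouth_sets]


def render_frames(facial_sets: List[List[Point]], target_image: str) -> List[Dict]:
    # Hypothetical stand-in: a "frame" pairs the target image with key points.
    return [{"image": target_image, "key_points": kp} for kp in facial_sets]


def reenact(text: str, target_image: str) -> List[Dict]:
    # Chain the stages in the order listed in the abstract:
    # text -> acoustic features -> mouth key points -> facial key points -> frames.
    features = text_to_acoustic_features(text)
    mouth_sets = acoustic_to_mouth_key_points(features)
    facial_sets = mouth_to_facial_key_points(mouth_sets)
    return render_frames(facial_sets, target_image)


video = reenact("hi", "face.png")
print(len(video))
```

In this sketch the output video is simply the ordered list of frames; encoding the frames into a video file is left out.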

Published/Granted Documents

US20210327404A1 TEXT AND AUDIO-BASED REAL-TIME FACE REENACTMENT Publication Date: 2021-10-21

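IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems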