USING MACHINE-LEARNING MODELS TO DETERMINE MOVEMENTS OF A MOUTH CORRESPONDING TO LIVE SPEECH

Patent Application

US20190392823A1 USING MACHINE-LEARNING MODELS TO DETERMINE MOVEMENTS OF A MOUTH CORRESPONDING TO LIVE SPEECH (status: under examination, published)

Patent title: USING MACHINE-LEARNING MODELS TO DETERMINE MOVEMENTS OF A MOUTH CORRESPONDING TO LIVE SPEECH
Application number: US16016418

Filing date: 2018-06-22
Publication number: US20190392823A1

Publication date: 2019-12-26
Inventors: Wilmot Li, Jovan Popovic, Deepali Aneja, David Simons
Applicant: Adobe Inc.
Main classification: G10L15/197
IPC classifications: G10L15/197; G06N3/08; G06N3/04; G10L15/06; G10L15/02; G10L25/24; G10L25/21; G10L21/0316

Abstract:

Disclosed systems and methods predict visemes from an audio sequence. A viseme-generation application accesses a first set of training data that includes a first audio sequence, representing a sentence spoken by a first speaker, and a sequence of visemes. Each viseme is mapped to a respective audio sample of the first audio sequence. The viseme-generation application creates a second set of training data by adjusting a second audio sequence, in which a second speaker speaks the same sentence, so that the first and second sequences have the same length and at least one phoneme occurs at the same timestamp in both. The viseme-generation application then maps the sequence of visemes to the adjusted second audio sequence and trains a viseme prediction model to predict a sequence of visemes from an audio sequence.
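
The length-matching step in the abstract can be illustrated with a small sketch. The abstract does not say how the second audio sequence is adjusted; the code below assumes dynamic time warping (DTW) as one plausible technique, and it uses a single scalar feature per audio frame where a real system would likely use multi-dimensional features. All function names (`dtw_path`, `warp_to_reference`, `transfer_visemes`) are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple


def dtw_path(ref: Sequence[float], other: Sequence[float]) -> List[Tuple[int, int]]:
    """Return the DTW alignment path as (ref_index, other_index) pairs."""
    n, m = len(ref), len(other)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of ref[:i] with other[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist = abs(ref[i - 1] - other[j - 1])
            cost[i][j] = dist + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Backtrack from the end of both sequences to the start.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def warp_to_reference(ref: Sequence[float], other: Sequence[float]) -> List[float]:
    """Resample `other` to len(ref) frames (both inputs must be non-empty)."""
    warped = [other[0]] * len(ref)
    for i, j in dtw_path(ref, other):
        warped[i] = other[j]  # the last frame matched to position i wins
    return warped


def transfer_visemes(
    ref_audio: Sequence[float],
    ref_visemes: Sequence[str],
    other_audio: Sequence[float],
) -> List[Tuple[float, str]]:
    """Pair each warped frame of the second speaker with the first speaker's viseme."""
    warped = warp_to_reference(ref_audio, other_audio)
    return list(zip(warped, ref_visemes))


# Toy data (made up for illustration): the second speaker says the sentence faster.
ref_audio = [0.0, 0.0, 1.0, 1.0, 0.0]
ref_visemes = ["s", "s", "Ah", "Ah", "s"]
other_audio = [0.0, 1.0, 0.0]
pairs = transfer_visemes(ref_audio, ref_visemes, other_audio)
```

After warping, the second sequence has the same number of frames as the first, so the existing viseme labels can be reused frame by frame as a second set of training pairs.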

Published/Granted Documents

US10699705B2 Using machine-learning models to determine movements of a mouth corresponding to live speech. Publication/grant date: 2020-06-30

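IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	.Speech classification or search
G10L15/18	..using natural language modelling
G10L15/183	...using context dependencies, e.g. language models
G10L15/19	....Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
G10L15/197	.....Probabilistic grammars, e.g. word n-grams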