Unsupervised alignment for text to speech synthesis using neural networks

Invention Grant

US11769481B2 Unsupervised alignment for text to speech synthesis using neural networks (in force)

Patent Title: Unsupervised alignment for text to speech synthesis using neural networks
Application No.: US17496569

Application Date: 2021-10-07
Publication No.: US11769481B2

Publication Date: 2023-09-26
Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
Applicant: Nvidia Corporation
Applicant Address: US CA Santa Clara
Assignee: Nvidia Corporation
Current Assignee: Nvidia Corporation
Current Assignee Address: US CA Santa Clara
Agency: Hogan Lovells US LLP
Main IPC: G10L13/00
IPC: G10L13/00; G10L13/10; G10L13/06; G10L13/07; G10L13/047; G10L25/90; G06N3/045; G06N3/08; G10L13/033; G10L13/08

Abstract:

Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
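
As a rough illustration of sampling phoneme durations at inference, the sketch below draws one duration per phoneme from a stand-in distribution and expands the phoneme sequence to frame level. Everything in it is an assumption made for illustration: the function names, the log-normal distribution, and the parameters `mean_frames` and `spread` do not come from the patent, which does not specify them in this abstract.

```python
import math
import random


def sample_durations(phonemes, mean_frames=6.0, spread=0.3, seed=None):
    """Draw one duration (in frames, at least 1) per phoneme.

    The log-normal used here is a hypothetical stand-in; the abstract only
    says that rhythm is modeled as a separate generative distribution.
    """
    rng = random.Random(seed)
    durations = []
    for _ in phonemes:
        frames = rng.lognormvariate(math.log(mean_frames), spread)
        durations.append(max(1, round(frames)))
    return durations


def expand_to_frames(phonemes, durations):
    """Repeat each phoneme for its sampled number of frames."""
    frames = []
    for phoneme, duration in zip(phonemes, durations):
        frames.extend([phoneme] * duration)
    return frames


# Two different seeds give two different rhythms for the same text.
phonemes = ["HH", "AH", "L", "OW"]
for seed in (0, 1):
    durations = sample_durations(phonemes, seed=seed)
    print(seed, durations, len(expand_to_frames(phonemes, durations)))
```

Because the durations are sampled rather than fixed, repeated runs over the same input text can yield different timings, which is the kind of diversity the abstract refers to. Pitch or energy could be sampled per phoneme in the same way.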

Public/Granted literature

US20230113950A1 Unsupervised alignment for text to speech synthesis using neural networks. Public/Granted day: 2023-04-13

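IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text to speech systems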