Invention Grant
- Patent Title: Unsupervised alignment for text to speech synthesis using neural networks
- Application No.: US17496569
- Application Date: 2021-10-07
- Publication No.: US11769481B2
- Publication Date: 2023-09-26
- Inventor: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
- Applicant: Nvidia Corporation
- Applicant Address: US CA Santa Clara
- Assignee: Nvidia Corporation
- Current Assignee: Nvidia Corporation
- Current Assignee Address: US CA Santa Clara
- Agency: Hogan Lovells US LLP
- Main IPC: G10L13/00
- IPC: G10L13/00; G10L13/10; G10L13/06; G10L13/07; G10L13/047; G10L25/90; G06N3/045; G06N3/08; G10L13/033; G10L13/08

Abstract:
Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
Public/Granted literature
- US20230113950A1 UNSUPERVISED ALIGNMENT FOR TEXT TO SPEECH SYNTHESIS USING NEURAL NETWORKS (Public/Granted day: 2023-04-13)