Invention Grant
- Patent Title: Synthesizing speech from text using neural networks
- Application No.: US16058640
- Application Date: 2018-08-08
- Publication No.: US10971170B2
- Publication Date: 2021-04-06
- Inventor: Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Michael Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Russell John Wyatt Skerry-Ryan, Ryan M. Rifkin, Ioannis Agiomyrgiannakis
- Applicant: Google LLC
- Applicant Address: US CA Mountain View
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: US CA Mountain View
- Agency: Fish & Richardson P.C.
- Main IPC: G10L25/30
- IPC: G10L25/30; G10L13/047; G10L13/08; G06N7/00; G06N3/08; G06N3/04; G06N5/04; G10L25/18

Abstract:
Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.
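The per-time-step loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `decoder_step`, `vocoder_step`, and `synthesize`, the constants `NUM_MEL_BINS` and `NUM_SAMPLE_VALUES`, the one-character-per-time-step mapping, and the toy arithmetic that stands in for the decoder and vocoder neural networks. None of it is taken from the patent; it only mirrors the three stated steps (spectrogram frame, probability distribution, sample selection).

```python
import math
import random

NUM_MEL_BINS = 4        # assumption: toy size of one mel-frequency spectrogram frame
NUM_SAMPLE_VALUES = 8   # assumption: number of possible (quantized) audio sample values


def decoder_step(representation):
    """Hypothetical stand-in for the decoder neural network.

    Maps a representation of a portion of the input character sequence
    to a mel-frequency spectrogram frame for the current time step.
    """
    return [math.sin(representation * (i + 1)) for i in range(NUM_MEL_BINS)]


def vocoder_step(mel_frame):
    """Hypothetical stand-in for the vocoder neural network.

    Maps a mel frame to a probability distribution over the possible
    audio output samples, using a numerically stable softmax.
    """
    logits = [
        sum(m * math.cos(i + j) for j, m in enumerate(mel_frame))
        for i in range(NUM_SAMPLE_VALUES)
    ]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize(char_sequence, rng):
    """Run the three steps of the abstract once per time step."""
    samples = []
    for ch in char_sequence:
        # Assumption: one character per time step, encoded as a scaled code point.
        representation = ord(ch) / 128.0
        mel_frame = decoder_step(representation)
        probs = vocoder_step(mel_frame)
        # Select the sample in accordance with the probability distribution.
        sample = rng.choices(range(NUM_SAMPLE_VALUES), weights=probs, k=1)[0]
        samples.append(sample)
    return samples


if __name__ == "__main__":
    print(synthesize("hello", random.Random(0)))
```

Passing a seeded `random.Random` instance keeps the sampling reproducible. In a real system the two stand-in functions would be trained networks, and the number of audio time steps would typically be much larger than the number of input characters.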
Public/Granted literature
- US20200051583A1 SYNTHESIZING SPEECH FROM TEXT USING NEURAL NETWORKS, Public/Granted day: 2020-02-13