Artificial intelligence-based text-to-speech system and method

Granted invention patent

US11244669B2 Artificial intelligence-based text-to-speech system and method (in force)


Patent title: Artificial intelligence-based text-to-speech system and method
Application number: US16446833

Filing date: 2019-06-20
Publication (announcement) number: US11244669B2

Publication (announcement) date: 2022-02-08
Inventors: Martin Reber, Vijeta Avijeet
Applicant: Telepathy Labs, Inc.
Applicant address: US FL Tampa
Patentee: Telepathy Labs, Inc.
Current patentee: Telepathy Labs, Inc.
Current patentee address: US FL Tampa
Agency: Holland & Knight LLP
Agent: Michael T. Abramson
Main classification: G10L25/30
IPC classification: G10L25/30 ; G10L13/08 ; G06K9/62 ; G06N5/02 ; G06N3/02 ; G10L19/00 ; G10L13/04 ; G06N3/04 ; G06N3/08


Abstract:

A technique improves training and speech quality of a text-to-speech (TTS) system having an artificial intelligence, such as a neural network. The TTS system is organized as a front-end subsystem and a back-end subsystem. The front-end subsystem is configured to provide analysis and conversion of text into input vectors, each having at least a base frequency, f0, a phoneme duration, and a phoneme sequence that is processed by a signal generation unit of the back-end subsystem. The signal generation unit includes the neural network interacting with a pre-existing knowledgebase of phonemes to generate audible speech from the input vectors. The technique applies an error signal from the neural network to correct imperfections of the pre-existing knowledgebase of phonemes to generate audible speech signals. A back-end training system is configured to train the signal generation unit by applying psychoacoustic principles to improve quality of the generated audible speech signals.
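As a reading aid only, the data flow described in the abstract can be sketched in Python. Nothing below comes from the patent text itself: the names `InputVector`, `TOY_LEXICON`, `front_end`, and `corrected_frame` are hypothetical, the constant f0 and duration values are placeholders, and the additive form of the correction is an assumption about how an error signal might be combined with knowledgebase output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InputVector:
    """One front-end output item (hypothetical layout)."""
    phoneme: str        # symbol from the phoneme sequence
    duration_ms: float  # phoneme duration
    f0_hz: float        # base frequency, f0


# Hypothetical toy lexicon; a real front end would use full text analysis.
TOY_LEXICON: Dict[str, List[str]] = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def front_end(text: str, f0_hz: float = 120.0,
              duration_ms: float = 80.0) -> List[InputVector]:
    """Convert text into input vectors, using constant placeholder prosody."""
    vectors: List[InputVector] = []
    for word in text.lower().split():
        # Words missing from the toy lexicon are silently skipped.
        for phoneme in TOY_LEXICON.get(word, []):
            vectors.append(InputVector(phoneme, duration_ms, f0_hz))
    return vectors


def corrected_frame(knowledgebase_frame: List[float],
                    error_signal: List[float]) -> List[float]:
    """Assumed additive correction: knowledgebase samples plus an error
    signal that a neural network would supply (no network is modeled here)."""
    if len(knowledgebase_frame) != len(error_signal):
        raise ValueError("frame and error signal must have the same length")
    return [k + e for k, e in zip(knowledgebase_frame, error_signal)]
```

For instance, `front_end("hello world")` yields eight `InputVector` items, one per phoneme in the toy lexicon, and `corrected_frame` adds an error signal sample by sample to a frame taken from the knowledgebase.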

Publication/grant documents

US20190304434A1 ARTIFICIAL INTELLIGENCE-BASED TEXT-TO-SPEECH SYSTEM AND METHOD Publication/grant date: 2019-10-03


IPC classification:

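G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (semiconductor-based muting amplifiers in which a speech detector senses particular signal characteristics, e.g. sensing the absence of a signal, are classified in H03G3/34)
G10L25/27	.characterised by the analysis technique
G10L25/30	..using neural networks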