- Patent title: Artificial intelligence-based text-to-speech system and method
- Application number: US16446833
- Filing date: 2019-06-20
- Publication number: US11244669B2
- Publication date: 2022-02-08
- Inventors: Martin Reber, Vijeta Avijeet
- Applicant: Telepathy Labs, Inc.
- Applicant address: US FL Tampa
- Assignee: Telepathy Labs, Inc.
- Current assignee: Telepathy Labs, Inc.
- Current assignee address: US FL Tampa
- Agency: Holland & Knight LLP
- Agent: Michael T. Abramson
- Main classification: G10L25/30
- IPC classifications: G10L25/30; G10L13/08; G06K9/62; G06N5/02; G06N3/02; G10L19/00; G10L13/04; G06N3/04; G06N3/08
Abstract:
A technique improves training and speech quality of a text-to-speech (TTS) system having an artificial intelligence, such as a neural network. The TTS system is organized as a front-end subsystem and a back-end subsystem. The front-end subsystem is configured to provide analysis and conversion of text into input vectors, each having at least a base frequency, f0, a phoneme duration, and a phoneme sequence that is processed by a signal generation unit of the back-end subsystem. The signal generation unit includes the neural network interacting with a pre-existing knowledgebase of phonemes to generate audible speech from the input vectors. The technique applies an error signal from the neural network to correct imperfections of the pre-existing knowledgebase of phonemes to generate audible speech signals. A back-end training system is configured to train the signal generation unit by applying psychoacoustic principles to improve quality of the generated audible speech signals.
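
The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The field names, the `generate_signal` helper, the two toy stand-in functions, and the choice of an element-wise additive correction are all assumptions made for illustration; the abstract states only that an error signal from the neural network corrects the output of the phoneme knowledgebase.

```python
from dataclasses import dataclass
from typing import Callable, List

Waveform = List[float]


@dataclass
class InputVector:
    """One front-end output. Field names are hypothetical."""
    f0_hz: float                 # base frequency f0
    phoneme_duration_s: float    # phoneme duration
    phoneme_sequence: List[str]  # phoneme symbols


def generate_signal(
    vec: InputVector,
    lookup: Callable[[InputVector], Waveform],
    predict_error: Callable[[InputVector, Waveform], Waveform],
) -> Waveform:
    """Correct a knowledgebase waveform with a predicted error signal.

    Assumption: the correction is an element-wise addition.
    """
    base = lookup(vec)                # imperfect knowledgebase output
    error = predict_error(vec, base)  # error signal from the network
    if len(error) != len(base):
        raise ValueError("error signal must match the waveform length")
    return [b + e for b, e in zip(base, error)]


# Toy stand-ins for the knowledgebase and the trained network (made up).
def toy_lookup(vec: InputVector) -> Waveform:
    return [0.0, 0.5, 1.0, 0.5]


def toy_error(vec: InputVector, base: Waveform) -> Waveform:
    return [0.25 for _ in base]


vec = InputVector(f0_hz=120.0, phoneme_duration_s=0.08,
                  phoneme_sequence=["HH", "AH"])
corrected = generate_signal(vec, toy_lookup, toy_error)
```

In a real system both callables would be replaced by the knowledgebase query and the trained network. The sketch shows only how the two outputs might be combined.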