- Patent title: Text-to-speech (TTS) processing
- Application number: US16908882
- Filing date: 2020-06-23
- Publication number: US11763797B2
- Publication date: 2023-09-19
- Inventors: Roberto Barra Chicote, Adam Franciszek Nadolski, Thomas Edward Merritt, Bartosz Putrycz, Andrew Paul Breen
- Applicant: Amazon Technologies, Inc.
- Applicant address: US WA Seattle
- Assignee: Amazon Technologies, Inc.
- Current assignee: Amazon Technologies, Inc.
- Current assignee address: US WA Seattle
- Agent: PIERCE ATWOOD LLP
- Main classification: G10L13/10
- IPC classification: G10L13/10; G10L13/033; G10L13/00
Abstract:
A speech model includes a sub-model corresponding to a vocal attribute. The speech model generates an output waveform using a sample model, which receives text data, and a conditioning model, which receives text metadata and produces a prosody output for use by the sample model. If, during training or runtime, a different vocal attribute is desired or needed, the sub-model is re-trained or switched to a different sub-model corresponding to the different vocal attribute.
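
The arrangement in the abstract can be illustrated with a toy sketch. This is not the patented implementation. Every class name, the `pitch_scale` parameter, the `emphasis` metadata key, and the arithmetic are hypothetical placeholders chosen so the example runs. Placing the sub-model on the prosody path is also an assumption, since the abstract does not say where the sub-model sits. The sketch only shows the data flow: text metadata goes to a conditioning model, its prosody output goes to a sample model together with the text, and the sub-model is switched when a different vocal attribute is wanted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubModel:
    """Hypothetical stand-in for a sub-model tied to one vocal attribute."""
    attribute: str
    pitch_scale: float  # illustrative parameter, not taken from the patent

    def apply(self, prosody: List[float]) -> List[float]:
        # Scale every prosody value by this attribute's factor.
        return [p * self.pitch_scale for p in prosody]


class ConditioningModel:
    """Receives text metadata and produces a prosody output."""

    def prosody(self, metadata: Dict[str, float], length: int) -> List[float]:
        # Toy rule: one constant prosody value per input character.
        return [metadata.get("emphasis", 1.0)] * length


class SampleModel:
    """Receives text data plus the prosody output and emits samples."""

    def generate(self, text: str, prosody: List[float]) -> List[float]:
        # Toy "waveform": one sample per character, scaled by prosody.
        return [(ord(ch) % 32) / 32.0 * p for ch, p in zip(text, prosody)]


class SpeechModel:
    """Holds several sub-models and uses one of them at a time."""

    def __init__(self, sub_models: Dict[str, SubModel], active: str) -> None:
        self.sub_models = sub_models
        self.active = sub_models[active]
        self.conditioning = ConditioningModel()
        self.sample = SampleModel()

    def switch_attribute(self, attribute: str) -> None:
        # Switch to the sub-model for a different vocal attribute.
        self.active = self.sub_models[attribute]

    def synthesize(self, text: str, metadata: Dict[str, float]) -> List[float]:
        prosody = self.conditioning.prosody(metadata, len(text))
        # Assumption: the active sub-model modifies the prosody output.
        prosody = self.active.apply(prosody)
        return self.sample.generate(text, prosody)


model = SpeechModel(
    {"neutral": SubModel("neutral", 1.0), "whisper": SubModel("whisper", 0.5)},
    active="neutral",
)
neutral_wave = model.synthesize("hi", {"emphasis": 1.0})
model.switch_attribute("whisper")
whisper_wave = model.synthesize("hi", {"emphasis": 1.0})
```

Switching the sub-model changes the output for the same text and metadata while the sample and conditioning components stay untouched, which is the modularity the abstract describes. Re-training a sub-model, the other option the abstract mentions, is not modeled here.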