- Patent title: Text-to-speech (TTS) processing with transfer of vocal characteristics
- Application number: US16430894
- Filing date: 2019-06-04
- Publication number: US11410684B1
- Publication date: 2022-08-09
- Inventors: Viacheslav Klimkov, Thomas Renaud Drugman, Alexander Galkin, Srikanth Ronanki
- Applicant: Amazon Technologies, Inc.
- Applicant address: Seattle, WA, US
- Assignee: Amazon Technologies, Inc.
- Current assignee: Amazon Technologies, Inc.
- Current assignee address: Seattle, WA, US
- Agent: Pierce Atwood LLP
- Main classification: G10L13/00
- IPC classification: G10L13/00; G10L25/78; G10L13/027; G10L15/16; G10L15/187; G06F16/38; G06N3/08; G06N20/20; G06F17/18; G06N3/04; G10L13/04; G10L13/033; G10L13/07
Abstract:
Audio data from a first, source speaker is received and processed to determine linguistic units and vocal characteristics corresponding to those linguistic units. The linguistic units may either be determined from received text data or may be determined from the audio data using automatic speech recognition. A model is trained using training data from a second, target speaker. The trained model concatenates the linguistic units with the vocal characteristics to produce output speech that has the “voice” of the target speaker and the vocal characteristics of the source speaker.
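The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The phoneme inventory, the choice of features (pitch, energy, duration), and the names `VocalCharacteristics`, `one_hot`, and `build_model_input` are all assumptions made for illustration. The sketch only shows how each linguistic unit might be paired with the source speaker's vocal characteristics to form one input row. The model trained on the target speaker's data is not shown.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical phoneme inventory; a real system would use a full phone set.
PHONEMES = ["HH", "AH", "L", "OW"]


@dataclass
class VocalCharacteristics:
    """Assumed per-unit features measured on the source speaker's audio."""
    f0_hz: float       # fundamental frequency (pitch)
    energy: float      # relative loudness
    duration_s: float  # length of the unit in seconds


def one_hot(unit: str, inventory: Sequence[str]) -> List[float]:
    """Encode one linguistic unit as a one-hot vector over the inventory."""
    if unit not in inventory:
        raise ValueError(f"unknown linguistic unit: {unit!r}")
    return [1.0 if unit == p else 0.0 for p in inventory]


def build_model_input(
    units: Sequence[str],
    characteristics: Sequence[VocalCharacteristics],
    inventory: Sequence[str] = PHONEMES,
) -> List[List[float]]:
    """Concatenate each unit's encoding with its vocal characteristics.

    Each returned row is one input frame for a (not shown) model trained
    on the target speaker's data.
    """
    if len(units) != len(characteristics):
        raise ValueError("need one set of vocal characteristics per unit")
    rows = []
    for unit, vc in zip(units, characteristics):
        rows.append(one_hot(unit, inventory) + [vc.f0_hz, vc.energy, vc.duration_s])
    return rows


if __name__ == "__main__":
    example = build_model_input(
        ["HH", "AH"],
        [VocalCharacteristics(120.0, 0.5, 0.08), VocalCharacteristics(130.0, 0.7, 0.10)],
    )
    for row in example:
        print(row)
```

In this sketch, the linguistic units could come either from text or from speech recognition on the source audio, as the abstract allows. Only the vocal characteristics have to be measured on the source speaker.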