Granted invention patent
- Title: Synthesis of speech from text in a voice of a target speaker using neural networks
- Application No.: US17055951
- Filing date: 2019-05-17
- Publication No.: US11488575B2
- Publication date: 2022-11-01
- Inventors: Ye Jia, Zhifeng Chen, Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Ignacio Lopez Moreno, Fei Ren, Yu Zhang, Quan Wang, Patrick Nguyen
- Applicant: Google LLC
- Applicant address: Mountain View, CA, US
- Assignee: Google LLC
- Current assignee: Google LLC
- Current assignee address: Mountain View, CA, US
- Agency: Honigman LLP
- Agent: Brett A. Krueger
- International application: PCT/US2019/032815 WO 20190517
- International publication: WO2019/222591 WO 20191121
- Main classification: G10L13/04
- IPC classifications: G10L13/04; G10L17/04; G10L19/00; G06N3/08; G10L13/02
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
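The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `speaker_encoder`, `spectrogram_generator`, and `synthesize` are hypothetical, and both engines are replaced by simple arithmetic placeholders rather than trained neural networks. The sketch only shows how a fixed-size speaker vector derived from reference audio might condition the spectrogram produced for the input text.

```python
import math
from typing import List, Sequence

Vector = List[float]


def speaker_encoder(audio_frames: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical stand-in for the trained speaker encoder engine.

    Averages the reference audio frames and L2-normalizes the result, so an
    utterance of any length maps to one fixed-size speaker vector.
    """
    if not audio_frames:
        raise ValueError("need at least one audio frame")
    dim = len(audio_frames[0])
    count = len(audio_frames)
    mean = [sum(frame[i] for frame in audio_frames) / count for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0  # avoid division by zero
    return [x / norm for x in mean]


def spectrogram_generator(text: str, speaker_vector: Vector) -> List[Vector]:
    """Hypothetical stand-in for the trained spectrogram generation engine.

    Emits one placeholder frame per input character. Each frame adds a toy
    text feature to the speaker vector, so the output depends on both inputs.
    """
    frames: List[Vector] = []
    for ch in text:
        text_feature = (ord(ch) % 32) / 32.0  # placeholder, not a real text encoding
        frames.append([text_feature + s for s in speaker_vector])
    return frames


def synthesize(reference_audio: Sequence[Sequence[float]], text: str) -> List[Vector]:
    """Chain the two stages: reference audio -> speaker vector -> spectrogram."""
    speaker_vector = speaker_encoder(reference_audio)
    return spectrogram_generator(text, speaker_vector)


if __name__ == "__main__":
    # Toy reference audio: two 3-dimensional frames from the target speaker.
    reference_audio = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]
    spectrogram = synthesize(reference_audio, "hello")
    print(len(spectrogram), "frames of dimension", len(spectrogram[0]))
```

In this sketch the speaker vector is computed once from the reference audio and reused for every output frame, which mirrors the abstract's separation between the speaker encoder and the spectrogram generator. In a real system the resulting spectrogram would then be provided for output, for example to a vocoder; that step is omitted here.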