Publication No.: WO2019222591A1
Publication Date: 2019-11-21
Application No.: PCT/US2019/032815
Filing Date: 2019-05-17
Applicant: GOOGLE LLC
Inventor: JIA, Ye; CHEN, Zhifeng; WU, Yonghui; SHEN, Jonathan; PANG, Ruoming; WEISS, Ron J.; MORENO, Ignacio Lopez; REN, Fei; ZHANG, Yu; WANG, Quan; NGUYEN, Patrick An Phu
IPC: G10L13/033, G10L13/04, G10L25/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of: obtaining an audio representation of speech of a target speaker; obtaining input text for which speech is to be synthesized in a voice of the target speaker; generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another; generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations; and providing the audio representation of the input text spoken in the voice of the target speaker for output.
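
The data flow in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names `speaker_encoder`, `spectrogram_generator`, and `synthesize` are hypothetical, both "engines" are untrained toy stand-ins (averaging and scaling instead of neural networks), and the reference audio is made-up data. The sketch only shows the order of the steps: target-speaker audio becomes a fixed-size speaker vector, and that vector together with the input text conditions the generated spectrogram-like output.

```python
import math
from typing import List, Sequence

Vector = List[float]


def speaker_encoder(audio_frames: Sequence[Sequence[float]]) -> Vector:
    """Toy stand-in (assumption) for the trained speaker encoder engine.

    Averages the frames over time and L2-normalizes the result, so audio
    of any length maps to one fixed-size speaker vector.
    """
    if not audio_frames:
        raise ValueError("need at least one audio frame")
    dim = len(audio_frames[0])
    count = len(audio_frames)
    mean = [sum(frame[i] for frame in audio_frames) / count for i in range(dim)]
    # Fall back to 1.0 so an all-zero mean does not divide by zero.
    norm = math.sqrt(sum(x * x for x in mean)) or 1.0
    return [x / norm for x in mean]


def spectrogram_generator(text: str, speaker_vector: Vector) -> List[Vector]:
    """Toy stand-in (assumption) for the spectrogram generation engine.

    Emits one frame per input character. Each frame is the speaker vector
    scaled by a value derived from the character, so the output depends on
    both the input text and the speaker vector.
    """
    frames = []
    for ch in text:
        scale = (ord(ch) % 32 + 1) / 32.0
        frames.append([scale * x for x in speaker_vector])
    return frames


def synthesize(audio_frames: Sequence[Sequence[float]], text: str) -> List[Vector]:
    """Chain the steps: audio -> speaker vector -> conditioned spectrogram."""
    speaker_vector = speaker_encoder(audio_frames)
    return spectrogram_generator(text, speaker_vector)


# Made-up reference audio: two frames with two features each.
reference_audio = [[3.0, 0.0], [3.0, 8.0]]
spectrogram = synthesize(reference_audio, "hi")
```

Keeping the speaker encoder separate from the spectrogram generator mirrors the abstract, where the encoder is trained to distinguish speakers and the generator is trained on voices of reference speakers; in this sketch that separation means a new target speaker only changes the vector passed to the generator.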