Multilingual neural text-to-speech synthesis

    Publication No.: US11922924B2

    Publication date: 2024-03-05

    Application No.: US17617547

    Filing date: 2020-05-21

    Inventors: Jingzhou Yang, Lei He

    Abstract: The present disclosure provides a method and apparatus for generating speech through multilingual neural text-to-speech (TTS) synthesis. A text input in at least a first language may be received. A speaker encoder may provide speaker latent space information of a target speaker, and a language encoder may provide language latent space information of a second language. An acoustic feature predictor may generate at least one acoustic feature based on the text input, the speaker latent space information, and the language latent space information of the second language. A neural vocoder may then generate a speech waveform corresponding to the text input based on the at least one acoustic feature.
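
    Illustrative sketch: The data flow described in the abstract can be sketched as a four-stage pipeline. The Python below is a toy illustration only, not the patented implementation. Every function name, constant, and identifier string is hypothetical, and the "encoders", "predictor", and "vocoder" are deterministic arithmetic placeholders standing in for trained neural networks. It is meant to show only which component consumes which input.

```python
import math
from typing import List

# All constants below are assumptions chosen for illustration.
EMBED_DIM = 4          # size of each latent vector
FRAMES_PER_CHAR = 2    # acoustic frames emitted per input character
SAMPLES_PER_FRAME = 8  # waveform samples emitted per acoustic frame


def _toy_embedding(key: str, dim: int = EMBED_DIM) -> List[float]:
    """Deterministic placeholder for a learned latent vector."""
    seed = sum(ord(c) for c in key)
    return [math.sin(seed + i) for i in range(dim)]


def speaker_encoder(speaker_id: str) -> List[float]:
    """Stand-in for the speaker encoder: speaker latent space information."""
    return _toy_embedding("spk:" + speaker_id)


def language_encoder(language_id: str) -> List[float]:
    """Stand-in for the language encoder: language latent space information."""
    return _toy_embedding("lang:" + language_id)


def acoustic_feature_predictor(
    text: str, speaker_latent: List[float], language_latent: List[float]
) -> List[List[float]]:
    """Stand-in predictor: one frame vector per time step, conditioned on
    the text, the speaker latent, and the language latent."""
    conditioning = speaker_latent + language_latent  # list concatenation
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0  # crude per-character "content" value
        for _ in range(FRAMES_PER_CHAR):
            frames.append([base + c for c in conditioning])
    return frames


def neural_vocoder(frames: List[List[float]]) -> List[float]:
    """Stand-in vocoder: expands each acoustic frame into waveform samples."""
    waveform = []
    for frame in frames:
        amplitude = math.tanh(sum(frame) / len(frame))  # bounded in (-1, 1)
        for n in range(SAMPLES_PER_FRAME):
            waveform.append(amplitude * math.sin(2 * math.pi * n / SAMPLES_PER_FRAME))
    return waveform


def synthesize(text: str, target_speaker: str, second_language: str) -> List[float]:
    """Chain the four stages in the order the abstract lists them."""
    speaker_latent = speaker_encoder(target_speaker)
    language_latent = language_encoder(second_language)
    features = acoustic_feature_predictor(text, speaker_latent, language_latent)
    return neural_vocoder(features)


if __name__ == "__main__":
    wave = synthesize("hello", target_speaker="speaker_a", second_language="zh-CN")
    print(len(wave), "samples")
```

    In this sketch the text and the language conditioning are passed independently, mirroring the abstract's distinction between the first language of the text input and the second language supplied through the language encoder.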