HIGH FIDELITY SPEECH SYNTHESIS WITH ADVERSARIAL NETWORKS

    Publication (Announcement) No.: US20210089909A1

    Publication (Announcement) Date: 2021-03-25

    Application No.: US17032578

    Filing Date: 2020-09-25

    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output audio examples using a generative neural network. One of the methods includes obtaining a training conditioning text input; processing a training generative input comprising the training conditioning text input using a feedforward generative neural network to generate a training audio output; processing the training audio output using each of a plurality of discriminators, wherein the plurality of discriminators comprises one or more conditional discriminators and one or more unconditional discriminators; determining a first combined prediction by combining the respective predictions of the plurality of discriminators; and determining an update to current values of a plurality of generative parameters of the feedforward generative neural network to increase a first error in the first combined prediction.
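The training step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the toy generator, the two hand-written scoring rules standing in for trained discriminators, the averaging used to combine their predictions, and the finite-difference gradient. The sketch also assumes that the "error" in the combined prediction means how strongly the discriminators rate generated audio as real, so increasing that error corresponds to gradient ascent on the combined score.

```python
import math
import random

def generate(params, text, noise):
    """Toy feedforward generator (hypothetical). params is flat: [w0, b0, w1, b1, ...]."""
    cond = sum(text) / len(text)  # crude summary of the conditioning text input
    return [math.tanh(params[2 * i] * cond + params[2 * i + 1] + noise[i])
            for i in range(len(noise))]

def unconditional_disc(audio):
    """Scores the audio alone; higher means 'looks real'. Toy rule: energy near 0.5."""
    energy = sum(a * a for a in audio) / len(audio)
    return -(energy - 0.5) ** 2

def conditional_disc(audio, text):
    """Scores the audio together with the text. Toy rule: mean amplitude matches the text mean."""
    return -(sum(audio) / len(audio) - sum(text) / len(text)) ** 2

def combined_prediction(audio, text):
    """Combine the discriminators' predictions; a plain average is assumed here."""
    scores = [unconditional_disc(audio), conditional_disc(audio, text)]
    return sum(scores) / len(scores)

def generator_step(params, text, noise, lr=0.1, eps=1e-5):
    """One ascent step on the combined score, using central finite differences."""
    def score(p):
        return combined_prediction(generate(p, text, noise), text)
    grad = []
    for i in range(len(params)):
        up = params[:]
        up[i] += eps
        down = params[:]
        down[i] -= eps
        grad.append((score(up) - score(down)) / (2 * eps))
    # Move the generative parameters so the discriminators are fooled more often.
    return [p + lr * g for p, g in zip(params, grad)]

random.seed(0)
text = [0.2, 0.4, 0.6]                               # stand-in for text features
noise = [random.gauss(0.0, 0.1) for _ in range(8)]   # generative noise input
params = [0.0] * 16                                  # 8 output samples, (w, b) each

before = combined_prediction(generate(params, text, noise), text)
for _ in range(50):
    params = generator_step(params, text, noise)
after = combined_prediction(generate(params, text, noise), text)
print(f"combined score before: {before:.4f}, after: {after:.4f}")
```

In a real system the discriminators would be neural networks trained in alternation with the generator, and the gradient would come from backpropagation instead of finite differences; the sketch keeps the discriminators fixed to isolate the generator update.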
