-
Publication No.: US12288547B2
Publication Date: 2025-04-29
Application No.: US17339834
Filing Date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
IPC: G10L13/047, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
-
Publication No.: US20210383789A1
Publication Date: 2021-12-09
Application No.: US17339834
Filing Date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
IPC: G10L13/047, G06N3/08, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
-
Publication No.: US20240412042A1
Publication Date: 2024-12-12
Application No.: US18698260
Filing Date: 2022-10-06
Applicant: DeepMind Technologies Limited
Inventor: Nikolay Savinov, Junyoung Chung, Mikolaj Binkowski, Aaron Gerard Antonius van den Oord, Erich Konrad Elsen
IPC: G06N3/0455, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences using a non-auto-regressive neural network.
-
Publication No.: US20210089909A1
Publication Date: 2021-03-25
Application No.: US17032578
Filing Date: 2020-09-25
Applicant: DeepMind Technologies Limited
Inventor: Mikolaj Binkowski, Karen Simonyan, Jeffrey Donahue, Aidan Clark, Sander Etienne Lea Dieleman, Erich Konrad Elsen, Luis Carlos Cobo Rus, Norman Casagrande
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output audio examples using a generative neural network. One of the methods includes obtaining a training conditioning text input; processing a training generative input comprising the training conditioning text input using a feedforward generative neural network to generate a training audio output; processing the training audio output using each of a plurality of discriminators, wherein the plurality of discriminators comprises one or more conditional discriminators and one or more unconditional discriminators; determining a first combined prediction by combining the respective predictions of the plurality of discriminators; and determining an update to current values of a plurality of generative parameters of the feedforward generative neural network to increase a first error in the first combined prediction.
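The generator update described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the one-parameter generator, the two hand-written discriminators, averaging as the way of "combining" predictions, and the finite-difference gradient. It only shows the shape of the step: score the generated audio with conditional and unconditional discriminators, combine the scores, and move the generator parameter in the direction that increases the error of the combined prediction.

```python
import math
from typing import List, Sequence

Audio = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def reference(text_ids: Sequence[int]) -> Audio:
    """Hypothetical text-derived signal (stand-in for real conditioning)."""
    return [math.sin(0.1 * t) for t in text_ids]


def generate(text_ids: Sequence[int], gain: float) -> Audio:
    """Toy feedforward generator: one pass, single parameter `gain`."""
    return [gain * r for r in reference(text_ids)]


def unconditional_disc(audio: Audio) -> float:
    """P(real) from the audio alone (toy rule based on mean energy)."""
    energy = sum(a * a for a in audio) / len(audio)
    return sigmoid(energy - 0.5)


def conditional_disc(audio: Audio, text_ids: Sequence[int]) -> float:
    """P(real) given the text (toy rule: agreement with the text signal)."""
    ref = reference(text_ids)
    agreement = sum(a * r for a, r in zip(audio, ref)) / len(audio)
    return sigmoid(agreement)


def combined_error(text_ids: Sequence[int], gain: float) -> float:
    """Error of the combined prediction on generated audio.

    The correct label for generated audio is 0.0, so the error is simply
    the averaged P(real). Averaging is an assumed combination rule.
    """
    audio = generate(text_ids, gain)
    preds = [unconditional_disc(audio), conditional_disc(audio, text_ids)]
    return sum(preds) / len(preds)


def generator_step(text_ids: Sequence[int], gain: float,
                   lr: float = 1.0, eps: float = 1e-4) -> float:
    """One gradient-ascent step on the combined error (central difference)."""
    grad = (combined_error(text_ids, gain + eps)
            - combined_error(text_ids, gain - eps)) / (2 * eps)
    return gain + lr * grad
```

Calling `generator_step([1, 2, 3, 4, 5], 0.5)` returns a slightly larger `gain`, for which `combined_error` is higher than before the step. A real system would use learned neural discriminators and backpropagation rather than fixed rules and finite differences.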
-