-
Publication No.: US12277672B2
Publication date: 2025-04-15
Application No.: US17613694
Filing date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Aidan Clark, Jeffrey Donahue, Karen Simonyan
IPC: G06T3/4046, G06N3/045, G06N3/08
Abstract: The present disclosure proposes the use of a dual discriminator network that comprises a temporal discriminator network for discriminating based on temporal features of a series of images and a spatial discriminator network for discriminating based on spatial features of individual images. The training methods described herein provide improvements in computational efficiency. This is achieved by applying the spatial discriminator network to a set of one or more images that have reduced temporal resolution and applying the temporal discriminator network to a set of images that have reduced spatial resolution. This allows each of the discriminator networks to be applied more efficiently in order to produce a discriminator score for use in training the generator, whilst maintaining accuracy of the discriminator network. In addition, this allows a generator network to be trained to more accurately generate sequences of images, through the use of the improved discriminator.
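The input split described in this abstract can be illustrated with a minimal sketch. It assumes random frame subsampling for the spatial discriminator and average pooling for the temporal discriminator; the abstract only states that one input has reduced temporal resolution and the other reduced spatial resolution, so both choices, and all function names, are hypothetical.

```python
import random

def spatial_input(video, k, rng):
    """Pick k full-resolution frames (assumed way to reduce temporal resolution)."""
    idx = sorted(rng.sample(range(len(video)), k))
    return [video[i] for i in idx]

def temporal_input(video, factor):
    """Keep every frame, average-pool each by `factor` (assumed way to reduce spatial resolution)."""
    pooled_video = []
    for frame in video:
        h, w = len(frame), len(frame[0])
        pooled = [
            [
                sum(frame[y + dy][x + dx]
                    for dy in range(factor) for dx in range(factor)) / factor ** 2
                for x in range(0, w - factor + 1, factor)
            ]
            for y in range(0, h - factor + 1, factor)
        ]
        pooled_video.append(pooled)
    return pooled_video

def pixel_count(frames):
    """Total number of pixels a discriminator would have to process."""
    return sum(len(f) * len(f[0]) for f in frames)

rng = random.Random(0)
T, H, W = 8, 4, 4
# Toy video: frame t is filled with the value t.
video = [[[float(t) for _ in range(W)] for _ in range(H)] for t in range(T)]

s_in = spatial_input(video, k=2, rng=rng)   # 2 frames of 4x4
t_in = temporal_input(video, factor=2)      # 8 frames of 2x2
```

In this toy setting the two discriminator inputs together contain 64 pixels against 128 for the full video, which is the kind of saving the abstract attributes to the dual discriminator.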
-
Publication No.: US20210089909A1
Publication date: 2021-03-25
Application No.: US17032578
Filing date: 2020-09-25
Applicant: DeepMind Technologies Limited
Inventor: Mikolaj Binkowski, Karen Simonyan, Jeffrey Donahue, Aidan Clark, Sander Etienne Lea Dieleman, Erich Konrad Elsen, Luis Carlos Cobo Rus, Norman Casagrande
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output audio examples using a generative neural network. One of the methods includes obtaining a training conditioning text input; processing a training generative input comprising the training conditioning text input using a feedforward generative neural network to generate a training audio output; processing the training audio output using each of a plurality of discriminators, wherein the plurality of discriminators comprises one or more conditional discriminators and one or more unconditional discriminators; determining a first combined prediction by combining the respective predictions of the plurality of discriminators; and determining an update to current values of a plurality of generative parameters of the feedforward generative neural network to increase a first error in the first combined prediction.
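As a rough illustration of combining conditional and unconditional discriminator predictions, the sketch below averages the individual scores. Averaging, the sign convention (higher means "more real"), and the toy discriminators are all assumptions made for illustration; the abstract does not specify how the predictions are combined.

```python
def combined_prediction(audio, text, conditional, unconditional):
    """Average the scores of all discriminators (averaging is an assumption)."""
    scores = [d(audio, text) for d in conditional]   # these see audio and text
    scores += [d(audio) for d in unconditional]      # these see audio only
    return sum(scores) / len(scores)

def generator_objective(combined):
    """Quantity the generator would minimize: lower means the discriminators
    are more wrong about generated audio (hypothetical sign convention)."""
    return -combined

# Toy stand-ins for trained discriminator networks (hypothetical).
conditional = [lambda a, t: 0.5 * len(t) / len(a)]
unconditional = [lambda a: sum(a) / len(a), lambda a: max(a)]

audio = [0.0, 0.5, 1.0, 0.5]
text = "hi"
score = combined_prediction(audio, text, conditional, unconditional)
loss = generator_objective(score)
```

Under this convention, a gradient step that lowers `loss` raises the combined score on generated audio, which corresponds to increasing the discriminators' error.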
-
Publication No.: US12288547B2
Publication date: 2025-04-29
Application No.: US17339834
Filing date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
IPC: G10L13/047, G06N3/04, G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
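To make "temporally aligned" concrete, here is a deliberately simple sketch that repeats each token's feature vector for a given number of time steps. This is a generic duration-based upsampling chosen for illustration, not the alignment network of the disclosure; the durations and the function name are hypothetical.

```python
def align_by_duration(token_features, durations):
    """Repeat each token feature for its duration, yielding one feature per time step."""
    if len(token_features) != len(durations):
        raise ValueError("need exactly one duration per token")
    aligned = []
    for feature, steps in zip(token_features, durations):
        aligned.extend([feature] * steps)
    return aligned

# Three tokens with 2-dimensional features and assumed durations in time steps.
token_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
durations = [2, 1, 3]
aligned = align_by_duration(token_features, durations)
```

The resulting sequence has one feature representation per time step, so its length matches the total duration that the audio output would cover.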
-
Publication No.: US20210383789A1
Publication date: 2021-12-09
Application No.: US17339834
Filing date: 2021-06-04
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
IPC: G10L13/047, G06N3/08, G06N3/04
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
-
Publication No.: US20200372370A1
Publication date: 2020-11-26
Application No.: US16882352
Filing date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a generator neural network and an encoder neural network. The generator neural network generates, based on a set of latent values, data items which are samples of a distribution. The encoder neural network generates a set of latent values for a respective data item. The training method comprises jointly training the generator neural network, the encoder neural network and a discriminator neural network configured to distinguish between samples generated by the generator network and samples of the distribution which are not generated by the generator network. The discriminator neural network is configured to distinguish by processing, by the discriminator neural network, an input pair comprising a sample part and a latent part.
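The input pairs mentioned in this abstract can be sketched as follows. The pairing shown here (a real data item with its encoded latents, a generated sample with the latents it came from) is an assumption that follows the usual convention for jointly trained generator and encoder networks; the toy generator, encoder, and discriminator are hypothetical placeholders, not the networks of the disclosure.

```python
def build_pairs(data_items, latents, generator, encoder):
    """Build (sample part, latent part) pairs for the discriminator."""
    real_pairs = [(x, encoder(x)) for x in data_items]      # sample not from the generator
    generated_pairs = [(generator(z), z) for z in latents]  # sample from the generator
    return real_pairs, generated_pairs

def toy_discriminator(pair):
    """Score a pair jointly from both parts (a dot product, purely illustrative)."""
    sample, latent = pair
    return sum(s * l for s, l in zip(sample, latent))

# Toy linear generator and encoder (hypothetical).
generator = lambda z: [2.0 * v for v in z]
encoder = lambda x: [v / 2.0 for v in x]

real_pairs, generated_pairs = build_pairs(
    data_items=[[2.0, 4.0]], latents=[[1.0, 3.0]],
    generator=generator, encoder=encoder,
)
real_score = toy_discriminator(real_pairs[0])
generated_score = toy_discriminator(generated_pairs[0])
```

Because the discriminator sees both parts of each pair, its training signal reaches the generator and the encoder together, which is what allows the three networks to be trained jointly.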
-
Publication No.: US11875269B2
Publication date: 2024-01-16
Application No.: US16882352
Filing date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Jeffrey Donahue, Karen Simonyan
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a generator neural network and an encoder neural network. The generator neural network generates, based on a set of latent values, data items which are samples of a distribution. The encoder neural network generates a set of latent values for a respective data item. The training method comprises jointly training the generator neural network, the encoder neural network and a discriminator neural network configured to distinguish between samples generated by the generator network and samples of the distribution which are not generated by the generator network. The discriminator neural network is configured to distinguish by processing, by the discriminator neural network, an input pair comprising a sample part and a latent part.
-
Publication No.: US20230350936A1
Publication date: 2023-11-02
Application No.: US18141337
Filing date: 2023-04-28
Applicant: DeepMind Technologies Limited
Inventor: Jean-Baptiste Alayrac, Jeffrey Donahue, Karel Lenc, Karen Simonyan, Malcolm Kevin Campbell Reynolds, Pauline Luc, Arthur Mensch, Iain Barr, Antoine Miech, Yana Elizabeth Hasson, Katherine Elizabeth Millican, Roman Ring
IPC: G06F16/432, G06F40/284, G06F16/438
CPC classification number: G06F16/432, G06F16/438, G06F40/284
Abstract: A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.
-
Publication No.: US20220230276A1
Publication date: 2022-07-21
Application No.: US17613694
Filing date: 2020-05-22
Applicant: DeepMind Technologies Limited
Inventor: Aidan Clark, Jeffrey Donahue, Karen Simonyan
Abstract: The present disclosure proposes the use of a dual discriminator network that comprises a temporal discriminator network for discriminating based on temporal features of a series of images and a spatial discriminator network for discriminating based on spatial features of individual images. The training methods described herein provide improvements in computational efficiency. This is achieved by applying the spatial discriminator network to a set of one or more images that have reduced temporal resolution and applying the temporal discriminator network to a set of images that have reduced spatial resolution. This allows each of the discriminator networks to be applied more efficiently in order to produce a discriminator score for use in training the generator, whilst maintaining accuracy of the discriminator network. In addition, this allows a generator network to be trained to more accurately generate sequences of images, through the use of the improved discriminator.