Invention Grant
- Patent Title: Generating audio data using unaligned text inputs with an adversarial network
- Application No.: US17339834
- Application Date: 2021-06-04
- Publication No.: US12288547B2
- Publication Date: 2025-04-29
- Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- Main IPC: G10L13/047
- IPC: G10L13/047 ; G06N3/04 ; G06N3/08

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
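To make the idea of an "aligned conditioning sequence" concrete, the sketch below shows one way per-token text features could be spread over a fixed number of output time steps. It is an illustration only, not the patented method: the abstract does not say how the alignment is computed, so the Gaussian-style soft interpolation, the function name `align_tokens`, and the parameters `token_lengths` and `temperature` are all assumptions introduced here.

```python
import numpy as np


def align_tokens(token_features, token_lengths, num_steps, temperature=10.0):
    """Interpolate per-token features onto `num_steps` output time steps.

    token_features: (num_tokens, dim) array, one feature vector per text token.
    token_lengths:  (num_tokens,) positive values, a hypothetical predicted
                    duration for each token, measured in output time steps.
    Returns an array of shape (num_steps, dim): one feature representation
    per time step.
    """
    token_features = np.asarray(token_features, dtype=float)
    token_lengths = np.asarray(token_lengths, dtype=float)

    # Place each token on the output time axis: cumulative end position,
    # then the centre of the interval the token occupies.
    ends = np.cumsum(token_lengths)
    centres = ends - 0.5 * token_lengths

    # Positions of the output time steps: 0, 1, ..., num_steps - 1.
    steps = np.arange(num_steps, dtype=float)

    # Score every (step, token) pair by negative squared distance
    # (assumed kernel; a smaller temperature gives sharper alignment).
    logits = -((steps[:, None] - centres[None, :]) ** 2) / temperature

    # Softmax over tokens, stabilised by subtracting the row maximum.
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Each time step receives a weighted mix of token features.
    return weights @ token_features


# Toy usage: three tokens with one-hot features, stretched over 8 time steps.
features = np.eye(3)
aligned = align_tokens(features, token_lengths=[2, 4, 2], num_steps=8)
print(aligned.shape)
```

In this toy setting each output row is a convex combination of the token features, so the sequence has one representation per time step regardless of how many text tokens were supplied. In a trained system the token durations would presumably be predicted by a network rather than given by hand.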
Public/Granted literature
- US20210383789A1 GENERATING AUDIO DATA USING UNALIGNED TEXT INPUTS WITH AN ADVERSARIAL NETWORK Public/Granted day: 2021-12-09