Generating audio data using unaligned text inputs with an adversarial network

Invention Grant

US12288547B2 Generating audio data using unaligned text inputs with an adversarial network (Active)

Patent Title: Generating audio data using unaligned text inputs with an adversarial network
Application No.: US17339834

Application Date: 2021-06-04
Publication No.: US12288547B2

Publication Date: 2025-04-29
Inventor: Jeffrey Donahue, Karen Simonyan, Sander Etienne Lea Dieleman, Mikolaj Binkowski, Erich Konrad Elsen
Applicant: DeepMind Technologies Limited
Applicant Address: GB London
Assignee: DeepMind Technologies Limited
Current Assignee: DeepMind Technologies Limited
Current Assignee Address: GB London
Agency: Fish & Richardson P.C.
Main IPC: G10L13/047
IPC: G10L13/047; G06N3/04; G06N3/08

Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using a generative neural network to convert conditioning text inputs to audio outputs. The generative neural network includes an alignment neural network that is configured to receive a generative input that includes the conditioning text input and to process the generative input to generate an aligned conditioning sequence that comprises a respective feature representation at each of a plurality of first time steps and that is temporally aligned with the audio output.
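The abstract does not say how the alignment neural network produces a sequence that is temporally aligned with the audio output. As a minimal sketch of one possible approach, assume the network predicts a duration for each input token and that per-token features are blended with softmax weights centered on each token's midpoint. The function name `align_tokens`, the `temperature` parameter, and the duration-based scheme itself are illustrative assumptions, not details taken from this record.

```python
import math


def align_tokens(token_features, token_lengths, num_steps, temperature=10.0):
    """Soft-upsample per-token features to num_steps output time steps.

    Hypothetical sketch: token_lengths stands in for durations that an
    alignment network might predict, measured in output time steps.
    Returns a list of num_steps feature vectors (the aligned sequence).
    """
    # Midpoint of each token on the output time axis.
    centers, end = [], 0.0
    for length in token_lengths:
        end += length
        centers.append(end - length / 2.0)

    dim = len(token_features[0])
    aligned = []
    for t in range(num_steps):
        # Softmax over tokens: tokens whose midpoint is near t dominate.
        logits = [-((t - c) ** 2) / temperature for c in centers]
        peak = max(logits)
        weights = [math.exp(x - peak) for x in logits]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        # Weighted blend of token features at this time step.
        aligned.append([
            sum(w * feats[d] for w, feats in zip(weights, token_features))
            for d in range(dim)
        ])
    return aligned


# Three tokens with one-hot features and assumed durations of 2, 3 and 5 steps.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
aligned = align_tokens(features, [2.0, 3.0, 5.0], num_steps=10)
```

In this sketch every output time step receives one feature vector, which matches the abstract's "respective feature representation at each of a plurality of first time steps"; the blending rule itself is only a stand-in.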

Published/Granted documents

US20210383789A1 GENERATING AUDIO DATA USING UNALIGNED TEXT INPUTS WITH AN ADVERSARIAL NETWORK Publication/Grant date: 2021-12-09

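IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/02	.Methods for producing synthetic speech; speech synthesisers
G10L13/04	..Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L13/047	...Architecture of speech synthesisers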