Patent search ap:("DeepMind Technologies Limited") AND inv:"Heiga Zen" Page 1

1.

Granted invention patent
Sample-efficient adaptive text-to-speech [In force]

Publication number: US11355097B2

Publication date: 2022-06-07

Application number: US17061437

Application date: 2020-10-01

Applicant: DeepMind Technologies Limited

Inventor: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas

IPC: G10L13/047, G10L13/033, G10L13/00, G06N3/04, G06N3/08

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.
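The two-stage procedure in the abstract (learn shared parameters plus one embedding per training speaker, then learn only a new embedding for an unseen speaker) can be illustrated with a toy sketch. This is not the patented method: the neural network is replaced by a single scalar weight `w`, each embedding is a single scalar, and the functions `train` and `adapt`, the synthetic data, and all hyperparameters are assumptions made purely for illustration.

```python
# Toy illustration only: a scalar "model" y = w * x + emb[speaker]
# stands in for the neural network described in the abstract.

def train(data, num_speakers, lr=0.1, steps=2000):
    """Jointly learn the shared weight w and one embedding per training speaker."""
    w = 0.0
    emb = [0.0] * num_speakers
    n = len(data)
    for _ in range(steps):
        grad_w = 0.0
        grad_emb = [0.0] * num_speakers
        for speaker, x, y in data:
            err = w * x + emb[speaker] - y      # prediction error
            grad_w += 2 * err * x / n           # d(mean squared error)/dw
            grad_emb[speaker] += 2 * err / n    # d(mean squared error)/d emb
        w -= lr * grad_w
        emb = [e - lr * g for e, g in zip(emb, grad_emb)]
    return w, emb

def adapt(w, adaptation_data, lr=0.1, steps=500):
    """Learn only a new embedding for an unseen speaker; w stays frozen."""
    new_emb = 0.0
    n = len(adaptation_data)
    for _ in range(steps):
        grad = sum(2 * (w * x + new_emb - y) for x, y in adaptation_data) / n
        new_emb -= lr * grad
    return new_emb

# Synthetic multi-speaker training data (hypothetical values):
# x plays the role of a text feature, y the role of an audio feature.
TRUE_W = 2.0
speaker_offsets = [0.5, -1.0, 1.5]
xs = [0.0, 0.5, 1.0, 1.5]
training_data = [
    (s, x, TRUE_W * x + offset)
    for s, offset in enumerate(speaker_offsets)
    for x in xs
]
w, emb = train(training_data, num_speakers=len(speaker_offsets))

# Adaptation data: only two samples from a new speaker (offset 3.0).
adaptation_data = [(0.2, TRUE_W * 0.2 + 3.0), (0.8, TRUE_W * 0.8 + 3.0)]
new_emb = adapt(w, adaptation_data)
print(round(w, 3), [round(e, 3) for e in emb], round(new_emb, 3))
```

Because the shared weight is already fitted on the training speakers, the adaptation step only has to estimate one number for the new speaker, which is why a couple of samples suffice in this toy setting.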

2.

Granted invention patent
Sample-efficient adaptive text-to-speech [In force]

Publication number: US10810993B2

Publication date: 2020-10-20

Application number: US16666043

Application date: 2019-10-28

Applicant: DeepMind Technologies Limited

Inventor: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas

IPC: G10L13/047, G06N3/08, G10L13/033, G10L13/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.

3.

Invention application
SAMPLE-EFFICIENT ADAPTIVE TEXT-TO-SPEECH [In force]

Publication number: US20210020160A1

Publication date: 2021-01-21

Application number: US17061437

Application date: 2020-10-01

Applicant: DeepMind Technologies Limited

Inventor: Yutian Chen, Scott Ellison Reed, Aaron Gerard Antonius van den Oord, Oriol Vinyals, Heiga Zen, Ioannis Alexandros Assael, Brendan Shillingford, Joao Ferdinando Gomes de Freitas

IPC: G10L13/047, G10L13/033, G06N3/08, G10L13/00

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an adaptive audio-generation model. One of the methods includes generating an adaptive audio-generation model including learning a plurality of embedding vectors and parameter values of a neural network using training data comprising first text and audio data representing a plurality of different individual speakers speaking portions of the first text, wherein the plurality of embedding vectors represent respective voice characteristics of the plurality of different individual speakers. The adaptive audio-generation model is adapted for a new individual speaker using adaptation data comprising second text and audio data representing the new individual speaker speaking portions of the second text, the new individual speaker being different from each of the plurality of individual speakers, wherein adapting the audio-generation model includes learning a new embedding vector for the new individual speaker.
