Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Andrew Paul Breen"

11.

Granted invention patent
Text-to-speech (TTS) processing (status: in force)

Publication No.: US10741169B1

Publication date: 2020-08-11

Application No.: US16141241

Application date: 2018-09-25

Applicant: Amazon Technologies, Inc.

Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/08, G10L13/10, G10L25/18, G10L13/06

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

12.

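Published invention application
TEXT-TO-SPEECH (TTS) PROCESSING (status: published, under examination)

Publication No.: US20240296827A1

Publication date: 2024-09-05

Application No.: US18664461

Application date: 2024-05-15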

Applicant: Amazon Technologies, Inc.

Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

13.

Granted invention patent
Text-to-speech (TTS) processing (status: in force)

Publication No.: US11735162B2

Publication date: 2023-08-22

Application No.: US17882691

Application date: 2022-08-08

Applicant: Amazon Technologies, Inc.

Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L25/18, G10L13/06

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

14.

Invention application
SYNTHETIC SPEECH PROCESSING (status: in force)

Publication No.: US20230113297A1

Publication date: 2023-04-13

Application No.: US17836330

Application date: 2022-06-09

Applicant: Amazon Technologies, Inc.

Inventors: Antonio Bonafonte, Panagiotis Agis Oikonomou Filandras, Bartosz Perz, Arent van Korlaar, Ioannis Douratsos, Jonas Felix Ananda Rohnke, Elena Sokolova, Andrew Paul Breen, Nikhil Sharma

IPC: G10L13/10, G10L13/047, G10L15/16, G10L15/18, G10L15/22

Abstract: A speech-processing system receives both text data and natural-understanding data (e.g., a domain, intent, and/or entity) related to a command represented in the text data. The system uses the natural-understanding data to vary vocal characteristics in determining spectrogram data corresponding to the text data based on the natural-understanding data.

15.

Granted invention patent
Text-to-speech processing using input voice characteristic data (status: in force)

Publication No.: US11373633B2

Publication date: 2022-06-28

Application No.: US16586007

Application date: 2019-09-27

Applicant: Amazon Technologies, Inc.

Inventors: Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen, Javier Gonzalez Hernandez, Nishant Prateek

IPC: G10L13/033, G10L13/047, G10L15/18, G10L13/10, G06F40/30

Abstract: During text-to-speech processing, a speech model creates synthesized speech that corresponds to input data. The speech model may include an encoder for encoding the input data into a context vector and a decoder for decoding the context vector into spectrogram data. The speech model may further include a voice decoder that receives vocal characteristic data representing a desired vocal characteristic of synthesized speech. The voice decoder may process the vocal characteristic data to determine configuration data, such as weights, for use by the speech decoder.

16.

Invention application
TEXT-TO-SPEECH PROCESSING (status: in force)

Publication No.: US20210097976A1

Publication date: 2021-04-01

Application No.: US16586007

Application date: 2019-09-27

Applicant: Amazon Technologies, Inc.

Inventors: Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen, Javier Gonzalez Hernandez, Nishant Prateek

IPC: G10L13/10, G10L13/047, G06F17/27, G10L13/033

Abstract: During text-to-speech processing, a speech model creates synthesized speech that corresponds to input data. The speech model may include an encoder for encoding the input data into a context vector and a decoder for decoding the context vector into spectrogram data. The speech model may further include a voice decoder that receives vocal characteristic data representing a desired vocal characteristic of synthesized speech. The voice decoder may process the vocal characteristic data to determine configuration data, such as weights, for use by the speech decoder.

17.

Granted invention patent
Text-to-speech (TTS) processing (status: in force)

Publication No.: US12272350B2

Publication date: 2025-04-08

Application No.: US18664461

Application date: 2024-05-15

Applicant: Amazon Technologies, Inc.

Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

18.

Granted invention patent
Text-to-speech (TTS) processing (status: in force)

Publication No.: US11990118B2

Publication date: 2024-05-21

Application No.: US18206301

Application date: 2023-06-06

Applicant: Amazon Technologies, Inc.

Inventors: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

19.

Granted invention patent
Text-to-speech (TTS) processing (status: in force)

Publication No.: US11763797B2

Publication date: 2023-09-19

Application No.: US16908882

Application date: 2020-06-23

Applicant: Amazon Technologies, Inc.

Inventors: Roberto Barra Chicote, Adam Franciszek Nadolski, Thomas Edward Merritt, Bartosz Putrycz, Andrew Paul Breen

IPC: G10L13/10, G10L13/033, G10L13/00

CPC classification number: G10L13/033, G10L13/00, G10L13/10

Abstract: A speech model includes a sub-model corresponding to a vocal attribute. The speech model generates an output waveform using a sample model, which receives text data, and a conditioning model, which receives text metadata and produces a prosody output for use by the sample model. If, during training or runtime, a different vocal attribute is desired or needed, the sub-model is re-trained or switched to a different sub-model corresponding to the different vocal attribute.

20.

Granted invention patent
Synthetic speech processing (status: in force)

Publication No.: US11367431B2

Publication date: 2022-06-21

Application No.: US16818542

Application date: 2020-03-13

Applicant: Amazon Technologies, Inc.

Inventors: Antonio Bonafonte, Panagiotis Agis Oikonomou Filandras, Bartosz Perz, Arent van Korlaar, Ioannis Douratsos, Jonas Felix Ananda Rohnke, Elena Sokolova, Andrew Paul Breen, Nikhil Sharma

IPC: G10L13/10, G10L13/047, G10L15/16, G10L15/18, G10L15/22

Abstract: A speech-processing system receives both text data and natural-understanding data (e.g., a domain, intent, and/or entity) related to a command represented in the text data. The system uses the natural-understanding data to vary vocal characteristics in determining spectrogram data corresponding to the text data based on the natural-understanding data.
