Patent search ap:("Amazon Technologies, Inc.") AND inv:"Viacheslav Klimkov"

1.

Granted invention patent
Contextual text-to-speech processing (status: active)

Publication number: US10475438B1

Publication date: 2019-11-12

Application number: US15447919

Application date: 2017-03-02

Applicant: Amazon Technologies, Inc.

Inventor: Roberto Barra Chicote, Javier Latorre, Adam Franciszek Nadolski, Viacheslav Klimkov, Thomas Edward Merritt

IPC: G10L13/10, G10L13/033, G10L13/047

Abstract: A text-to-speech (TTS) system that is capable of considering characteristics of various portions of text data in order to create continuity between segments of synthesized speech. The system can analyze text portions of a work and create feature vectors including data corresponding to characteristics of the individual portions and/or the overall work. A TTS processing component can then consider feature vector(s) from other portions when performing TTS processing on text of a first portion, thus giving the TTS component some intelligence regarding other portions of the work, which can then result in more continuity between synthesized speech segments.

2.

Granted invention patent
Synthetic speech processing (status: active)

Publication number: US12154544B1

Publication date: 2024-11-26

Application number: US17205493

Application date: 2021-03-18

Applicant: Amazon Technologies, Inc.

Inventor: Michal Czuczman, You Wang, Masaki Noguchi, Viacheslav Klimkov

IPC: G10L13/08, G10L15/05, G10L15/16, G10L15/187

Abstract: A speech-processing system receives input data representing text. An encoder processes segments of the text to determine embedding data, and a decoder processes the embedding data to determine one or more categories associated with each segment. Output data is determined by selecting words based on the segments and categories.

3.

Granted invention patent
Text-to-speech (TTS) processing (status: active)

Publication number: US12272350B2

Publication date: 2025-04-08

Application number: US18664461

Application date: 2024-05-15

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

4.

Granted invention patent
Text-to-speech (TTS) processing (status: active)

Publication number: US11990118B2

Publication date: 2024-05-21

Application number: US18206301

Application date: 2023-06-06

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

5.

Granted invention patent
Text-to-speech (TTS) processing with transfer of vocal characteristics (status: active)

Publication number: US11410684B1

Publication date: 2022-08-09

Application number: US16430894

Application date: 2019-06-04

Applicant: Amazon Technologies, Inc.

Inventor: Viacheslav Klimkov, Thomas Renaud Drugman, Alexander Galkin, Srikanth Ronanki

IPC: G10L13/00, G10L25/78, G10L13/027, G10L15/16, G10L15/187, G06F16/38, G06N3/08, G06N20/20, G06F17/18, G06N3/04, G10L13/04, G10L13/033, G10L13/07

Abstract: Audio data from a first, source speaker is received and processed to determine linguistic units and vocal characteristics corresponding to those linguistic units. The linguistic units may either be determined from received text data or may be determined from the audio data using automatic speech recognition. A model is trained using training data from a second, target speaker. The trained model concatenates the linguistic units with the vocal characteristics to produce output speech that has the “voice” of the target speaker and the vocal characteristics of the source speaker.

6.

Published invention application
TEXT-TO-SPEECH (TTS) PROCESSING (status: under examination, published)

Publication number: US20240296827A1

Publication date: 2024-09-05

Application number: US18664461

Application date: 2024-05-15

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L13/06, G10L25/18

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

7.

Granted invention patent
Text-to-speech (TTS) processing (status: active)

Publication number: US11735162B2

Publication date: 2023-08-22

Application number: US17882691

Application date: 2022-08-08

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L25/18, G10L13/06

CPC classification number: G10L13/10, G10L13/06, G10L25/18

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

8.

Invention application
CONTEXTUAL TEXT-TO-SPEECH PROCESSING (status: under examination, published)

Publication number: US20200152169A1

Publication date: 2020-05-14

Application number: US16665886

Application date: 2019-10-28

Applicant: Amazon Technologies, Inc.

Inventor: Roberto Barra Chicote, Javier Latorre, Adam Franciszek Nadolski, Viacheslav Klimkov, Thomas Edward Merritt

IPC: G10L13/10, G10L13/033, G10L13/047

Abstract: A text-to-speech (TTS) system that is capable of considering characteristics of various portions of text data in order to create continuity between segments of synthesized speech. The system can analyze text portions of a work and create feature vectors including data corresponding to characteristics of the individual portions and/or the overall work. A TTS processing component can then consider feature vector(s) from other portions when performing TTS processing on text of a first portion, thus giving the TTS component some intelligence regarding other portions of the work, which can then result in more continuity between synthesized speech segments.

9.

Published invention application
TEXT-TO-SPEECH (TTS) PROCESSING (status: under examination, published)

Publication number: US20240013770A1

Publication date: 2024-01-11

Application number: US18206301

Application date: 2023-06-06

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/047

CPC classification number: G10L13/047

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.

10.

Invention application
TEXT-TO-SPEECH (TTS) PROCESSING (status: active)

Publication number: US20230058658A1

Publication date: 2023-02-23

Application number: US17882691

Application date: 2022-08-08

Applicant: Amazon Technologies, Inc.

Inventor: Jaime Lorenzo Trueba, Thomas Renaud Drugman, Viacheslav Klimkov, Srikanth Ronanki, Thomas Edward Merritt, Andrew Paul Breen, Roberto Barra-Chicote

IPC: G10L13/10, G10L25/18, G10L13/06

Abstract: During text-to-speech processing, a speech model creates output audio data, including speech, that corresponds to input text data that includes a representation of the speech. A spectrogram estimator estimates a frequency spectrogram of the speech; the corresponding frequency-spectrogram data is used to condition the speech model. A plurality of acoustic features corresponding to different segments of the input text data, such as phonemes, syllable-level features, and/or word-level features, may be separately encoded into context vectors; the spectrogram estimator uses these separate context vectors to create the frequency spectrogram.
