Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Bartosz Putrycz"

1.

Granted invention patent
Text-to-speech (TTS) processing (in force)

Publication number: US10706837B1

Publication date: 2020-07-07

Application number: US16007811

Filing date: 2018-06-13

Applicant: Amazon Technologies, Inc.

Inventor: Roberto Barra Chicote, Adam Franciszek Nadolski, Thomas Edward Merritt, Bartosz Putrycz, Andrew Paul Breen

IPC: G10L13/033, G10L13/04, G10L13/10

Abstract: A speech model includes a sub-model corresponding to a vocal attribute. The speech model generates an output waveform using a sample model, which receives text data, and a conditioning model, which receives text metadata and produces a prosody output for use by the sample model. If, during training or runtime, a different vocal attribute is desired or needed, the sub-model is re-trained or switched to a different sub-model corresponding to the different vocal attribute.

2.

Granted invention patent
Text-to-speech task scheduling (in force)

Publication number: US09734817B1

Publication date: 2017-08-15

Application number: US14221985

Filing date: 2014-03-21

Applicant: Amazon Technologies, Inc.

Inventor: Bartosz Putrycz

IPC: G10L13/00, G10L13/08

CPC classification number: G10L13/00, G10L13/04, G10L13/08

Abstract: To prioritize the processing of text-to-speech (TTS) tasks, a TTS system may determine, for each task, an amount of time prior to the task reaching underrun, that is, the time before the synthesized speech output to a user catches up to the time since a TTS task was originated. The TTS system may also prioritize tasks to reduce the amount of time between when a user submits a TTS request and when results are delivered to the user. When prioritizing tasks, such as allocating resources to existing tasks or accepting new tasks, the TTS system may prioritize tasks with the lowest amount of time prior to underrun and/or tasks with the longest time prior to delivery of first results.
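
The prioritization described in this abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `TtsTask` fields, the helper names, and the choice to measure time to underrun as seconds of synthesized audio minus seconds already played back.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TtsTask:
    # All field names are hypothetical, chosen for this sketch only.
    name: str
    submitted_at: float              # when the request arrived (seconds)
    first_audio_at: Optional[float]  # when first audio was delivered, or None
    audio_synthesized: float         # seconds of audio generated so far


def time_to_underrun(task: TtsTask, now: float) -> float:
    """Seconds of buffered audio left before playback catches up with synthesis."""
    played = now - task.first_audio_at
    return task.audio_synthesized - played


def rank_tasks(tasks: List[TtsTask], now: float) -> Tuple[List[TtsTask], List[TtsTask]]:
    """Return (running, waiting) lists, each ordered most urgent first.

    Running tasks: lowest time to underrun first.
    Waiting tasks (no audio delivered yet): longest wait first.
    """
    running = [t for t in tasks if t.first_audio_at is not None]
    waiting = [t for t in tasks if t.first_audio_at is None]
    running.sort(key=lambda t: time_to_underrun(t, now))
    waiting.sort(key=lambda t: now - t.submitted_at, reverse=True)
    return running, waiting
```

How the two rankings are merged into a single resource-allocation decision is left open here, since the abstract only says the criteria may be used together or separately ("and/or").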

3.

Granted invention patent
Text-to-speech (TTS) processing (in force)

Publication number: US11763797B2

Publication date: 2023-09-19

Application number: US16908882

Filing date: 2020-06-23

Applicant: Amazon Technologies, Inc.

Inventor: Roberto Barra Chicote, Adam Franciszek Nadolski, Thomas Edward Merritt, Bartosz Putrycz, Andrew Paul Breen

IPC: G10L13/10, G10L13/033, G10L13/00

CPC classification number: G10L13/033, G10L13/00, G10L13/10

Abstract: A speech model includes a sub-model corresponding to a vocal attribute. The speech model generates an output waveform using a sample model, which receives text data, and a conditioning model, which receives text metadata and produces a prosody output for use by the sample model. If, during training or runtime, a different vocal attribute is desired or needed, the sub-model is re-trained or switched to a different sub-model corresponding to the different vocal attribute.

4.

Granted invention patent
Text-to-speech task scheduling (in force)

Publication number: US10546573B1

Publication date: 2020-01-28

Application number: US15673838

Filing date: 2017-08-10

Applicant: Amazon Technologies, Inc.

Inventor: Bartosz Putrycz

IPC: G10L13/00, G10L13/04, G10L13/08

Abstract: To prioritize the processing of text-to-speech (TTS) tasks, a TTS system may determine, for each task, an amount of time prior to the task reaching underrun, that is, the time before the synthesized speech output to a user catches up to the time since a TTS task was originated. The TTS system may also prioritize tasks to reduce the amount of time between when a user submits a TTS request and when results are delivered to the user. When prioritizing tasks, such as allocating resources to existing tasks or accepting new tasks, the TTS system may prioritize tasks with the lowest amount of time prior to underrun and/or tasks with the longest time prior to delivery of first results.

5.

Granted invention patent
Text-to-speech (TTS) processing (in force)

Publication number: US10699695B1

Publication date: 2020-06-30

Application number: US16023370

Filing date: 2018-06-29

Applicant: Amazon Technologies, Inc.

Inventor: Adam Franciszek Nadolski, Daniel Korzekwa, Thomas Edward Merritt, Marco Nicolis, Bartosz Putrycz, Roberto Barra Chicote, Rafal Kuklinski, Wiktor Dolecki

IPC: G10L13/10, G10L13/06, G10L13/047

Abstract: During text-to-speech processing, audio data corresponding to a word part, word, or group of words is generated using a trained model and used by a unit selection engine to create output audio. The audio data is generated at least when an input word is unrecognized or when a cost of a unit selection is too high.
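
The fallback condition in this abstract (generate audio when a word is unrecognized or when the unit selection cost is too high) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit database layout, the `generate` callable standing in for the trained model, and the `max_cost` threshold are all hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, Tuple


def choose_audio(
    word: str,
    unit_db: Dict[str, Tuple[str, float]],  # hypothetical: word -> (unit id, selection cost)
    generate: Callable[[str], str],         # hypothetical stand-in for the trained model
    max_cost: float,                        # hypothetical cost threshold
) -> Tuple[str, str]:
    """Return (source, audio), where source is "unit" or "generated"."""
    entry = unit_db.get(word)
    if entry is None:
        # The word is not in the unit database: fall back to the model.
        return ("generated", generate(word))
    unit, cost = entry
    if cost > max_cost:
        # The best available unit is too costly: fall back to the model.
        return ("generated", generate(word))
    return ("unit", unit)
```

In the abstract the generated audio is handed back to the unit selection engine, which assembles the final output; the sketch stops at deciding where each piece of audio comes from.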

6.

Granted invention patent
Text-to-speech (TTS) processing (in force)

Publication number: US10692484B1

Publication date: 2020-06-23

Application number: US16007757

Filing date: 2018-06-13

Applicant: Amazon Technologies, Inc.

Inventor: Thomas Edward Merritt, Adam Franciszek Nadolski, Nishant Prateek, Bartosz Putrycz, Roberto Barra Chicote, Vatsal Aggarwal, Andrew Paul Breen

IPC: G10L13/04, G10L13/08, G10L25/24, G10L25/60, G10L13/047

Abstract: A speech model is trained using multi-task learning. A first task may correspond to how well predicted audio matches training audio; a second task may correspond to a metric of perceived audio quality. The speech model may include, during training, layers related to the second task that are discarded at runtime.
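
As a rough illustration of the multi-task setup in this abstract, the Python sketch below combines an audio-matching loss with a perceived-quality loss and drops the quality-related part of the model for runtime. The squared-error losses, the `quality_weight` parameter, and the dictionary layout with a `quality_head` key are assumptions made for this sketch, not details from the patent.

```python
from typing import Dict, List


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(
    pred_audio: List[float],
    target_audio: List[float],
    pred_quality: float,
    target_quality: float,
    quality_weight: float = 0.5,  # hypothetical weighting between the two tasks
) -> float:
    """First task: match the training audio. Second task: match a quality metric."""
    audio_loss = mse(pred_audio, target_audio)
    quality_loss = (pred_quality - target_quality) ** 2
    return audio_loss + quality_weight * quality_loss


def strip_for_runtime(model: Dict[str, object]) -> Dict[str, object]:
    """Drop the training-only quality layers (hypothetical key "quality_head")."""
    return {name: part for name, part in model.items() if name != "quality_head"}
```

For example, `multitask_loss([1.0, 2.0], [1.0, 4.0], 3.0, 4.0)` gives an audio loss of 2.0 plus half of a quality loss of 1.0.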
