SPEECH SYNTHESIS UNIT SELECTION
    1.
    Published application

    Publication No.: EP3376498A1

    Publication date: 2018-09-19

    Application No.: EP18160557.7

    Filing date: 2018-03-07

    Applicant: Google LLC

    IPC classification: G10L13/06 G10L13/07 G10L13/08

    Abstract: A method of selecting units for speech synthesis includes receiving, by one or more computers of a text-to-speech system, data indicating text for speech synthesis; determining, by the one or more computers, a sequence of text units that each represent a respective portion of the text, the sequence of text units including at least a first text unit followed by a second text unit; determining, by the one or more computers, multiple paths of speech units that each represent the sequence of text units, wherein determining the multiple paths of speech units includes: selecting, from a speech unit corpus, a first speech unit that includes speech synthesis data representing the first text unit; selecting, from the speech unit corpus, multiple second speech units including speech synthesis data representing the second text unit, each of the multiple second speech units being determined based on (i) a join cost to concatenate the second speech unit with a first speech unit and (ii) a target cost indicating a degree that the second speech unit corresponds to the second text unit; and defining paths from the selected first speech unit to each of the multiple second speech units to include in the multiple paths of speech units; and providing, by the one or more computers of the text-to-speech system, synthesized speech data according to a path selected from among the multiple paths.
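
    The path-building step in this abstract can be illustrated with a small sketch. It is not the patented implementation: the `SpeechUnit` fields, the two cost functions, and the parameter `k` are hypothetical stand-ins, and the costs are summed without weighting purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechUnit:
    text_unit: str      # label of the text unit this recording represents
    pitch_start: float  # toy acoustic feature at the left edge (Hz)
    pitch_end: float    # toy acoustic feature at the right edge (Hz)
    duration: float     # length in seconds


def join_cost(left, right):
    # Assumed measure: pitch jump at the concatenation point.
    return abs(left.pitch_end - right.pitch_start)


def target_cost(unit, wanted_duration):
    # Assumed measure: distance from the duration the text unit calls for.
    return abs(unit.duration - wanted_duration)


def expand_paths(first, corpus, second_text_unit, wanted_duration, k=2):
    """Return up to k (cost, path) pairs that extend `first` by one unit."""
    scored = [
        (join_cost(first, u) + target_cost(u, wanted_duration), u)
        for u in corpus
        if u.text_unit == second_text_unit
    ]
    scored.sort(key=lambda pair: pair[0])  # cheapest candidates first
    return [(cost, (first, unit)) for cost, unit in scored[:k]]


corpus = [
    SpeechUnit("a", 100.0, 110.0, 0.10),
    SpeechUnit("b", 112.0, 120.0, 0.08),
    SpeechUnit("b", 150.0, 160.0, 0.10),
    SpeechUnit("b", 108.0, 115.0, 0.20),
]

# Keep the two cheapest continuations of the first unit as separate paths,
# then pick the path with the lowest total cost for synthesis.
paths = expand_paths(corpus[0], corpus, "b", wanted_duration=0.10, k=2)
best_cost, best_path = min(paths, key=lambda p: p[0])
```

    Keeping several second units per first unit, rather than only the cheapest one, is what lets a later unit with a better join rescue a path that looked slightly worse at this step.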

    Voice synthesis apparatus
    3.
    Published application
    Status: in force
    German title: Gerät und Programm zur Synthese eines Sprachsignals

    Publication No.: EP2530672A2

    Publication date: 2012-12-05

    Application No.: EP12170129.6

    Filing date: 2012-05-31

    Inventor: Saino, Keijiro

    IPC classification: G10L13/06

    Abstract: An apparatus is designed for synthesizing a voice signal using a plurality of phonetic piece data each indicating a phonetic piece which contains at least two phoneme sections corresponding to different phonemes. In the apparatus, a phonetic piece adjustor forms a target section from a first phonetic piece and a second phonetic piece so as to connect the first phonetic piece and the second phonetic piece to each other such that the target section is formed of a rear phoneme section of the first phonetic piece and a front phoneme section of the second phonetic piece, and expands the target section by a target time length to form an adjustment section such that a central part of the target section is expanded at an expansion rate higher than that of a front part and a rear part of the target section, to thereby create synthesized phonetic piece data of the adjustment section having the target time length. A voice synthesizer creates a voice signal from the synthesized phonetic piece data created by the phonetic piece adjustor.
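
    The non-uniform expansion can be sketched as a time warp over frame indices. The abstract does not give the warp, so the sinusoidal `warp` function and its parameter `a` below are assumptions chosen only because they stretch the centre more than the edges.

```python
import math


def warp(t, a=0.5):
    """Map normalized output time t in [0, 1] to normalized input time.

    The slope is 1 + a*cos(2*pi*t): smallest at t = 0.5, so the centre of
    the input is read most slowly, which means it is stretched the most.
    Requires 0 <= a < 1 so the mapping stays monotonic.
    """
    return t + a * math.sin(2 * math.pi * t) / (2 * math.pi)


def expand_section(frames, target_len, a=0.5):
    """Stretch `frames` to `target_len` frames by repeating input frames."""
    if target_len < 2 or len(frames) < 2:
        raise ValueError("need at least two input and two output frames")
    last = len(frames) - 1
    out = []
    for j in range(target_len):
        t = j / (target_len - 1)
        out.append(frames[round(warp(t, a) * last)])  # nearest input frame
    return out


# Ten input frames (labelled 0..9) stretched to thirty output frames.
stretched = expand_section(list(range(10)), 30)
```

    In this toy run the central frames are repeated more often than the first and last frames, which is the property the abstract describes. A real system would interpolate spectral frames rather than repeat them.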

    Concatenation of voice signals
    6.
    Granted patent
    Status: in force

    Publication No.: EP1403851B1

    Publication date: 2009-09-09

    Application No.: EP02738817.2

    Filing date: 2002-06-27

    IPC classification: G10L13/06

    CPC classification: G10L13/07

    Abstract: A signal coupling method and a signal coupling apparatus are provided that can create naturally combined speech with reduced noise. The signal coupling method (or apparatus) couples a plurality of waveform signals to create a combined waveform signal by a step (or means) for deciding the upper limit frequency of each frequency spectrum of the plurality of waveform signals and a step (or means) for filtering at least a coupled portion of each waveform signal by a predetermined cut-off frequency characteristic based on the decided upper limit frequency. Here, the filtering cut-off frequency is set to an upper limit frequency of a waveform signal preceding or following the coupled portion of the waveform signal having a higher upper limit frequency. Accordingly, a higher harmonic component generated by discontinuous change of the coupled portion of the waveform signals is effectively removed, thereby significantly reducing the noise of the combined waveform signal.
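
    One reading of this abstract is that the joint is low-pass filtered at the lower of the two neighbouring upper limit frequencies. The sketch below follows that reading; the windowed-sinc filter, the `region` size, and the assumption that the upper limits are already known are all illustrative choices, not details from the patent.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate, taps=101):
    """Hamming-windowed sinc low-pass filter with unity gain at DC."""
    fc = cutoff_hz / sample_rate  # cutoff in cycles per sample
    mid = (taps - 1) / 2
    h = []
    for n in range(taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(ideal * window)
    total = sum(h)
    return [c / total for c in h]


def filter_same(signal, h):
    """Centered convolution; output has the same length as the input."""
    half = len(h) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(h):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += c * signal[j]
        out.append(acc)
    return out


def smooth_joint(a, b, limit_a_hz, limit_b_hz, sample_rate, region=50):
    """Join a and b, low-pass filtering `region` samples either side of the joint."""
    cutoff = min(limit_a_hz, limit_b_hz)  # assumed reading of the cutoff rule
    joined = a + b
    start = max(0, len(a) - region)
    stop = min(len(joined), len(a) + region)
    filtered = filter_same(joined, lowpass_fir(cutoff, sample_rate))
    # Only the coupled portion is replaced; a real system would also
    # cross-fade at the edges of the filtered region.
    return joined[:start] + filtered[start:stop] + joined[stop:]
```

    The pure-Python convolution is slow and is meant only to show the steps; the filter length and region size would need tuning for real audio.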

    Method and apparatus for speech synthesis without prosody modification
    7.
    Published application
    Status: in force

    Publication No.: EP1777697A3

    Publication date: 2008-06-18

    Application No.: EP07002565.5

    Filing date: 2001-12-03

    Inventors: Chu, Min; Peng, Hu

    IPC classification: G10L13/08 G10L13/06

    CPC classification: G10L13/07 G10L13/04

    Abstract: A speech synthesizer is provided that concatenates stored samples of speech units without modifying the prosody of the samples. The present invention is able to achieve a high level of naturalness in synthesized speech with a carefully designed training speech corpus by storing samples based on the prosodic and phonetic context in which they occur. In particular, some embodiments of the present invention limit the training text to those sentences that will produce the most frequent sets of prosodic contexts for each speech unit. Further embodiments of the present invention also provide a multi-tier selection mechanism for selecting a set of samples that will produce the most natural-sounding speech.

    Text to speech synthesis
    8.
    Published application
    Status: in force

    Publication No.: EP1835488A1

    Publication date: 2007-09-19

    Application No.: EP06111290.0

    Filing date: 2006-03-17

    Applicant: Svox AG

    IPC classification: G10L13/06 G10L13/02

    CPC classification: G10L13/033 G10L13/07

    Abstract: An input linguistic description is converted into a speech waveform by deriving at least one target unit sequence corresponding to the linguistic description, selecting from a waveform unit database, for the target unit sequences, a plurality of alternative unit sequences approximating the target unit sequences, concatenating the alternative unit sequences into alternative speech waveforms, and having an operating person choose one of the alternative speech waveforms. There are no iterative cycles of manual modification and automatic selection, which enables a fast way of working. The operator does not need knowledge of units, targets, and costs, but chooses from a set of given alternatives. The fine-tuning of TTS prompts therefore becomes accessible to non-experts.
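
    The idea of offering several ranked alternatives to an operator can be sketched as an n-best search. The exhaustive enumeration, the `(name, pitch)` unit tuples, and the `pitch_jump_cost` function are hypothetical simplifications; the abstract does not say how the alternatives are generated or scored.

```python
import heapq
import itertools


def n_best_sequences(candidates_per_target, sequence_cost, n=3):
    """Return the n cheapest unit sequences, one candidate per target unit.

    Enumerates every combination, so it is only suitable for tiny examples.
    """
    all_sequences = itertools.product(*candidates_per_target)
    return heapq.nsmallest(n, all_sequences, key=sequence_cost)


def pitch_jump_cost(sequence):
    # Assumed cost: sum of pitch differences between neighbouring units.
    return sum(abs(a[1] - b[1]) for a, b in zip(sequence, sequence[1:]))


# Each unit is a (name, pitch in Hz) pair; two candidates per target unit.
candidates = [
    [("h1", 100), ("h2", 120)],
    [("i1", 105), ("i2", 118)],
]
alternatives = n_best_sequences(candidates, pitch_jump_cost, n=3)

# The operator would listen to each alternative and pick one by index.
chosen = alternatives[0]
```

    The operator only sees the list of rendered alternatives, never the costs, which matches the abstract's point that no knowledge of units, targets, or costs is needed.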

    FAST WAVEFORM SYNCHRONIZATION FOR CONCATENATION AND TIME-SCALE MODIFICATION OF SPEECH
    9.
    Granted patent
    Status: in force

    Publication No.: EP1319227B1

    Publication date: 2007-03-14

    Application No.: EP01970936.9

    Filing date: 2001-09-14

    IPC classification: G10L11/00

    CPC classification: G10L21/04 G10L13/07

    Abstract: A synthesis method for concatenative speech synthesis is provided for efficiently concatenating waveform segments in the time-domain. A digital waveform provider produces an input sequence of digital waveform segments. A waveform concatenator concatenates the input segments by using waveform blending within a concatenation zone to synchronize, weight, and overlap-add selected portions of the input segments to produce a single digital waveform. The synchronizing includes determining a minimum weighted energy anchor in the selected portion of each input segment and aligning synchronization peaks in a local vicinity of each anchor.
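
    The synchronize, weight, and overlap-add steps can be sketched as follows. The abstract does not define the energy weighting or the peak criterion, so the Hann-weighted frame energy, the "largest sample" peak rule, and the parameters `zone`, `radius`, and `half` are assumptions made for illustration.

```python
import math


def min_weighted_energy_anchor(x, start, stop, frame=8):
    """Centre of the frame in x[start:stop] with the lowest weighted energy."""
    # Assumed weighting: a Hann-shaped window over each frame.
    weights = [0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / (frame + 1))
               for k in range(frame)]
    best_index, best_energy = start, float("inf")
    for i in range(start, stop - frame + 1):
        energy = sum(w * x[i + k] ** 2 for k, w in enumerate(weights))
        if energy < best_energy:
            best_index, best_energy = i, energy
    return best_index + frame // 2


def sync_peak(x, anchor, radius=10):
    """Index of the largest sample in a full window near the anchor.

    The window is shifted inward at the segment edges so it always spans
    2*radius + 1 samples; x must be at least that long.
    """
    center = min(max(anchor, radius), len(x) - 1 - radius)
    return max(range(center - radius, center + radius + 1), key=lambda i: x[i])


def concatenate(left, right, zone=40, half=5):
    """Join two segments by aligning peaks and cross-fading around them."""
    p_left = sync_peak(left, min_weighted_energy_anchor(left, len(left) - zone, len(left)))
    p_right = sync_peak(right, min_weighted_energy_anchor(right, 0, zone))
    h = min(half, p_left, len(left) - p_left, p_right, len(right) - p_right)
    blended = []
    for i in range(2 * h):
        w = i / (2 * h)  # linear cross-fade weight for the right segment
        blended.append((1 - w) * left[p_left - h + i] + w * right[p_right - h + i])
    return left[:p_left - h] + blended + right[p_right + h:]
```

    Because the two peaks are placed at the same output position before blending, two segments of the same periodic waveform with different phases join without a phase jump.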