Text-to-speech technology with early emission
    Document type: Published patent application
    Legal status: In force

    Publication number: EP2474972A1

    Publication date: 2012-07-11

    Application number: EP11150490.8

    Filing date: 2011-01-10

    Applicant: Svox AG

    IPC classification: G10L13/06 G10L15/12

    CPC classification: G10L13/06 G10L15/12

    Abstract: The method creates a speech output from a succession of input linguistic target elements, each with target characteristics. The speech output is formed by concatenating a sequence of selected waveform units, each selected waveform unit corresponding to one input linguistic target element. The method repeats an iterative sequence of forward steps, backward steps and creation of speech output until the forward steps have reached the final target element. For a succession of target elements starting with an initial target element and ending with a final target element, the method emits the same optimal sequence of selected waveform units as the standard Viterbi search, but the optimal units become available in a pipelined manner, without requiring the calculation of path costs for the final target element and without complete backtracking from the final to the initial target element. The latency, i.e. the amount of computation time before selected waveform units for a beginning part of the target sequence are output, is much shorter than in a Viterbi search.
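
The forward/backward iteration described in the abstract can be illustrated with a minimal sketch. The abstract does not give the claimed procedure in detail, so the code below is an assumption: it uses a common convergence test, in which a unit is emitted as soon as the back-pointers of all surviving paths meet in a single candidate. The names `early_emission_search`, `_trace`, `target_cost` and `join_cost` are hypothetical and do not come from the patent.

```python
from typing import Callable, Iterator, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a waveform unit (any object the cost functions accept)


def _trace(candidates, back, s, j, stop):
    """Follow back-pointers from candidate j of element s down to element `stop`."""
    path = []
    for k in range(s, stop - 1, -1):
        path.append((k, candidates[k][j]))
        j = back[k][j]
    return reversed(path)


def early_emission_search(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> Iterator[Tuple[int, U]]:
    """Yield (element index, selected unit) pairs as soon as they are certain.

    Assumes every target element has at least one candidate unit.
    """
    n = len(candidates)
    cost: List[float] = []       # best path cost ending in each current candidate
    back: List[List[int]] = []   # back[t][j]: best predecessor index at element t-1
    emitted = 0                  # number of target elements already emitted

    for t in range(n):
        # Forward step: extend the best paths by one target element.
        if t == 0:
            cost = [target_cost(0, u) for u in candidates[0]]
            back.append([-1] * len(candidates[0]))
        else:
            prev = candidates[t - 1]
            new_cost, pointers = [], []
            for u in candidates[t]:
                totals = [cost[i] + join_cost(p, u) for i, p in enumerate(prev)]
                best = min(range(len(prev)), key=totals.__getitem__)
                new_cost.append(totals[best] + target_cost(t, u))
                pointers.append(best)
            cost = new_cost
            back.append(pointers)

        # Backward step: walk all surviving paths back until they merge,
        # but never further than the first element not yet emitted.
        alive = set(range(len(candidates[t])))
        s = t
        while len(alive) > 1 and s > emitted:
            alive = {back[s][j] for j in alive}
            s -= 1

        # Emission: if every surviving path passes through one unit at
        # element s, the units for elements emitted..s can no longer change.
        if len(alive) == 1:
            yield from _trace(candidates, back, s, alive.pop(), emitted)
            emitted = s + 1

    # After the final forward step, resolve whatever is still undecided
    # by picking the cheapest final candidate, as a standard Viterbi would.
    if emitted < n:
        best_final = min(range(len(cost)), key=cost.__getitem__)
        yield from _trace(candidates, back, n - 1, best_final, emitted)
```

Because the function is a generator, a caller could start concatenating and playing the first units while later target elements are still being searched. Only the trailing, not yet converged part of the sequence waits for the final path costs.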
