Speech engine for analyzing symbolic text and producing the speech
equivalent thereof
    1.
    Granted invention patent
    Status: no longer in force

    Publication No.: US5852802A

    Publication date: 1998-12-22

    Application No.: US847246

    Filing date: 1997-05-01

    Abstract: A speech engine for producing synthetic speech from an input in conventional orthography. The speech engine analyses the input data into small elements which are used to produce the synthetic speech. The analysis is carried out with the aid of a skeletal database 11 and a plurality of symbolic processors 12-16, each of which is adapted to perform one linguistic task. Each processor 13-16 obtains its data from the database 11 (processor 12 obtains its data from an input buffer 10). Each processor returns its results to the database 11. The database 11 is organised in accordance with the linguistic structures so that the results and intermediate results are not only stored but the linguistic relationships are also available. Preferably the database 11 is formed of a plurality of storage modules (1/1-5/7), each of which has an address. Each module has a register 100 which holds an item of data being either an intermediary or final result. In addition, each module contains addresses of related modules 101, 102, 103 whereby the linguistic structure of the sentence is defined.
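
The storage layout described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the abstract only says that each module has an address, a data register (100) and three address fields for related modules (101, 102, 103). Reading those three fields as parent, first child and next sibling is an assumption made here, as are the names `Module`, `SkeletalDatabase` and `split_words`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed address format (level, index), loosely mirroring labels such as 1/1 or 5/7.
Address = Tuple[int, int]


@dataclass
class Module:
    address: Address
    register: Optional[str] = None          # item of data: intermediate or final result
    parent: Optional[Address] = None        # assumed meaning of one related-module field
    first_child: Optional[Address] = None   # assumed meaning of a second field
    next_sibling: Optional[Address] = None  # assumed meaning of a third field


class SkeletalDatabase:
    """Hypothetical store of addressed modules linked by linguistic relationship."""

    def __init__(self) -> None:
        self.modules: Dict[Address, Module] = {}
        self._count: Dict[int, int] = {}

    def add(self, level: int, data: str, parent: Optional[Address] = None) -> Address:
        # Allocate the next free address on this level and store the data item.
        index = self._count.get(level, 0) + 1
        self._count[level] = index
        addr = (level, index)
        self.modules[addr] = Module(addr, data, parent)
        if parent is not None:
            p = self.modules[parent]
            if p.first_child is None:
                p.first_child = addr
            else:
                # Walk the sibling chain and append the new module at its end.
                last = p.first_child
                while self.modules[last].next_sibling is not None:
                    last = self.modules[last].next_sibling
                self.modules[last].next_sibling = addr
        return addr

    def children(self, addr: Address) -> List[Address]:
        # Follow first_child, then next_sibling links, to list related modules.
        result = []
        cur = self.modules[addr].first_child
        while cur is not None:
            result.append(cur)
            cur = self.modules[cur].next_sibling
        return result


def split_words(db: SkeletalDatabase, sentence: Address) -> List[Address]:
    """Hypothetical single-task processor: reads a sentence module from the
    database and returns its results (word modules) to the database."""
    text = db.modules[sentence].register or ""
    return [db.add(2, word, parent=sentence) for word in text.split()]


db = SkeletalDatabase()
s = db.add(1, "the cat sat")
words = split_words(db, s)
```

Because every result is written back as a linked module, a later processor could find the words of a sentence by following addresses rather than by re-parsing the text, which is the property the abstract attributes to the database.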


    Methods and apparatus for predicting prosody in speech synthesis
    2.
    Granted invention patent
    Status: in force

    Publication No.: US09286886B2

    Publication date: 2016-03-15

    Application No.: US13012740

    Filing date: 2011-01-24

    IPC classification: G10L13/08 G10L13/10

    CPC classification: G10L13/10 G10L13/08

    Abstract: Techniques for predicting prosody in speech synthesis may make use of a data set of example text fragments with corresponding aligned spoken audio. To predict prosody for synthesizing an input text, the input text may be compared with the data set of example text fragments to select a best matching sequence of one or more example text fragments, each example text fragment in the sequence being paired with a portion of the input text. The selected example text fragment sequence may be aligned with the input text, e.g., at the word level, such that prosody may be extracted from the audio aligned with the example text fragments, and the extracted prosody may be applied to the synthesis of the input text using the alignment between the input text and the example text fragments.
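
The matching-and-transfer idea in this abstract can be sketched as follows. This is a simplified illustration, not the patented method: the abstract does not say how the best matching sequence is chosen, so a greedy longest exact word match is assumed here. The example data, the per-word `(pitch_hz, duration_s)` representation of prosody and the names `EXAMPLES`, `longest_match` and `predict_prosody` are all hypothetical, and the prosody values stand in for measurements that would be extracted from aligned audio.

```python
from typing import List, Optional, Sequence, Tuple

Prosody = Tuple[float, float]  # hypothetical (pitch_hz, duration_s) per word
Example = Tuple[List[str], List[Prosody]]

# Hypothetical example fragments: words plus prosody taken from aligned audio.
EXAMPLES: List[Example] = [
    (["good", "morning"], [(180.0, 0.30), (150.0, 0.45)]),
    (["how", "are", "you"], [(170.0, 0.20), (160.0, 0.18), (200.0, 0.35)]),
    (["morning"], [(140.0, 0.50)]),
]


def longest_match(words: Sequence[str], start: int,
                  examples: Sequence[Example]) -> Tuple[int, List[Prosody]]:
    """Find the longest run of example words equal to words[start:...].

    Returns the run length and the prosody of the matched example words.
    """
    best_len, best_pros = 0, []
    for ex_words, ex_pros in examples:
        for i in range(len(ex_words)):
            n = 0
            while (start + n < len(words) and i + n < len(ex_words)
                   and words[start + n] == ex_words[i + n]):
                n += 1
            if n > best_len:
                best_len, best_pros = n, ex_pros[i:i + n]
    return best_len, best_pros


def predict_prosody(text: str,
                    examples: Sequence[Example] = EXAMPLES
                    ) -> List[Tuple[str, Optional[Prosody]]]:
    """Pair each input word with prosody copied from a matched example word.

    The word-level alignment is trivial here because matches are exact;
    unmatched words get None so a synthesizer could fall back to a default.
    """
    words = text.lower().split()
    result: List[Tuple[str, Optional[Prosody]]] = []
    pos = 0
    while pos < len(words):
        n, pros = longest_match(words, pos, examples)
        if n == 0:
            result.append((words[pos], None))
            pos += 1
        else:
            result.extend(zip(words[pos:pos + n], pros))
            pos += n
    return result


contour = predict_prosody("Good morning how are you today")
```

Here "good morning" and "how are you" are each covered by one example fragment, while "today" has no match and is left without predicted prosody. A fuller system would presumably score inexact matches and smooth prosody across fragment boundaries, which this sketch does not attempt.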


    METHODS AND APPARATUS FOR PREDICTING PROSODY IN SPEECH SYNTHESIS
    3.
    Invention patent application
    Status: in force

    Publication No.: US20120191457A1

    Publication date: 2012-07-26

    Application No.: US13012740

    Filing date: 2011-01-24

    IPC classification: G10L13/08

    CPC classification: G10L13/10 G10L13/08

    Abstract: Techniques for predicting prosody in speech synthesis may make use of a data set of example text fragments with corresponding aligned spoken audio. To predict prosody for synthesizing an input text, the input text may be compared with the data set of example text fragments to select a best matching sequence of one or more example text fragments, each example text fragment in the sequence being paired with a portion of the input text. The selected example text fragment sequence may be aligned with the input text, e.g., at the word level, such that prosody may be extracted from the audio aligned with the example text fragments, and the extracted prosody may be applied to the synthesis of the input text using the alignment between the input text and the example text fragments.
