Methods and apparatus for rapid acoustic unit selection from a large speech corpus
    Invention application
    Legal status: In force

    Publication number: US20030115049A1

    Publication date: 2003-06-19

    Application number: US10359171

    Filing date: 2003-02-06

    Applicant: AT&T CORP.

    CPC classification number: G10L13/07

    Abstract: A speech synthesis system can select recorded speech fragments, or acoustic units, from a very large database of acoustic units to produce artificial speech. The acoustic units are chosen to minimize a combination of target and concatenation costs for a given sentence. Because concatenation costs, which measure the mismatch between sequential pairs of acoustic units, are expensive to compute, processing can be greatly reduced by pre-computing and caching them. The number of possible sequential pairs of acoustic units, however, makes caching every cost prohibitive. Statistical experiments reveal that while about 85% of the acoustic units are typically used in common speech, less than 1% of the possible sequential pairs of acoustic units occur in practice. An efficient concatenation cost database can therefore be constructed by synthesizing a large body of speech, identifying the acoustic unit sequential pairs generated and their respective concatenation costs, and storing those concatenation costs likely to occur. With a concatenation cost database built in this fashion, the processing power required at run-time is greatly reduced with negligible effect on speech quality.
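The caching idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `build_concat_cache` and `lookup_cost`, the use of integer unit IDs, and the fallback to a fixed default cost for pairs missing from the cache are all assumptions made for the example. The unit sequences are taken to come from synthesizing a large body of text offline.

```python
from typing import Callable, Dict, Iterable, List, Tuple

UnitId = int                      # hypothetical: acoustic units identified by integer
Pair = Tuple[UnitId, UnitId]      # a sequential pair (left unit, right unit)


def build_concat_cache(
    synthesized_paths: Iterable[List[UnitId]],
    concat_cost: Callable[[UnitId, UnitId], float],
) -> Dict[Pair, float]:
    """Collect concatenation costs only for pairs that occur in synthesized speech.

    `synthesized_paths` holds the unit sequences produced while synthesizing
    a large text corpus offline; `concat_cost` is the expensive mismatch
    measure, evaluated once per distinct pair.
    """
    cache: Dict[Pair, float] = {}
    for path in synthesized_paths:
        # Walk over every adjacent (left, right) pair in the sequence.
        for left, right in zip(path, path[1:]):
            if (left, right) not in cache:
                cache[(left, right)] = concat_cost(left, right)
    return cache


def lookup_cost(
    cache: Dict[Pair, float],
    left: UnitId,
    right: UnitId,
    default_cost: float,
) -> float:
    """Return the cached cost at run-time, or a default for unseen pairs.

    The default-cost fallback is an assumption of this sketch.
    """
    return cache.get((left, right), default_cost)
```

Because only the pairs observed during offline synthesis are stored, the cache stays far smaller than a table over all possible pairs, and the run-time cost of a join reduces to a dictionary lookup.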

