Pitch shift method with conserved timbre
    21.
    Granted invention patent
    Status: Expired

    Publication No.: US5872727A

    Publication date: 1999-02-16

    Application No.: US752014

    Filing date: 1996-11-19

    Applicant: Chih-Chung Kuo

    Inventor: Chih-Chung Kuo

    Abstract: An improved method for shifting the pitch of a tone is disclosed. It comprises: (a) subjecting a digitized original waveform to a whitening process using an all-zero filter (AZF) to obtain a whitened waveform; (b) resampling the whitened waveform at a desired scaling ratio to obtain a scaled and whitened waveform; and (c) subjecting the scaled and whitened waveform to a coloring process using an all-pole filter (APF) to obtain a synthesized waveform. In a preferred embodiment, the all-zero filter performs the transformation function of [equation not reproduced in the source] and the all-pole filter performs the transformation function of [equation not reproduced in the source], wherein the a_i and b_i are linear predictive coefficients. The whitened waveforms can be compressed and stored as wavetables, which can subsequently be retrieved and decompressed before resampling.
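The three steps in this abstract (whiten, resample, color) can be sketched as follows. This is a minimal illustration, not the patented implementation: the filter equations are missing from the source, so the sketch assumes the conventional linear-prediction forms, estimates the coefficients with a Levinson-Durbin recursion, uses linear interpolation for resampling, and reuses the whitening coefficients for coloring (b_i = a_i). All function names are hypothetical.

```python
import math
import random


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[t] * x[t + k] for t in range(n - k)) for k in range(max_lag + 1)]


def lpc(x, order):
    """Levinson-Durbin recursion; returns coefficients a_1..a_order such that
    x[n] is predicted by sum_i a_i * x[n - i]."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def whiten(x, a):
    """All-zero (FIR) filter: e[n] = x[n] - sum_i a_i * x[n - i]."""
    p = len(a)
    return [x[n] - sum(a[i] * x[n - 1 - i] for i in range(min(p, n)))
            for n in range(len(x))]


def color(e, b):
    """All-pole (IIR) filter: y[n] = e[n] + sum_i b_i * y[n - i]."""
    y = []
    for n in range(len(e)):
        y.append(e[n] + sum(b[i] * y[n - 1 - i] for i in range(min(len(b), n))))
    return y


def resample(x, ratio):
    """Linear-interpolation resampling: reads x in steps of `ratio` samples,
    so ratio > 1 raises the pitch and shortens the output."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    out = []
    pos = 0.0
    while pos <= len(x) - 1:
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
        pos += ratio
    return out


def pitch_shift(x, ratio, order=8):
    """Whiten, resample, then re-apply the original spectral envelope."""
    a = lpc(x, order)
    residual = whiten(x, a)
    scaled = resample(residual, ratio)
    return color(scaled, a)  # assumption: b_i = a_i


def demo_tone(n, seed=0):
    """Two harmonics plus a little noise, used only as a test signal."""
    rng = random.Random(seed)
    return [math.sin(0.2 * t) + 0.5 * math.sin(0.4 * t)
            + 0.05 * rng.uniform(-1.0, 1.0) for t in range(n)]
```

Because only the flattened residual is resampled, the spectral envelope restored by `color` stays where it was in the original tone, which is the sense in which the timbre is conserved. With zero initial state, `color(whiten(x, a), a)` reproduces `x` up to rounding error.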


    MULTI-LINGUAL TEXT-TO-SPEECH SYSTEM AND METHOD
    22.
    Invention application
    Status: In force

    Publication No.: US20120173241A1

    Publication date: 2012-07-05

    Application No.: US13217919

    Filing date: 2011-08-25

    IPC classification: G10L13/08

    CPC classification: G10L13/086 G10L13/10

    Abstract: A multi-lingual text-to-speech system and method processes a text to be synthesized via an acoustic-prosodic model selection module and an acoustic-prosodic model mergence module, and obtains a phonetic unit transformation table. In an online phase, the acoustic-prosodic model selection module, according to the text and a phonetic unit transcription corresponding to the text, uses at least one controllable accent weighting parameter to select a transformation combination and find a first and a second acoustic-prosodic model. The acoustic-prosodic model mergence module merges the two acoustic-prosodic models into a merged acoustic-prosodic model according to the at least one controllable accent weighting parameter, processes all transformations in the transformation combination, and generates a merged acoustic-prosodic model sequence. A speech synthesizer and the merged acoustic-prosodic model sequence are further applied to synthesize the text into L1-accented L2 speech.
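The merging step can be illustrated with a small sketch. The abstract does not say how the two models are combined, so this assumes each acoustic-prosodic model is a diagonal Gaussian (mean and variance vectors) and that merging is a linear interpolation controlled by a single accent weight `w` in [0, 1]. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class AcousticProsodicModel:
    """Hypothetical stand-in: one diagonal Gaussian per phonetic unit."""
    mean: List[float]
    variance: List[float]


def merge_models(first: AcousticProsodicModel,
                 second: AcousticProsodicModel,
                 w: float) -> AcousticProsodicModel:
    """Interpolate two models; w = 1 returns `first`, w = 0 returns `second`."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("accent weight must lie in [0, 1]")
    if len(first.mean) != len(second.mean):
        raise ValueError("models must have the same dimensionality")
    return AcousticProsodicModel(
        mean=[w * m1 + (1.0 - w) * m2 for m1, m2 in zip(first.mean, second.mean)],
        variance=[w * v1 + (1.0 - w) * v2
                  for v1, v2 in zip(first.variance, second.variance)],
    )


def merge_sequence(pairs: Sequence[Tuple[AcousticProsodicModel, AcousticProsodicModel]],
                   w: float) -> List[AcousticProsodicModel]:
    """Merge every (first, second) pair in a transformation combination."""
    return [merge_models(a, b, w) for a, b in pairs]
```

Under these assumptions, moving `w` between 0 and 1 shifts the synthesized units gradually from one model toward the other, which is one plausible reading of a "controllable accent weighting parameter".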


    Speech synthesizer generating system and method thereof
    23.
    Granted invention patent
    Status: In force

    Publication No.: US08055501B2

    Publication date: 2011-11-08

    Application No.: US11875944

    Filing date: 2007-10-21

    IPC classification: G10L13/00

    CPC classification: G10L13/047

    Abstract: A speech synthesizer generating system and a method thereof are provided. A speech synthesizer generator in the speech synthesizer generating system automatically generates a speech synthesizer conforming to a speech output specification input by a user. In addition, a recording script is automatically generated by a recording script generator in the speech synthesizer generating system according to the speech output specification, and a customized or expanded speech material is recorded according to the recording script. After the speech material is uploaded to the speech synthesizer generating system, the speech synthesizer generator automatically generates a speech synthesizer conforming to the speech output specification. The speech synthesizer then synthesizes and outputs a speech output at a user end.


    Method for speech quality degradation estimation and method for degradation measures calculation and apparatuses thereof
    24.
    Granted invention patent
    Status: In force

    Publication No.: US07801725B2

    Publication date: 2010-09-21

    Application No.: US11427777

    Filing date: 2006-06-29

    IPC classification: G10L11/04 G10L13/06

    CPC classification: G10L25/69

    Abstract: A method for speech quality degradation estimation, a method for degradation measures calculation, and apparatuses thereof are provided. The first method estimates the speech quality of a speech signal modified by a pitch-synchronous prosody modification method and comprises the following steps. First, at least one source pitchmark is extracted from the speech signal and mapped to at least one target pitchmark. Then, at least one degradation measure is calculated based on the mapping between the source and target pitchmarks. The degradation measures include several weighted pitch-related functions and duration-related functions, where the weighting functions can be calculated based on the speech signal or on the pitchmark mapping mentioned above.
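A rough sketch of the idea follows: map target pitchmarks to source pitchmarks, then score the mapping with one pitch-related and one duration-related term. The abstract does not give the actual functions or weights, so everything here is an assumption chosen for illustration: the nearest-neighbor mapping in time-scaled coordinates, the log-ratio pitch term, the repeat/skip duration term, and the constant weights `w_pitch` and `w_dur`.

```python
import math


def periods(marks):
    """Pitch periods: gaps between consecutive pitchmarks."""
    return [b - a for a, b in zip(marks, marks[1:])]


def map_pitchmarks(source, target):
    """For each target pitchmark, return the index of the nearest source
    pitchmark after scaling the target time axis onto the source span."""
    src_span = source[-1] - source[0]
    tgt_span = target[-1] - target[0]
    mapping = []
    for t in target:
        pos = source[0] + (t - target[0]) / tgt_span * src_span
        mapping.append(min(range(len(source)), key=lambda i: abs(source[i] - pos)))
    return mapping


def pitch_measure(source, target, mapping):
    """Mean absolute pitch change in octaves between each target period
    and the source period it was mapped from."""
    src_p = periods(source)
    tgt_p = periods(target)
    total = 0.0
    for j, tp in enumerate(tgt_p):
        i = min(mapping[j], len(src_p) - 1)
        total += abs(math.log2(tp / src_p[i]))
    return total / len(tgt_p)


def duration_measure(source, mapping):
    """Fraction of source pitchmarks that are repeated or skipped."""
    repeated = sum(1 for a, b in zip(mapping, mapping[1:]) if a == b)
    skipped = len(source) - len(set(mapping))
    return (repeated + skipped) / len(source)


def degradation(source, target, w_pitch=0.5, w_dur=0.5):
    """Weighted sum of the two hypothetical measures; 0 means no modification."""
    mapping = map_pitchmarks(source, target)
    return (w_pitch * pitch_measure(source, target, mapping)
            + w_dur * duration_measure(source, mapping))
```

The point of such a measure is that it needs only the pitchmark mapping, so the expected quality loss can be estimated before any waveform is actually modified.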


    Method for generating text script of high efficiency
    25.
    Granted invention patent
    Status: In force

    Publication No.: US07447625B2

    Publication date: 2008-11-04

    Application No.: US10384938

    Filing date: 2003-03-10

    IPC classification: G06F17/27

    CPC classification: G10L13/08

    Abstract: This proposal presents performance indices and search criteria for text script generation in the design of corpus-based TTS systems. Based on these criteria, a new search method is presented that solves the text selection problem more systematically and efficiently than previous research, which concentrated on either covering rate or hit rate alone. By controlling a weighting factor, the covering rate of unit types can be increased to improve the robustness of the TTS system. Finally, the scalable and controllable design of the multi-stage search can produce various kinds of text scripts suited to the requirements of various corpus-based TTS systems.
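The trade-off between covering rate and hit rate can be illustrated with a greedy selection sketch. It is not the multi-stage search of the patent, whose details the abstract does not give. The definitions used here are assumptions: covering rate is the fraction of distinct unit types covered, hit rate is the fraction of unit occurrences in the candidate pool whose type is covered, and `w` is a hypothetical weighting factor between the two.

```python
from collections import Counter


def covering_rate(covered, all_types):
    """Fraction of distinct unit types that the script covers."""
    return len(covered & all_types) / len(all_types)


def hit_rate(covered, freq):
    """Fraction of unit occurrences whose type is covered."""
    return sum(freq[u] for u in covered) / sum(freq.values())


def select_script(sentences, budget, w=0.5):
    """Greedily pick up to `budget` sentences (lists of unit labels).

    Each step picks the sentence with the largest weighted gain:
    w * (gain in covering rate) + (1 - w) * (gain in hit rate).
    Returns (indices of chosen sentences, set of covered unit types).
    """
    freq = Counter(u for s in sentences for u in s)
    all_types = set(freq)
    total = sum(freq.values())
    covered = set()
    chosen = []
    remaining = list(range(len(sentences)))

    def gain(idx):
        new = set(sentences[idx]) - covered
        return (w * len(new) / len(all_types)
                + (1.0 - w) * sum(freq[u] for u in new) / total)

    while remaining and len(chosen) < budget:
        best = max(remaining, key=gain)
        if gain(best) == 0.0:
            break  # nothing new left to cover
        chosen.append(best)
        covered |= set(sentences[best])
        remaining.remove(best)
    return chosen, covered
```

Raising `w` favors sentences that bring in rare unit types (robustness), while lowering it favors sentences rich in frequent units.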


    Method of speech segment selection for concatenative synthesis based on prosody-aligned distance measure
    26.
    Granted invention patent
    Status: In force

    Publication No.: US07315813B2

    Publication date: 2008-01-01

    Application No.: US10206213

    Filing date: 2002-07-29

    IPC classification: G10L11/04

    CPC classification: G10L13/07 G10L13/04

    Abstract: A method of speech segment selection for concatenative synthesis based on a prosody-aligned distance measure is disclosed. The method is based on comparing speech segments segmented from a speech corpus, wherein the speech segments are fully prosody-aligned to each other before distortion is measured. With prosody alignment embedded in the selection process, distortion resulting from possible prosody modification in synthesis can be taken into account objectively in the selection phase. To this end, automatic segmentation, pitch marking, and the PSOLA method work together to perform prosody alignment. Two distortion measures, MFCC and PSQM, are used to compare two prosody-aligned speech segments because they reflect human perceptual considerations.
