Method and apparatus of generating text script for a corpus-based text-to speech system
    2.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08175865B2

    Publication date: 2012-05-08

    Application No.: US11956336

    Filing date: 2007-12-14

    IPC classification: G06F17/27

    CPC classification: G10L13/08

    Abstract: A method of text script generation for a corpus-based text-to-speech system includes: searching a source corpus having L sentences, selecting the N sentences with the best integrated efficiency as the N best cases, and setting the iteration index k to 1; for each case n of the N best cases, selecting the Mk+1 best sentences with the best integrated efficiency from the unselected sentences in the source corpus; keeping the N best cases out of the total unselected sentences for the next iteration, and increasing k by 1; and, if a termination criterion is reached, setting the best of the N traced cases as the text script, otherwise returning to the (k+1)th iteration to search the unselected sentences for the (k+1)th sentence; wherein the best integrated efficiency depends on a function combining the covering rate of the synthesis unit type, the hit rate of the synthesis unit type, and the text script size.
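
The iterative N-best search described in this abstract resembles a beam search, and a minimal sketch under stated assumptions may make the loop easier to follow. The scoring function `integrated_efficiency`, its `size_weight`, the constant `m_best` (the abstract lets M vary per iteration), and the fixed-size termination criterion `max_sentences` are all hypothetical; the patent's actual efficiency function is not given in this listing.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Sequence, Tuple

Sentence = Tuple[str, ...]  # one sentence as a sequence of synthesis-unit labels


def integrated_efficiency(chosen: FrozenSet[int], corpus: Sequence[Sentence],
                          type_freq: Dict[str, int], size_weight: float = 0.5) -> float:
    """Hypothetical score: covering rate + hit rate - size penalty (an assumption)."""
    total_tokens = sum(type_freq.values())
    covered = {unit for i in chosen for unit in corpus[i]}
    script_size = sum(len(corpus[i]) for i in chosen)
    covering_rate = len(covered) / len(type_freq)                    # share of unit types covered
    hit_rate = sum(type_freq[u] for u in covered) / total_tokens     # frequency-weighted coverage
    return covering_rate + hit_rate - size_weight * script_size / total_tokens


def generate_script(corpus: Sequence[Sentence], n_best: int = 2, m_best: int = 2,
                    max_sentences: int = 3) -> List[int]:
    """Return indices of the selected sentences, keeping N cases per iteration."""
    if not corpus:
        return []
    type_freq = Counter(unit for sentence in corpus for unit in sentence)

    def score(case: FrozenSet[int]) -> float:
        return integrated_efficiency(case, corpus, type_freq)

    # Iteration k = 1: keep the N best single-sentence cases.
    singles = (frozenset([i]) for i in range(len(corpus)))
    cases = sorted(singles, key=score, reverse=True)[:n_best]

    # Assumed termination criterion: stop once the script holds max_sentences.
    for _ in range(max_sentences - 1):
        candidates = set()
        for case in cases:
            # Extend each traced case with its M best unselected sentences.
            expansions = [case | {i} for i in range(len(corpus)) if i not in case]
            candidates.update(sorted(expansions, key=score, reverse=True)[:m_best])
        if not candidates:
            break
        cases = sorted(candidates, key=score, reverse=True)[:n_best]

    return sorted(max(cases, key=score))


corpus = [("a", "b", "c"), ("a", "b"), ("d", "e"), ("c", "d")]
print(generate_script(corpus, n_best=2, m_best=2, max_sentences=2))
```

Keeping N cases rather than one lets the search recover from a locally attractive first sentence, at the cost of scoring roughly N times M candidates per iteration.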

    Automatic speech segmentation and verification using segment confidence measures
    3.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07472066B2

    Publication date: 2008-12-30

    Application No.: US10782955

    Filing date: 2004-02-23

    IPC classification: G10L13/06 G10L13/00

    CPC classification: G10L15/04 G10L13/06

    Abstract: An automatic speech segmentation and verification system and method are disclosed, which have a known text script and a recorded speech corpus corresponding to the known text script. A speech unit segmentor segments the recorded speech corpus into N test speech unit segments by referring to the phonetic information of the known text script. Then, a segmental verifier is applied to obtain a confidence measure of syllable segmentation for verifying the correctness of the cutting points of the test speech unit segments. A phonetic verifier obtains a confidence measure of syllable verification by using verification models to verify whether the recorded speech corpus is correctly recorded. Finally, a speech unit inspector integrates the confidence measure of syllable segmentation and the confidence measure of syllable verification to determine whether each test speech unit segment is accepted.
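
The final inspection step, which integrates the two confidence measures into an accept or reject decision, can be illustrated with a small sketch. The abstract does not say how the measures are integrated, so the weighted average, the `seg_weight` parameter, and the `threshold` value below are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentScores:
    segmentation_confidence: float  # from the segmental verifier, assumed in [0, 1]
    verification_confidence: float  # from the phonetic verifier, assumed in [0, 1]


def accept_segment(scores: SegmentScores, seg_weight: float = 0.5,
                   threshold: float = 0.6) -> bool:
    """Hypothetical speech unit inspector: weighted average against a threshold."""
    combined = (seg_weight * scores.segmentation_confidence
                + (1.0 - seg_weight) * scores.verification_confidence)
    return combined >= threshold


print(accept_segment(SegmentScores(0.9, 0.8)))  # well-cut, correctly read segment
print(accept_segment(SegmentScores(0.9, 0.1)))  # well-cut but likely misread segment
```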

    METHOD FOR SPEECH QUALITY DEGRADATION ESTIMATION AND METHOD FOR DEGRADATION MEASURES CALCULATION AND APPARATUSES THEREOF
    4.
    Type: Patent application
    Legal status: In force

    Publication No.: US20070233469A1

    Publication date: 2007-10-04

    Application No.: US11427777

    Filing date: 2006-06-29

    IPC classification: G10L11/04

    CPC classification: G10L25/69

    Abstract: A method for speech quality degradation estimation, a method for degradation measures calculation, and the apparatuses thereof are provided. The first method estimates the speech quality of a speech signal that is modified by a pitch-synchronous prosody modification method, and comprises the following steps. First, at least one source pitchmark is extracted from the speech signal, and the source pitchmark(s) are then mapped to at least one target pitchmark. Finally, at least one degradation measure is calculated based on the mapping between the source and the target pitchmarks. The degradation measures include several weighted pitch-related functions and duration-related functions, where the weighting functions can be calculated based on the speech signal or the pitchmark mapping mentioned above.
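
As a rough illustration of the steps in this abstract, the sketch below maps target pitchmarks back to source pitchmarks and derives one pitch-related and one duration-related measure. The nearest-neighbour mapping, the log-ratio pitch measure, and the repeat/skip duration measure are hypothetical stand-ins, and they are unweighted; the patent's actual weighted functions are not given in this listing.

```python
import math
from typing import List, Sequence


def map_pitchmarks(source: Sequence[float], target: Sequence[float],
                   duration_scale: float = 1.0) -> List[int]:
    """Map each target pitchmark to the nearest source pitchmark (assumed rule)."""
    mapping = []
    for t in target:
        t_src = t / duration_scale  # undo the time scaling before comparing
        mapping.append(min(range(len(source)), key=lambda i: abs(source[i] - t_src)))
    return mapping


def pitch_degradation(source: Sequence[float], target: Sequence[float],
                      mapping: Sequence[int]) -> float:
    """Mean absolute pitch-period change, in octaves, over mapped pitchmarks."""
    total, count = 0.0, 0
    for j in range(1, len(target)):
        i = mapping[j]
        if i == 0:
            continue  # the first source mark has no preceding period
        src_period = source[i] - source[i - 1]
        tgt_period = target[j] - target[j - 1]
        total += abs(math.log2(tgt_period / src_period))
        count += 1
    return total / count if count else 0.0


def duration_degradation(source: Sequence[float], mapping: Sequence[int]) -> float:
    """Share of source pitchmarks that are repeated or skipped by the mapping."""
    used = set(mapping)
    repeats = len(mapping) - len(used)
    skipped = len(source) - len(used)
    return (repeats + skipped) / len(source)


# Pitchmarks in samples: pitch raised one octave, duration unchanged.
source = [0, 100, 200, 300]
target = [0, 50, 100, 150, 200, 250, 300]
mapping = map_pitchmarks(source, target)
print(mapping, pitch_degradation(source, target, mapping), duration_degradation(source, mapping))
```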

    SYSTEM AND METHOD FOR PROVIDING MOBILE INFORMATION SERVER AND PORTABLE DEVICE THEREIN
    5.
    Type: Patent application
    Legal status: In force

    Publication No.: US20070180058A1

    Publication date: 2007-08-02

    Application No.: US11309059

    Filing date: 2006-06-15

    IPC classification: G06F15/16

    CPC classification: G06F17/30899

    Abstract: A system and method for providing mobile information, and a server and a portable device therein, are provided. The server comprises an intelligent download manager and the portable device comprises a file browse manager. The intelligent download manager determines a downloaded file update rule and a file browse rule according to any combination of a document attribute, a browse record, and a document preference. The files to be downloaded to the portable device can be determined automatically according to the downloaded file update rule. The file browse manager provides an intelligent browse mode related to the browse sequence of the downloaded files according to the file browse rule. The invention therefore allows information of genuine interest to the user to be stored in the limited space, and lets the user access that information quickly and efficiently.

    Pronunciation assessment method and system based on distinctive feature analysis
    6.
    Type: Patent application
    Legal status: In force

    Publication No.: US20060136225A1

    Publication date: 2006-06-22

    Application No.: US11157606

    Filing date: 2005-06-21

    IPC classification: G10L11/00

    CPC classification: G10L25/48 G10L2015/025

    Abstract: A method and system for pronunciation assessment based on distinctive feature analysis is provided. It evaluates a user's pronunciation with one or more distinctive feature (DF) assessors. It may further construct a phone assessor from DF assessors to evaluate a user's phone pronunciation, and even construct a continuous speech pronunciation assessor from phone assessors to obtain the final pronunciation score for a word or a sentence. Each DF assessor includes a feature extractor and a distinctive feature classifier, and can be realized differently depending on the characteristics of its distinctive feature. A score mapper may be included to standardize the output of each DF assessor. Each speech phone can be described as a “bundle” of DFs. The invention is a novel and qualitative solution based on the DFs of speech sounds for pronunciation assessment.
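
The layered structure in this abstract (feature extractor, DF classifier, and score mapper inside each DF assessor; DF assessors inside a phone assessor; phone assessors inside a continuous speech assessor) can be sketched as plain function composition. All function names, the averaging used to combine scores, and the toy voicing detector are assumptions for illustration and are not taken from the patent.

```python
from typing import Callable, Sequence

Frames = Sequence[float]
Assessor = Callable[[Frames], float]


def make_df_assessor(extract: Callable[[Frames], float],
                     classify: Callable[[float], float],
                     score_map: Callable[[float], float]) -> Assessor:
    """Chain a feature extractor, a DF classifier, and a score mapper."""
    def assess(frames: Frames) -> float:
        return score_map(classify(extract(frames)))
    return assess


def make_phone_assessor(df_assessors: Sequence[Assessor]) -> Assessor:
    """Score a phone as the mean of its DF scores (assumed combination rule)."""
    def assess(frames: Frames) -> float:
        return sum(a(frames) for a in df_assessors) / len(df_assessors)
    return assess


def utterance_score(phone_assessors: Sequence[Assessor],
                    phone_segments: Sequence[Frames]) -> float:
    """Score a word or sentence as the mean of its phone scores (assumed rule)."""
    scores = [assess(seg) for assess, seg in zip(phone_assessors, phone_segments)]
    return sum(scores) / len(scores)


# Toy DF assessor: mean absolute amplitude stands in for a real acoustic feature.
voicing = make_df_assessor(
    extract=lambda frames: sum(abs(v) for v in frames) / len(frames),
    classify=lambda energy: energy - 0.5,             # signed margin from a boundary
    score_map=lambda margin: 100.0 if margin > 0 else 0.0,
)
phone = make_phone_assessor([voicing])
print(utterance_score([phone, phone], [[1.0, 1.0], [0.1, 0.1]]))
```

Because each DF assessor is just a callable, one can be realized with a different extractor or classifier without touching the phone-level or utterance-level code.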

    Pitch shift method with conserved timbre
    7.
    Type: Granted patent
    Legal status: Expired

    Publication No.: US5872727A

    Publication date: 1999-02-16

    Application No.: US752014

    Filing date: 1996-11-19

    Applicant: Chih-Chung Kuo

    Inventor: Chih-Chung Kuo

    Abstract: An improved method for shifting the pitches of a tone is disclosed. It comprises: (a) subjecting a digitized original waveform to a whitening process using an all-zero filter (AZF) to obtain a whitened waveform; (b) resampling the whitened waveform at a desired scaling ratio to obtain a scaled and whitened waveform; and (c) subjecting the scaled and whitened waveform to a coloring process using an all-pole filter (APF) to obtain a synthesized waveform. In a preferred embodiment, the all-zero filter performs the transformation function [equation EQU1, not reproduced in source] and the all-pole filter performs the transformation function [equation EQU2, not reproduced in source], wherein the a_i and b_i are linear predictive coefficients. The whitened waveforms can be compressed and stored as wavetables, which can subsequently be retrieved and decompressed before resampling.
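
Steps (a) to (c) can be sketched in a few lines of plain Python. The abstract's transfer functions are not reproduced in this listing, so the sign convention used here (whitening subtracts the linear prediction, coloring adds it back) is an assumption, as are the function names and the use of linear interpolation for resampling. The coefficients are supplied by the caller; estimating them from the waveform is outside this sketch.

```python
from typing import List, Sequence


def whiten(x: Sequence[float], a: Sequence[float]) -> List[float]:
    """All-zero (FIR) filter: e[n] = x[n] - sum_i a[i] * x[n-1-i]."""
    out = []
    for n in range(len(x)):
        prediction = sum(a[i] * x[n - 1 - i] for i in range(len(a)) if n - 1 - i >= 0)
        out.append(x[n] - prediction)
    return out


def resample(x: Sequence[float], ratio: float) -> List[float]:
    """Read x at a step of `ratio` samples using linear interpolation."""
    if len(x) < 2:
        return list(x)
    out, pos = [], 0.0
    while pos <= len(x) - 1:
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1.0 - frac) * x[i] + frac * nxt)
        pos += ratio
    return out


def color(e: Sequence[float], b: Sequence[float]) -> List[float]:
    """All-pole (IIR) filter: y[n] = e[n] + sum_i b[i] * y[n-1-i]."""
    y: List[float] = []
    for n in range(len(e)):
        feedback = sum(b[i] * y[n - 1 - i] for i in range(len(b)) if n - 1 - i >= 0)
        y.append(e[n] + feedback)
    return y


def pitch_shift(x: Sequence[float], a: Sequence[float], b: Sequence[float],
                ratio: float) -> List[float]:
    """Whiten, resample at the scaling ratio, then re-apply the spectral envelope."""
    return color(resample(whiten(x, a), ratio), b)


x = [1.0, 0.5, -0.25, 0.75, 0.0, -1.0]
coeffs = [0.5, -0.2]
print(pitch_shift(x, coeffs, coeffs, 1.0))  # ratio 1 with matching filters reproduces x
```

Resampling the whitened residual rather than the original waveform is what keeps the timbre: the spectral envelope is removed before the pitch changes and restored afterwards at its original position.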

    MULTI-LINGUAL TEXT-TO-SPEECH SYSTEM AND METHOD
    8.
    Type: Patent application
    Legal status: In force

    Publication No.: US20120173241A1

    Publication date: 2012-07-05

    Application No.: US13217919

    Filing date: 2011-08-25

    IPC classification: G10L13/08

    CPC classification: G10L13/086 G10L13/10

    Abstract: A multi-lingual text-to-speech system and method processes a text to be synthesized via an acoustic-prosodic model selection module and an acoustic-prosodic model mergence module, and obtains a phonetic unit transformation table. In an online phase, the acoustic-prosodic model selection module, according to the text and a phonetic unit transcription corresponding to the text, uses at least one set controllable accent weighting parameter to select a transformation combination and find a second and a first acoustic-prosodic model. The acoustic-prosodic model mergence module merges the two acoustic-prosodic models into a merged acoustic-prosodic model according to the at least one controllable accent weighting parameter, processes all transformations in the transformation combination, and generates a merged acoustic-prosodic model sequence. A speech synthesizer and the merged acoustic-prosodic model sequence are further applied to synthesize the text into an L1-accent L2 speech.
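
One simple way to picture the mergence step is a linear interpolation between the parameters of the two models, controlled by the accent weight. The abstract does not specify the merging rule, so the interpolation, the flat parameter-vector representation, and the direction of the weight (1.0 selects the first model) are assumptions for illustration only.

```python
from typing import List, Sequence


def merge_models(first_model: Sequence[float], second_model: Sequence[float],
                 accent_weight: float) -> List[float]:
    """Hypothetical merge: weighted average of two models' parameter vectors."""
    if not 0.0 <= accent_weight <= 1.0:
        raise ValueError("accent_weight must lie in [0, 1]")
    if len(first_model) != len(second_model):
        raise ValueError("models must have the same number of parameters")
    return [accent_weight * f + (1.0 - accent_weight) * s
            for f, s in zip(first_model, second_model)]


def merge_sequence(pairs: Sequence[Sequence[Sequence[float]]],
                   accent_weight: float) -> List[List[float]]:
    """Merge every (first, second) model pair in a transformation combination."""
    return [merge_models(first, second, accent_weight) for first, second in pairs]


print(merge_models([1.0, 2.0], [3.0, 6.0], 0.5))
```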

    Speech synthesizer generating system and method thereof
    9.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08055501B2

    Publication date: 2011-11-08

    Application No.: US11875944

    Filing date: 2007-10-21

    IPC classification: G10L13/00

    CPC classification: G10L13/047

    Abstract: A speech synthesizer generating system and a method thereof are provided. A speech synthesizer generator in the speech synthesizer generating system automatically generates a speech synthesizer conforming to a speech output specification input by a user. In addition, a recording script is automatically generated by a recording script generator in the speech synthesizer generating system according to the speech output specification, and a customized or expanded speech material is recorded according to the recording script. After the speech material is uploaded to the speech synthesizer generating system, the speech synthesizer generator automatically generates a speech synthesizer conforming to the speech output specification. The speech synthesizer then synthesizes and outputs a speech output at a user end.

    Method for speech quality degradation estimation and method for degradation measures calculation and apparatuses thereof
    10.
    Type: Granted patent
    Legal status: In force

    Publication No.: US07801725B2

    Publication date: 2010-09-21

    Application No.: US11427777

    Filing date: 2006-06-29

    IPC classification: G10L11/04 G10L13/06

    CPC classification: G10L25/69

    Abstract: A method for speech quality degradation estimation, a method for degradation measures calculation, and the apparatuses thereof are provided. The first method estimates the speech quality of a speech signal that is modified by a pitch-synchronous prosody modification method, and comprises the following steps. First, at least one source pitchmark is extracted from the speech signal, and the source pitchmark(s) are then mapped to at least one target pitchmark. Finally, at least one degradation measure is calculated based on the mapping between the source and the target pitchmarks. The degradation measures include several weighted pitch-related functions and duration-related functions, where the weighting functions can be calculated based on the speech signal or the pitchmark mapping mentioned above.
