METHOD AND APPARATUS FOR SYNTHESIZING A SPEECH WITH INFORMATION
    1.
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20110166861A1

    Publication date: 2011-07-07

    Application number: US12888655

    Filing date: 2010-09-23

    IPC classification: G10L13/08

    CPC classification: G10L13/02, G10L19/018

    Abstract: According to one embodiment, an apparatus for synthesizing a speech, comprises an inputting unit configured to input a text sentence, a text analysis unit configured to analyze the text sentence so as to extract linguistic information, a parameter generation unit configured to generate a speech parameter by using the linguistic information and a pre-trained statistical parameter model, an embedding unit configured to embed information into the speech parameter, and a speech synthesis unit configured to synthesize the speech parameter with the information embedded by the embedding unit into a speech with the information.

    METHOD AND APPARATUS FOR FUSING VOICED PHONEME UNITS IN TEXT-TO-SPEECH
    2.
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20110320199A1

    Publication date: 2011-12-29

    Application number: US13183667

    Filing date: 2011-07-15

    Applicant: Jian Luan, Jian Li

    Inventor: Jian Luan, Jian Li

    IPC classification: G10L15/26

    CPC classification: G10L13/06

    Abstract: According to one embodiment, an apparatus for fusing voiced phoneme units in Text-To-Speech, includes a reference unit selection module configured to select a reference unit from the plurality of units based on pitch cycle information of the each unit and the number of pitch cycles of the target segment. The apparatus includes a template creation module configured to create a template based on the reference unit selected by the reference unit selection module and the number of pitch cycles of the target segment, wherein the number of pitch cycles of the template is same with that of pitch cycles of the target segment. The apparatus includes a pitch cycle alignment module configured to align pitch cycles of each unit of the plurality of units except the reference unit with pitch cycles of the template by using a dynamic programming algorithm.

    Method and apparatus for verification of speaker authentication
    3.
    Type: Granted invention patent
    Status: In force

    Publication number: US07809561B2

    Publication date: 2010-10-05

    Application number: US11692470

    Filing date: 2007-03-28

    Applicant: Jian Luan, Jie Hao

    Inventor: Jian Luan, Jie Hao

    IPC classification: G10L15/12

    CPC classification: G10L17/08, G10L17/24

    Abstract: The present invention provides a method and apparatus for verification of speaker authentication. A method for verification of speaker authentication, comprising: inputting an utterance containing a password that is spoken by a speaker; extracting an acoustic feature vector sequence from said inputted utterance; DTW-matching said extracted acoustic feature vector sequence and a speaker template enrolled by an enrolled speaker; calculating each of a plurality of local distances between said DTW-matched acoustic feature vector sequence and said speaker template; nonlinear-transforming said each local distance calculated to give more weights on small local distances; calculating a DTW-matching score based on said plurality of local distances nonlinear-transformed; and comparing said matching score with a predefined discriminating threshold to determine whether said inputted utterance is an utterance containing a password spoken by the enrolled speaker.
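
The verification steps in this abstract (DTW matching, local distances along the matched path, a nonlinear transform, a score, a threshold test) can be illustrated with a minimal Python sketch. The function names, the Euclidean local distance, the square-root transform, and averaging over the path are all assumptions made for illustration; the abstract does not specify them. A concave transform such as the square root compresses large distances, which is one plausible way to let small local distances carry relatively more weight.

```python
import math

def euclidean(a, b):
    """Local distance between two feature vectors (assumed Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_path(seq, template):
    """Return the optimal DTW alignment as a list of (i, j) index pairs."""
    n, m = len(seq), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end of both sequences to recover the matched pairs.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path

def verify(utterance, template, threshold, transform=math.sqrt):
    """Score an utterance against a template; accept if the score is small enough."""
    path = dtw_path(utterance, template)
    local = [euclidean(utterance[i], template[j]) for i, j in path]
    # Hypothetical choice: average the transformed local distances.
    score = sum(transform(d) for d in local) / len(local)
    return score, score <= threshold
```

Calling `verify(features, enrolled_template, threshold=1.0)` returns the score and an accept/reject flag. In practice the threshold would have to be tuned on enrollment data, which this sketch does not cover.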

    METHOD AND APPARATUS FOR COMPRESSING A SPEAKER TEMPLATE, METHOD AND APPARATUS FOR MERGING A PLURALITY OF SPEAKER TEMPLATES, AND SPEAKER AUTHENTICATION
    4.
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20070129944A1

    Publication date: 2007-06-07

    Application number: US11550533

    Filing date: 2006-10-18

    Applicant: Jian Luan, Jie Hao

    Inventor: Jian Luan, Jie Hao

    IPC classification: G10L17/00

    CPC classification: G10L17/04

    Abstract: The present invention provides a method and apparatus for compressing a speaker template, a method and apparatus for merging a plurality of speaker templates, a method and apparatus for enrollment and verification of speaker authentication, a system for speaker authentication. Said method for compressing a speaker template that includes a plurality of feature vectors, comprising: designating a code to each of said plurality of feature vectors in said speaker template according to a codebook that includes a plurality of codes and their corresponding feature codes; and replacing a plurality of adjacent feature vectors designated with the same code in the speaker template with a feature vector.
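
The compression idea described here, assigning a codebook code to every feature vector and collapsing runs of adjacent vectors that share a code, might look like the following Python sketch. The nearest-neighbour code assignment and the use of the run's mean as the replacement vector are assumptions; the abstract only says that each run is replaced "with a feature vector". The names `nearest_code` and `compress_template` are hypothetical.

```python
from itertools import groupby

def nearest_code(vec, codebook):
    """Return the code whose codebook vector is closest to vec (squared Euclidean)."""
    return min(
        codebook,
        key=lambda c: sum((x - y) ** 2 for x, y in zip(vec, codebook[c])),
    )

def compress_template(template, codebook):
    """Collapse each run of adjacent vectors with the same code into one vector."""
    codes = [nearest_code(v, codebook) for v in template]
    compressed = []
    start = 0
    for _, group in groupby(codes):
        run = len(list(group))
        vectors = template[start:start + run]
        # Assumption: represent the run by the mean of its vectors.
        compressed.append([sum(col) / run for col in zip(*vectors)])
        start += run
    return compressed
```

With a two-entry codebook `{0: [0, 0], 1: [10, 10]}`, a five-vector template whose codes are `0, 0, 1, 1, 0` shrinks to three vectors, one per run.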

    DYNAMIC LONG-DISTANCE DEPENDENCY WITH CONDITIONAL RANDOM FIELDS
    5.
    Type: Invention application
    Status: In force

    Publication number: US20130262105A1

    Publication date: 2013-10-03

    Application number: US13433186

    Filing date: 2012-03-28

    IPC classification: G10L15/26

    Abstract: Dynamic features are utilized with CRFs to handle long-distance dependencies of output labels. The dynamic features present a probability distribution involved in explicit distance from/to a special output label that is pre-defined according to each application scenario. Besides the number of units in the segment (from the previous special output label to the current unit), the dynamic features may also include the sum of any basic features of units in the segment. Since the added dynamic features are involved in the distance from the previous specific label, the searching lattice associated with Viterbi searching is expanded to distinguish the nodes with various distances. The dynamic features may be used in a variety of different applications, such as Natural Language Processing, Text-To-Speech and Automatic Speech Recognition. For example, the dynamic features may be used to assist in prosodic break and pause prediction.
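
As a rough illustration of the two dynamic features named in this abstract (the number of units since the previous special label, and the sum of a basic feature over that segment), here is a small Python sketch. It assumes the label sequence is already known, as it would be for training data; the label name `"BREAK"`, the function name, and the choice of basic feature are hypothetical.

```python
def dynamic_features(labels, values, special="BREAK"):
    """For each unit, return (units in segment so far, sum of values in segment).

    A segment starts right after the previous special label and runs up to
    and including the current unit.
    """
    features = []
    count = 0
    total = 0
    for label, value in zip(labels, values):
        count += 1
        total += value
        features.append((count, total))
        if label == special:
            # The next unit starts a new segment.
            count = 0
            total = 0
    return features
```

For prosodic break prediction, `values` could hold, say, the syllable count of each word, so the feature captures how long the current phrase has grown. During decoding the previous labels are not fixed, which is presumably why the abstract describes expanding the Viterbi lattice so that nodes with different distances are kept apart; that expansion is not shown here.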

    METHOD AND APPARATUS FOR ENROLLMENT AND EVALUATION OF SPEAKER AUTHENTIFICATION
    6.
    Type: Invention application
    Status: In force

    Publication number: US20080082331A1

    Publication date: 2008-04-03

    Application number: US11859358

    Filing date: 2007-09-21

    Applicant: Jian Luan, Jie Hao

    Inventor: Jian Luan, Jie Hao

    IPC classification: G10L15/00

    CPC classification: G10L17/04

    Abstract: The present invention provides a method and apparatus for enrollment and evaluation of speaker authentication. The method for enrollment of speaker authentication, comprising: generating a plurality of acoustic feature vector sequences respectively based on a plurality of utterances of the same content spoken by a speaker; generating a reference template from said plurality of acoustic feature vector sequences; generating a corresponding pseudo-impostor feature vector sequence for each of said plurality of acoustic feature vector sequences based on a code book that includes a plurality of codes and their corresponding feature vectors; and selecting an optimal acoustic feature subset based on said plurality of acoustic feature vector sequences, said reference template and said plurality of pseudo-impostor feature vector sequences.

    Dynamic long-distance dependency with conditional random fields
    7.
    Type: Granted invention patent
    Status: In force

    Publication number: US09037460B2

    Publication date: 2015-05-19

    Application number: US13433186

    Filing date: 2012-03-28

    Abstract: Dynamic features are utilized with CRFs to handle long-distance dependencies of output labels. The dynamic features present a probability distribution involved in explicit distance from/to a special output label that is pre-defined according to each application scenario. Besides the number of units in the segment (from the previous special output label to the current unit), the dynamic features may also include the sum of any basic features of units in the segment. Since the added dynamic features are involved in the distance from the previous specific label, the searching lattice associated with Viterbi searching is expanded to distinguish the nodes with various distances. The dynamic features may be used in a variety of different applications, such as Natural Language Processing, Text-To-Speech and Automatic Speech Recognition. For example, the dynamic features may be used to assist in prosodic break and pause prediction.

    Method and apparatus for enrollment and verification of speaker authentication
    8.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07877254B2

    Publication date: 2011-01-25

    Application number: US11692397

    Filing date: 2007-03-28

    IPC classification: G10L17/00, G10L19/00, G10L15/06

    CPC classification: G10L17/04

    Abstract: The present invention provides a method and apparatus for enrollment and verification of speaker authentication. The method for enrollment of speaker authentication, comprising: extracting an acoustic feature vector sequence from an enrollment utterance of a speaker; and generating a speaker template using the acoustic feature vector sequence; wherein said step of extracting an acoustic feature vector sequence comprises: generating a filter-bank for the enrollment utterance of the speaker for filtering locations and energies of formants in the spectrum of the enrollment utterance based on the enrollment utterance; filtering the spectrum of the enrollment utterance by the generated filter-bank; and generating the acoustic feature vector sequence from the filtered enrollment utterance.

    METHOD AND APPARATUS FOR ESTIMATING DISCRIMINATING ABILITY OF A SPEECH, METHOD AND APPARATUS FOR ENROLLMENT AND EVALUATION OF SPEAKER AUTHENTICATION
    9.
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20070124145A1

    Publication date: 2007-05-31

    Application number: US11550525

    Filing date: 2006-10-18

    Applicant: Jian Luan, Jie Hao

    Inventor: Jian Luan, Jie Hao

    IPC classification: G10L15/04

    CPC classification: G10L17/04

    Abstract: The present invention provides a method and apparatus for enrollment and evaluation of speaker authentication, a method for estimating discriminating ability of a speech, and a system for speaker authentication. A method for enrollment of speaker authentication, comprising: inputting a speech containing a password that is spoken by a speaker; obtaining a phoneme sequence from said inputted speech; estimating discriminating ability of the phoneme sequence based on a discriminating ability table that includes a discriminating ability for each phoneme; setting a discriminating threshold for said speech; and generating a speech template for said speech.
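
The table lookup step in this abstract can be sketched in a few lines of Python. How the per-phoneme values are combined is not stated in the abstract, so summing them is purely an assumption, as are the function names, the table contents, and the minimum-strength check.

```python
def sequence_ability(phonemes, table, default=0.0):
    """Estimate the discriminating ability of a phoneme sequence.

    Assumption: the sequence score is the sum of per-phoneme table entries;
    phonemes missing from the table contribute `default`.
    """
    return sum(table.get(p, default) for p in phonemes)

def is_strong_enough(phonemes, table, minimum):
    """Hypothetical enrollment check: accept only sufficiently strong passwords."""
    return sequence_ability(phonemes, table) >= minimum
```

An enrollment front end could use such a check to ask the speaker for a longer or more distinctive password before a template is generated. The abstract's threshold-setting step is not modelled here.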

    METHOD AND APPARATUS FOR ENROLLMENT AND VERIFICATION OF SPEAKER AUTHENTICATION
    10.
    Type: Invention application
    Status: Lapsed

    Publication number: US20070239451A1

    Publication date: 2007-10-11

    Application number: US11692397

    Filing date: 2007-03-28

    IPC classification: G10L17/00

    CPC classification: G10L17/04

    Abstract: The present invention provides a method and apparatus for enrollment and verification of speaker authentication. The method for enrollment of speaker authentication, comprising: extracting an acoustic feature vector sequence from an enrollment utterance of a speaker; and generating a speaker template using the acoustic feature vector sequence; wherein said step of extracting an acoustic feature vector sequence comprises: generating a filter-bank for the enrollment utterance of the speaker for filtering locations and energies of formants in the spectrum of the enrollment utterance based on the enrollment utterance; filtering the spectrum of the enrollment utterance by the generated filter-bank; and generating the acoustic feature vector sequence from the filtered enrollment utterance.
