System and method of word lattice augmentation using a pre/post vocalic consonant distinction
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US08024191B2

    Publication date: 2011-09-20

    Application No.: US11930999

    Filing date: 2007-10-31

    IPC classification: G10L15/04

    CPC classification: G10L25/78 G10L15/02

    Abstract: Systems and methods are provided for recognizing speech in a spoken dialogue system. The method includes receiving input speech having a pre-vocalic consonant or a post-vocalic consonant, generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result, and distinguishing between the pre-vocalic consonant and the post-vocalic consonant in the input speech. A second score is calculated by measuring a similarity between the pre-vocalic consonant or the post-vocalic consonant in the input speech and the first score. At least one category is determined for the pre-vocalic match or mismatch or the post-vocalic match or mismatch by using the second score, and the results of an automated speech recognition (ASR) system are refined by using the at least one category for the pre-vocalic match or mismatch or the post-vocalic match or mismatch.

    SYSTEM AND METHOD OF WORD LATTICE AUGMENTATION USING A PRE/POST VOCALIC CONSONANT DISTINCTION
    2.
    Invention patent application
    Legal status: in force

    Publication No.: US20090112591A1

    Publication date: 2009-04-30

    Application No.: US11930999

    Filing date: 2007-10-31

    IPC classification: G10L15/00

    CPC classification: G10L25/78 G10L15/02

    Abstract: Disclosed are systems and methods for recognizing speech in a spoken dialogue system. The method includes (1) receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant, (2) generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result, (3) distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech, (4) calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score, (5) determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score, and (6) refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.

    SYSTEM AND METHOD OF USING ACOUSTIC MODELS FOR AUTOMATIC SPEECH RECOGNITION WHICH DISTINGUISH PRE- AND POST-VOCALIC CONSONANTS
    3.
    Invention patent application
    Legal status: in force

    Publication No.: US20090112594A1

    Publication date: 2009-04-30

    Application No.: US11930675

    Filing date: 2007-10-31

    IPC classification: G10L15/00

    CPC classification: G10L25/78 G10L15/02

    Abstract: Disclosed are systems, methods and computer readable media for training acoustic models for an automatic speech recognition (ASR) system. The method includes receiving a speech signal, defining at least one syllable boundary position in the received speech signal, based on the at least one syllable boundary position, generating for each consonant in a consonant phoneme inventory a pre-vocalic position label and a post-vocalic position label to expand the consonant phoneme inventory, reformulating a lexicon to reflect an expanded consonant phoneme inventory, and training a language model for an automated speech recognition (ASR) system based on the reformulated lexicon.
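
The inventory expansion and lexicon reformulation described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the ARPAbet-style vowel set, the `_pre`/`_post` suffixes, the function names, and the pre-syllabified toy lexicon are all assumptions made for illustration, and syllable boundaries are taken as given rather than detected in a speech signal.

```python
# Assumed ARPAbet-style vowel set; the abstract does not name a phoneme set.
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def label_syllable(phones):
    """Tag each consonant in one syllable as pre- or post-vocalic.

    Consonants before the first vowel get "_pre", consonants after it
    get "_post". The suffixes are a hypothetical labeling convention.
    """
    nucleus = next((i for i, p in enumerate(phones) if p in VOWELS), None)
    if nucleus is None:
        return list(phones)  # no vowel found: leave the syllable untouched
    labeled = []
    for i, p in enumerate(phones):
        if p in VOWELS:
            labeled.append(p)
        elif i < nucleus:
            labeled.append(p + "_pre")
        else:
            labeled.append(p + "_post")
    return labeled


def reformulate_lexicon(lexicon):
    """Rewrite a lexicon (word -> list of syllables) with expanded labels."""
    return {
        word: [q for syllable in syllables for q in label_syllable(syllable)]
        for word, syllables in lexicon.items()
    }


# Toy, hand-syllabified lexicon (illustrative data only).
lexicon = {
    "stop": [["S", "T", "AA", "P"]],
    "lattice": [["L", "AE"], ["T", "AH", "S"]],
}
new_lexicon = reformulate_lexicon(lexicon)
print(new_lexicon["stop"])     # ['S_pre', 'T_pre', 'AA', 'P_post']
print(new_lexicon["lattice"])  # ['L_pre', 'AE', 'T_pre', 'AH', 'S_post']
```

Under this convention each consonant yields two symbols, so a recognizer trained on the reformulated lexicon would model, for example, `T_pre` and `T_post` separately.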

    System and method of using acoustic models for automatic speech recognition which distinguish pre- and post-vocalic consonants
    4.
    Granted invention patent
    Legal status: in force

    Publication No.: US08015008B2

    Publication date: 2011-09-06

    Application No.: US11930675

    Filing date: 2007-10-31

    IPC classification: G10L15/04

    CPC classification: G10L25/78 G10L15/02

    Abstract: Disclosed are systems, methods and computer readable media for training acoustic models for an automatic speech recognition (ASR) system. The method includes receiving a speech signal, defining at least one syllable boundary position in the received speech signal, based on the at least one syllable boundary position, generating for each consonant in a consonant phoneme inventory a pre-vocalic position label and a post-vocalic position label to expand the consonant phoneme inventory, reformulating a lexicon to reflect an expanded consonant phoneme inventory, and training a language model for an automated speech recognition (ASR) system based on the reformulated lexicon.

    PHONETICALLY ENRICHED LABELING IN UNIT SELECTION SPEECH SYNTHESIS
    5.
    Invention patent application
    Legal status: under examination (published)

    Publication No.: US20080077407A1

    Publication date: 2008-03-27

    Application No.: US11535146

    Filing date: 2006-09-26

    IPC classification: G10L13/00

    CPC classification: G10L13/06 G10L13/08

    Abstract: A system, method and computer-readable media are disclosed for improving speech synthesis. A text-to-speech (TTS) voice database for use in a TTS system is generated by a method comprising labeling a voice database phonemically and applying a pre-/post-vocalic distinction to the phonemic labels to generate a TTS voice database. When a system synthesizes speech using speech units from the TTS voice database, the database provides phonemes for selection using the pre-/post-vocalic distinctions, which improve unit selection to render the synthetic speech more natural.
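
As a rough illustration of how a pre-/post-vocalic label could narrow unit selection, the Python sketch below filters candidate units by phoneme and position. The unit tuples, the position names, and the fallback rule are hypothetical; the abstract does not specify a database layout or a selection cost.

```python
# Hypothetical voice database: (phoneme, position label, unit id).
UNITS = [
    ("T", "pre", "u1"),
    ("T", "post", "u2"),
    ("T", "post", "u3"),
    ("AA", "vowel", "u4"),
]


def candidates(phoneme, position, units):
    """Return units matching both phoneme and position label.

    If no unit carries the requested position, fall back to any unit
    of that phoneme (an assumed policy, not taken from the abstract).
    """
    exact = [u for u in units if u[0] == phoneme and u[1] == position]
    return exact or [u for u in units if u[0] == phoneme]


print(candidates("T", "post", UNITS))  # [('T', 'post', 'u2'), ('T', 'post', 'u3')]
print(candidates("AA", "pre", UNITS))  # [('AA', 'vowel', 'u4')]
```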

    System and method for pronunciation modeling
    6.
    Granted invention patent
    Legal status: in force

    Publication No.: US08862470B2

    Publication date: 2014-10-14

    Application No.: US13302380

    Filing date: 2011-11-22

    IPC classification: G10L15/187 G10L15/183

    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
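
The idea of substituting a family of interchangeable alternatives for each phoneme can be sketched in Python as below. The families shown are invented examples rather than data from the patent, and `expand_pronunciation` is a hypothetical helper; note that one alternative is a string of two phonemes, as the abstract allows.

```python
from itertools import product

# Hypothetical families: generic phoneme -> interchangeable alternatives.
# Each alternative is a tuple, so it may be a string of several phonemes.
FAMILIES = {
    "T": [("T",), ("DX",)],
    "ER": [("ER",), ("AH", "R")],
}


def expand_pronunciation(phonemes, families):
    """Enumerate every pronunciation obtained by substituting families.

    Phonemes without a family are kept as a single-alternative slot.
    """
    slots = [families.get(p, [(p,)]) for p in phonemes]
    return [
        tuple(q for alternative in combo for q in alternative)
        for combo in product(*slots)
    ]


variants = expand_pronunciation(["W", "AO", "T", "ER"], FAMILIES)
for v in variants:
    print(" ".join(v))
# W AO T ER
# W AO T AH R
# W AO DX ER
# W AO DX AH R
```

Every variant is treated as the same underlying word pronunciation, which is the sense in which the family is labeled as referring to one phoneme.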

    SYSTEM AND METHOD FOR PRONUNCIATION MODELING
    7.
    Invention patent application
    Legal status: in force

    Publication No.: US20100145707A1

    Publication date: 2010-06-10

    Application No.: US12328407

    Filing date: 2008-12-04

    IPC classification: G10L13/06

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.

    System and method for speech personalization by need
    9.
    Granted invention patent
    Legal status: in force

    Publication No.: US09002713B2

    Publication date: 2015-04-07

    Application No.: US12480864

    Filing date: 2009-06-09

    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions. The method can further store a speaker personalization profile having information for the modified set of allocated resources and recognize speech associated with the speaker based on the speaker personalization profile.
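
The record-then-modify loop in this abstract could look roughly like the following Python sketch. The field names, thresholds, and doubling/halving policy are assumptions chosen for illustration; the abstract only states that resources are modified commensurate with the recorded metrics.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Resources:
    """Resources allocated to recognizing one speaker (assumed units)."""
    bandwidth_kbps: int
    cpu_ms: int
    memory_mb: int
    storage_mb: int


@dataclass(frozen=True)
class Metrics:
    """A subset of the metrics named in the abstract."""
    confidence: float     # mean recognition confidence, 0.0 to 1.0
    repeat_requests: int  # times the system asked the speaker to repeat


def adjust(resources, metrics, low=0.6, high=0.9):
    """Return a modified allocation under a hypothetical policy.

    Poor recognition doubles processor time and memory; consistently
    good recognition halves processor time; otherwise nothing changes.
    """
    if metrics.confidence < low or metrics.repeat_requests > 2:
        return replace(resources,
                       cpu_ms=resources.cpu_ms * 2,
                       memory_mb=resources.memory_mb * 2)
    if metrics.confidence > high and metrics.repeat_requests == 0:
        return replace(resources, cpu_ms=max(1, resources.cpu_ms // 2))
    return resources


base = Resources(bandwidth_kbps=64, cpu_ms=100, memory_mb=256, storage_mb=512)
struggling = adjust(base, Metrics(confidence=0.4, repeat_requests=3))
easy = adjust(base, Metrics(confidence=0.95, repeat_requests=0))
print(struggling.cpu_ms, struggling.memory_mb)  # 200 512
print(easy.cpu_ms)                              # 50
```

The returned allocation is what a speaker personalization profile might store for use in later sessions.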

    System and method for pronunciation modeling
    10.
    Granted invention patent
    Legal status: in force

    Publication No.: US08073693B2

    Publication date: 2011-12-06

    Application No.: US12328407

    Filing date: 2008-12-04

    IPC classification: G10L15/02

    Abstract: Systems, computer-implemented methods, and tangible computer-readable media for generating a pronunciation model. The method includes identifying a generic model of speech composed of phonemes, identifying a family of interchangeable phonemic alternatives for a phoneme in the generic model of speech, labeling the family of interchangeable phonemic alternatives as referring to the same phoneme, and generating a pronunciation model which substitutes each family for each respective phoneme. In one aspect, the generic model of speech is a vocal tract length normalized acoustic model. Interchangeable phonemic alternatives can represent a same phoneme for different dialectal classes. An interchangeable phonemic alternative can include a string of phonemes.
