Utilizing features generated from phonic units in speech recognition
    1.
    Type: Granted patent
    Status: In force

    Publication No.: US08401852B2

    Publication date: 2013-03-19

    Application No.: US12626943

    Filing date: 2009-11-30

    IPC classification: G10L15/04

    CPC classification: G10L15/10 G10L15/02

    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.
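
The selection and feature-generation steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `DetectedUnit` type, the phone labels, the rule that a unit must lie entirely inside the time-span, and the reading of the existence feature as "the unit occurs in the span" are all assumptions made for illustration. The expectation feature is omitted because the abstract does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedUnit:
    """Hypothetical detected unit: a label (e.g. a phone) with start/end times in seconds."""
    label: str
    start: float
    end: float


def select_units(units, span_start, span_end):
    """Return the labels of units lying entirely inside the time-span (assumed rule)."""
    return [u.label for u in units if u.start >= span_start and u.end <= span_end]


def existence_feature(labels, unit):
    """1 if the unit occurs among the selected labels, else 0 (assumed definition)."""
    return int(unit in labels)


def edit_distance(a, b):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution or match
        prev = cur
    return prev[-1]


# Illustrative data only: detected phones and an assumed pronunciation of "cat".
units = [
    DetectedUnit("k", 0.00, 0.08),
    DetectedUnit("ae", 0.08, 0.20),
    DetectedUnit("p", 0.20, 0.30),
    DetectedUnit("s", 0.35, 0.45),
]
selected = select_units(units, 0.0, 0.30)
expected_cat = ["k", "ae", "t"]
features = {
    "exists_ae": existence_feature(selected, "ae"),
    "edit_distance_cat": edit_distance(selected, expected_cat),
}
```

In a full system, features such as these would be passed to the statistical recognition model, which scores candidate words for the time-span.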

    FEATURES FOR UTILIZATION IN SPEECH RECOGNITION
    2.
    Type: Patent application
    Status: In force

    Publication No.: US20110131046A1

    Publication date: 2011-06-02

    Application No.: US12626943

    Filing date: 2009-11-30

    IPC classification: G10L15/04

    CPC classification: G10L15/10 G10L15/02

    Abstract: A computer-implemented speech recognition system described herein includes a receiver component that receives a plurality of detected units of an audio signal, wherein the audio signal comprises a speech utterance of an individual. A selector component selects a subset of the plurality of detected units that correspond to a particular time-span. A generator component generates at least one feature with respect to the particular time-span, wherein the at least one feature is one of an existence feature, an expectation feature, or an edit distance feature. Additionally, a statistical speech recognition model outputs at least one word that corresponds to the particular time-span based at least in part upon the at least one feature generated by the feature generator component.

    AUTOMATIC SPEECH RECOGNITION BASED UPON INFORMATION RETRIEVAL METHODS
    3.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110224982A1

    Publication date: 2011-09-15

    Application No.: US12722556

    Filing date: 2010-03-12

    IPC classification: G10L15/02

    CPC classification: G10L15/08 G10L2015/025

    Abstract: Described is a technology in which information retrieval (IR) techniques are used in an automatic speech recognition (ASR) system. Acoustic units (e.g., phones, syllables, multi-phone units, words and/or phrases) are decoded, and features are found from those acoustic units. The features are then used with IR techniques (e.g., TF-IDF based retrieval) to obtain a target output (a word or words). Also described is the use of IR techniques to provide a full large vocabulary continuous speech recognition (LVCSR) system.
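
As a rough illustration of TF-IDF based retrieval over decoded acoustic units, the sketch below treats each vocabulary word as a "document" of phone bigrams and retrieves the word most similar to a decoded phone sequence. The lexicon, the choice of bigrams as features, the smoothed IDF formula, and cosine scoring are all assumptions for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def ngrams(phones, n=2):
    """Phone n-grams joined into strings, used here as retrieval features (assumption)."""
    return [" ".join(phones[i:i + n]) for i in range(len(phones) - n + 1)]


# Hypothetical toy lexicon: word -> phone sequence.
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cap": ["k", "ae", "p"],
    "bat": ["b", "ae", "t"],
}


def build_index(lexicon):
    """Count features per word and compute a smoothed IDF per feature."""
    docs = {word: Counter(ngrams(phones)) for word, phones in lexicon.items()}
    doc_freq = Counter(f for counts in docs.values() for f in counts)
    n_docs = len(docs)
    idf = {f: math.log((1 + n_docs) / (1 + d)) + 1.0 for f, d in doc_freq.items()}
    return docs, idf


def tfidf(counts, idf):
    """TF-IDF weights; features unseen in the index are dropped."""
    return {f: c * idf[f] for f, c in counts.items() if f in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(decoded_phones, docs, idf):
    """Return the lexicon word whose TF-IDF vector best matches the decoded phones."""
    query = tfidf(Counter(ngrams(decoded_phones)), idf)
    return max(docs, key=lambda word: cosine(query, tfidf(docs[word], idf)))


docs, idf = build_index(LEXICON)
best_word = retrieve(["k", "ae", "t"], docs, idf)
```

Because unseen features are simply dropped from the query, a decoded sequence with a spurious extra phone can still retrieve a sensible word, which is one reason retrieval-style scoring is attractive for noisy decoder output.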

    Spoken utterance classification training for a speech recognition system
    4.
    Type: Granted patent
    Status: In force

    Publication No.: US09082403B2

    Publication date: 2015-07-14

    Application No.: US13326659

    Filing date: 2011-12-15

    IPC classification: G10L15/00 G10L15/18

    CPC classification: G10L15/1822

    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.
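
One possible reading of the pseudo-semantic labeling step is sketched below in Python: when the user denies a confirmation, the denied label is excluded and the highest-scoring remaining candidate is used as the training label. The selection rule, the label names, the scores, and the count-based model update are all hypothetical; the abstract only requires that the pseudo-label be consistent with the denial and drawn from the set of potential labels.

```python
from collections import defaultdict


def pseudo_label(scores, denied):
    """Pick the best-scoring label other than the denied one (assumed selection rule)."""
    candidates = {label: s for label, s in scores.items() if label != denied}
    if not candidates:
        raise ValueError("no label remains after excluding the denied one")
    return max(candidates, key=candidates.get)


def update_model(model, words, label):
    """Toy update: add the utterance's word counts to the chosen label."""
    for word in words:
        model[label][word] += 1


# Illustrative classifier scores for one utterance; the user denied "billing".
model = defaultdict(lambda: defaultdict(int))
scores = {"billing": 0.55, "support": 0.35, "sales": 0.10}
label = pseudo_label(scores, denied="billing")
update_model(model, ["my", "internet", "is", "down"], label)
```

The point of the sketch is that a denial still carries information: it removes one candidate, so the utterance can be used for training without a human-assigned label.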

    Spoken Utterance Classification Training for a Speech Recognition System
    5.
    Type: Patent application
    Status: In force

    Publication No.: US20130159000A1

    Publication date: 2013-06-20

    Application No.: US13326659

    Filing date: 2011-12-15

    IPC classification: G10L15/04

    CPC classification: G10L15/1822

    Abstract: The subject disclosure is directed towards training a classifier for spoken utterances without relying on human assistance. The spoken utterances may be related to a voice menu program for which a speech comprehension component interprets the spoken utterances into voice menu options. The speech comprehension component provides confirmations to some of the spoken utterances in order to accurately assign a semantic label. For each spoken utterance with a denied confirmation, the speech comprehension component automatically generates a pseudo-semantic label that is consistent with the denied confirmation and selected from a set of potential semantic labels, and updates a classification model associated with the classifier using the pseudo-semantic label.

    Using Utterance Classification in Telephony and Speech Recognition Applications
    6.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20110307252A1

    Publication date: 2011-12-15

    Application No.: US12815419

    Filing date: 2010-06-15

    IPC classification: G10L15/08

    CPC classification: G10L15/1822

    Abstract: Described is the use of utterance-classification-based methods and other machine learning techniques to provide a telephony application or other voice menu application (e.g., an automotive application) that need not use context-free grammars to determine a user's spoken intent. A classifier receives text from an information retrieval-based speech recognizer and outputs a semantic label corresponding to the likely intent of a user's speech. The semantic label is then output, such as for use by a voice menu program in branching between menus. Also described is training, including training the language model from acoustic data without transcriptions, and training the classifier from speech-recognized acoustic data having associated semantic labels.
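
The classify-then-branch flow can be sketched as follows, assuming a small multinomial Naive Bayes classifier with add-one smoothing. The abstract does not name a classifier type, so Naive Bayes is a stand-in, and the training utterances, label names, and menu identifiers are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect per-label word counts from (recognized text, semantic label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest smoothed log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data and menu table.
EXAMPLES = [
    ("i want to pay my bill", "billing"),
    ("question about my invoice", "billing"),
    ("my internet is down", "support"),
    ("the router is not working", "support"),
]
MENUS = {"billing": "billing_menu", "support": "support_menu"}

model = train(EXAMPLES)
label = classify("internet not working", model)
next_menu = MENUS[label]  # the voice menu program branches on the semantic label
```

No grammar is consulted at any point: the recognizer's text goes straight to the classifier, and the menu program only sees the resulting label.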
