Method for likelihood computation in multi-stream HMM based speech recognition
    1.
    Type: Granted patent
    Status: In force

    Publication No.: US07480617B2

    Publication date: 2009-01-20

    Application No.: US10946381

    Filing date: 2004-09-21

    IPC classification: G10L15/14

    CPC classification: G10L15/144

    Abstract: A method for speech recognition includes determining active Gaussians related to a first feature stream and a second feature stream by labeling at least one of the first and second streams, and determining active Gaussians co-occurring in the first stream and the second stream based upon joint probability. A number of Gaussians computed is reduced based upon Gaussians already computed for the first stream and a number of Gaussians co-occurring in the second stream. Speech is decoded based on the Gaussians computed for the first and second streams.
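The pruning idea in this abstract can be illustrated with a minimal sketch. It assumes a pre-estimated table of joint co-occurrence probabilities between first-stream and second-stream Gaussians and a fixed threshold; the function name, the table layout, and the threshold are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Set, Tuple


def select_second_stream_gaussians(
    active_first: Set[int],
    cooccurrence: Dict[Tuple[int, int], float],
    threshold: float,
) -> List[int]:
    """Return the second-stream Gaussian indices worth evaluating.

    cooccurrence[(i, j)] is an assumed joint probability that first-stream
    Gaussian i and second-stream Gaussian j are active together.
    """
    selected = set()
    for (i, j), p in cooccurrence.items():
        # Keep j only if it co-occurs often enough with an active Gaussian i.
        if i in active_first and p >= threshold:
            selected.add(j)
    return sorted(selected)


# Toy table (illustrative values only).
table = {
    (0, 0): 0.30, (0, 1): 0.02,
    (1, 2): 0.25, (1, 3): 0.01,
    (2, 1): 0.40,
}
active = {0, 1}  # Gaussians found active by labeling the first stream
shortlist = select_second_stream_gaussians(active, table, threshold=0.05)
print(shortlist)  # [0, 2]
```

Only the shortlisted second-stream Gaussians would then be evaluated, which is where the reduction in computed Gaussians comes from in this sketch.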

    SYSTEM AND METHOD FOR LIKELIHOOD COMPUTATION IN MULTI-STREAM HMM BASED SPEECH RECOGNITION
    2.
    Type: Patent application
    Status: In force

    Publication No.: US20080235015A1

    Publication date: 2008-09-25

    Application No.: US12131190

    Filing date: 2008-06-02

    IPC classification: G10L19/00

    CPC classification: G10L15/144

    Abstract: A system and method for speech recognition includes determining active Gaussians related to a first feature stream and a second feature stream by labeling at least one of the first and second streams, and determining active Gaussians co-occurring in the first stream and the second stream based upon joint probability. A number of Gaussians computed is reduced based upon Gaussians already computed for the first stream and a number of Gaussians co-occurring in the second stream. Speech is decoded based on the Gaussians computed for the first and second streams.

    System and method for likelihood computation in multi-stream HMM based speech recognition
    3.
    Type: Granted patent
    Status: In force

    Publication No.: US08121840B2

    Publication date: 2012-02-21

    Application No.: US12131190

    Filing date: 2008-06-02

    IPC classification: G10L15/14

    CPC classification: G10L15/144

    Abstract: A system and method for speech recognition includes determining active Gaussians related to a first feature stream and a second feature stream by labeling at least one of the first and second streams, and determining active Gaussians co-occurring in the first stream and the second stream based upon joint probability. A number of Gaussians computed is reduced based upon Gaussians already computed for the first stream and a number of Gaussians co-occurring in the second stream. Speech is decoded based on the Gaussians computed for the first and second streams.

    System and method for likelihood computation in multi-stream HMM based speech recognition
    4.
    Type: Patent application
    Status: In force

    Publication No.: US20060074654A1

    Publication date: 2006-04-06

    Application No.: US10946381

    Filing date: 2004-09-21

    IPC classification: G10L15/08

    CPC classification: G10L15/144

    Abstract: A system and method for speech recognition includes determining active Gaussians related to a first feature stream and a second feature stream by labeling at least one of the first and second streams, and determining active Gaussians co-occurring in the first stream and the second stream based upon joint probability. A number of Gaussians computed is reduced based upon Gaussians already computed for the first stream and a number of Gaussians co-occurring in the second stream. Speech is decoded based on the Gaussians computed for the first and second streams.

    Audio-only backoff in audio-visual speech recognition system
    5.
    Type: Granted patent
    Status: In force

    Publication No.: US07251603B2

    Publication date: 2007-07-31

    Application No.: US10601350

    Filing date: 2003-06-23

    IPC classification: G10L21/00

    CPC classification: G10L15/25

    Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
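The model-selection step (i) can be sketched as a per-segment switch. The abstract only says the choice depends on "a condition associated with a visual environment"; the visual-confidence score, the 0.5 threshold, and all names below are assumptions made for illustration.

```python
from typing import List


def choose_model(visual_confidence: float, min_confidence: float = 0.5) -> str:
    """Pick a data model for one segment.

    visual_confidence is a hypothetical score in [0, 1], for example from a
    face or mouth tracker; low values stand in for degraded visual conditions.
    """
    if visual_confidence >= min_confidence:
        return "audio_visual"
    return "audio_only"  # back off when the visual channel is unreliable


def select_models(confidences: List[float]) -> List[str]:
    """Choose a model independently for each segment of an utterance."""
    return [choose_model(c) for c in confidences]


choices = select_models([0.9, 0.2, 0.5])
print(choices)  # ['audio_visual', 'audio_only', 'audio_visual']
```

Each segment would then be decoded with the model selected for it, corresponding to step (ii).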

    Compressing Feature Space Transforms
    6.
    Type: Patent application
    Status: In force

    Publication No.: US20110144991A1

    Publication date: 2011-06-16

    Application No.: US12636033

    Filing date: 2009-12-11

    IPC classification: G10L15/06

    CPC classification: G10L19/0212 G10L19/032

    Abstract: Methods for compressing a transform associated with a feature space are presented. For example, a method for compressing a transform associated with a feature space includes obtaining the transform including a plurality of transform parameters, assigning each of a plurality of quantization levels for the plurality of transform parameters to one of a plurality of quantization values, and assigning each of the plurality of transform parameters to one of the plurality of quantization values to which one of the plurality of quantization levels is assigned. One or more of obtaining the transform, assigning of each of the plurality of quantization levels, and assigning of each of the transform parameters are implemented as instruction code executed on a processor device. Further, a Viterbi algorithm may be employed for use in non-uniform level/value assignments.
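A minimal sketch of the level/value assignment, assuming the simplest case: uniform quantization levels over the parameter range, with each level's quantization value set to the mean of the parameters that fall into it. The Viterbi-based non-uniform assignment mentioned in the abstract is not shown, and the function name and return layout are hypothetical.

```python
from typing import List, Tuple


def quantize_transform(
    params: List[float], num_levels: int
) -> Tuple[List[float], List[int]]:
    """Return (codebook, indices) for a flat list of transform parameters.

    codebook[k] is the quantization value assigned to level k, and
    indices[n] is the level assigned to params[n].
    """
    lo, hi = min(params), max(params)
    width = (hi - lo) / num_levels or 1.0  # avoid zero width if all equal
    indices = [min(int((p - lo) / width), num_levels - 1) for p in params]
    codebook = []
    for k in range(num_levels):
        members = [p for p, i in zip(params, indices) if i == k]
        if members:
            codebook.append(sum(members) / len(members))
        else:
            codebook.append(lo + (k + 0.5) * width)  # empty level: bin centre
    return codebook, indices


codebook, indices = quantize_transform([0.0, 0.1, 0.9, 1.0], num_levels=2)
reconstructed = [codebook[i] for i in indices]
print(indices)  # [0, 0, 1, 1]
```

In this sketch the compressed form is the small codebook plus one level index per parameter, rather than one floating-point number per parameter.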

    Compressing feature space transforms
    7.
    Type: Granted patent
    Status: In force

    Publication No.: US08386249B2

    Publication date: 2013-02-26

    Application No.: US12636033

    Filing date: 2009-12-11

    IPC classification: G10L15/06

    CPC classification: G10L19/0212 G10L19/032

    Abstract: Methods for compressing a transform associated with a feature space are presented. For example, a method for compressing a transform associated with a feature space includes obtaining the transform including a plurality of transform parameters, assigning each of a plurality of quantization levels for the plurality of transform parameters to one of a plurality of quantization values, and assigning each of the plurality of transform parameters to one of the plurality of quantization values to which one of the plurality of quantization levels is assigned. One or more of obtaining the transform, assigning of each of the plurality of quantization levels, and assigning of each of the transform parameters are implemented as instruction code executed on a processor device. Further, a Viterbi algorithm may be employed for use in non-uniform level/value assignments.

    Speech detection fusing multi-class acoustic-phonetic, and energy features
    9.
    Type: Patent application
    Status: Under examination (published)

    Publication No.: US20070033042A1

    Publication date: 2007-02-08

    Application No.: US11196698

    Filing date: 2005-08-03

    IPC classification: G10L15/00

    CPC classification: G10L25/78 G10L2015/025

    Abstract: A speech detection system extracts a plurality of features from multiple input streams. In the acoustic model space, the tree of Gaussians in the model is pruned to include the active states. The Gaussians are mapped to Hidden Markov Model states for Viterbi phoneme alignment. Another feature space, such as the energy feature space, is combined with the acoustic feature space. In the feature space, the features are combined and principal component analysis decorrelates the features to fewer dimensions, thus reducing the number of features. The Gaussians are also mapped to silence, disfluent phoneme, or voiced phoneme classes. The silence class is true silence and the voiced phoneme class is speech. The disfluent class may be speech or non-speech. If a frame is classified as disfluent, then that frame is re-classified as the silence class or the voiced phoneme class based on adjacent frames.
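The final re-classification step can be sketched as follows. The abstract says only that a disfluent frame is re-classified "based on adjacent frames"; the specific rule used here (voiced if either nearest non-disfluent neighbour is voiced, otherwise silence) and the label strings are assumptions.

```python
from typing import List, Optional


def nearest_label(labels: List[str]) -> Optional[str]:
    """Return the first label in the sequence that is not 'disfluent'."""
    return next((l for l in labels if l != "disfluent"), None)


def resolve_disfluent(labels: List[str]) -> List[str]:
    """Re-classify each 'disfluent' frame as 'silence' or 'voiced'."""
    resolved = list(labels)
    for t, label in enumerate(labels):
        if label != "disfluent":
            continue
        left = nearest_label(labels[:t][::-1])  # search backwards from t
        right = nearest_label(labels[t + 1:])   # search forwards from t
        # Assumed rule: treat the frame as speech if it borders speech.
        resolved[t] = "voiced" if "voiced" in (left, right) else "silence"
    return resolved


frames = ["silence", "disfluent", "silence", "voiced", "disfluent", "voiced"]
print(resolve_disfluent(frames))
# ['silence', 'silence', 'silence', 'voiced', 'voiced', 'voiced']
```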

    TECHNIQUES FOR EVALUATION, BUILDING AND/OR RETRAINING OF A CLASSIFICATION MODEL

    Publication No.: US20130254153A1

    Publication date: 2013-09-26

    Application No.: US13429041

    Filing date: 2012-03-23

    Applicant: Etienne Marcheret

    Inventor: Etienne Marcheret

    IPC classification: G06N5/00

    CPC classification: G06N99/005 G06N7/00 G06N7/005

    Abstract: Techniques for evaluation and/or retraining of a classification model built using labeled training data. In some aspects, a classification model having a first set of weights is retrained by using unlabeled input to reweight the labeled training data to have a second set of weights, and by retraining the classification model using the labeled training data weighted according to the second set of weights. In some aspects, a classification model is evaluated by building a similarity model that represents similarities between unlabeled input and the labeled training data and using the similarity model to evaluate the labeled training data to identify a subset of the plurality of items of labeled training data that is more similar to the unlabeled input than a remainder of the labeled training data.
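The reweighting step can be illustrated with a small sketch. It assumes one-dimensional features and uses a Gaussian kernel as a stand-in similarity model; the kernel, the bandwidth, and the function name are hypothetical choices, not details from the patent.

```python
import math
from typing import List


def reweight_training_data(
    train_x: List[float], unlabeled_x: List[float], bandwidth: float = 1.0
) -> List[float]:
    """Return one weight per labeled training item, normalised to sum to 1.

    Items that lie closer to the unlabeled input receive larger weights.
    """
    raw = []
    for x in train_x:
        # Mean kernel similarity between this item and the unlabeled input.
        sims = [
            math.exp(-((x - u) ** 2) / (2 * bandwidth ** 2))
            for u in unlabeled_x
        ]
        raw.append(sum(sims) / len(sims))
    total = sum(raw)
    return [r / total for r in raw]


weights = reweight_training_data([0.0, 5.0], [0.0, 0.1])
print(weights[0] > weights[1])  # True
```

A classifier would then be retrained with these weights applied to the labeled items, and the highest-weighted items form the subset most similar to the unlabeled input.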