MAXIMUM ENTROPY MODEL WITH CONTINUOUS FEATURES
    71.
    Patent application (status: pending, published)

    Publication No.: US20100256977A1

    Publication date: 2010-10-07

    Application No.: US12416161

    Filing date: 2009-04-01

    IPC classification: G10L15/00

    Abstract: Described is a technology by which a maximum entropy (MaxEnt) model, such as one used as a classifier or embedded in a conditional random field or hidden conditional random field, uses continuous features with continuous weights that are continuous functions of the feature values (instead of single-valued weights). The continuous weights may be approximated by a spline-based solution. In general, this converts the optimization problem into a standard log-linear optimization problem, without continuous weights, in a higher-dimensional space.
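
    The conversion to a higher-dimensional log-linear problem can be illustrated with a minimal sketch. It assumes piecewise-linear interpolation between knots in place of the spline the abstract mentions, and the names hat_basis, expand_feature, and the knot values are hypothetical, not taken from the patent. Each continuous feature value f is expanded into K features a_k(f) * f, so the continuous weight becomes K ordinary constant weights.

```python
from bisect import bisect_right


def hat_basis(f, knots):
    """Piecewise-linear interpolation coefficients a_k(f); they sum to 1.

    Assumption: linear interpolation stands in for the spline basis.
    """
    f = min(max(f, knots[0]), knots[-1])  # clamp to the knot range
    coeffs = [0.0] * len(knots)
    i = min(bisect_right(knots, f) - 1, len(knots) - 2)
    t = (f - knots[i]) / (knots[i + 1] - knots[i])
    coeffs[i] = 1.0 - t
    coeffs[i + 1] = t
    return coeffs


def expand_feature(f, knots):
    """Map one continuous feature value to K expanded features a_k(f) * f."""
    return [a * f for a in hat_basis(f, knots)]


def continuous_weight(f, knots, knot_weights):
    """Weight as a continuous function of f, interpolated from knot weights."""
    return sum(a * w for a, w in zip(hat_basis(f, knots), knot_weights))


knots = [0.0, 1.0, 2.0]          # hypothetical knot positions
knot_weights = [0.5, -1.0, 2.0]  # hypothetical weights learned at the knots
f = 1.5

# Log-linear score in the expanded space, using constant weights only.
expanded_score = sum(w * x for w, x in zip(knot_weights, expand_feature(f, knots)))
# Same score written as (continuous weight at f) * f.
direct_score = continuous_weight(f, knots, knot_weights) * f
```

    Both scores agree, which is the point of the expansion: a standard MaxEnt trainer can fit knot_weights without knowing that they parameterize a continuous weight function.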

    AUDIO TRANSFORMS IN CONNECTION WITH MULTIPARTY COMMUNICATION
    72.
    Patent application (status: in force)

    Publication No.: US20100195812A1

    Publication date: 2010-08-05

    Application No.: US12365949

    Filing date: 2009-02-05

    IPC classification: H04M3/42 G10L11/00

    Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.

    Method of pattern recognition using noise reduction uncertainty
    73.
    Granted patent (status: in force)

    Publication No.: US07769582B2

    Publication date: 2010-08-03

    Application No.: US12180260

    Filing date: 2008-07-25

    IPC classification: G10L15/20 G10L21/02 G10L15/14

    CPC classification: G10L21/0208 G10L15/20

    Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. Meanwhile, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by an amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.
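
    The variance adjustment described in this abstract can be sketched as follows. This is a minimal illustration that assumes diagonal-covariance Gaussians and a single Gaussian per phonetic state; the function names and the example values are hypothetical. The only step taken from the abstract is that each Gaussian variance is increased by the estimated variance of the cleaned signal before the likelihood is evaluated.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def state_log_likelihood(cleaned, uncertainty, means, variances):
    """Log-likelihood of a cleaned feature vector under one state's Gaussian.

    cleaned:     cleaned-signal estimate per dimension
    uncertainty: estimated variance of that estimate per dimension
    means, variances: the state's Gaussian parameters (diagonal, assumed)
    """
    return sum(
        log_gaussian(x, m, v + u)  # model variance inflated by the uncertainty
        for x, u, m, v in zip(cleaned, uncertainty, means, variances)
    )


cleaned = [3.0]                   # hypothetical cleaned feature, far from the mean
means, variances = [0.0], [1.0]   # hypothetical state parameters

confident = state_log_likelihood(cleaned, [0.0], means, variances)
uncertain = state_log_likelihood(cleaned, [3.0], means, variances)
```

    With zero uncertainty the score reduces to the ordinary Gaussian log-likelihood. With a large uncertainty the mismatch between the cleaned feature and the state mean is penalized less, so unreliable frames influence decoding less.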

    PHASE SENSITIVE MODEL ADAPTATION FOR NOISY SPEECH RECOGNITION
    75.
    Patent application (status: in force)

    Publication No.: US20100076758A1

    Publication date: 2010-03-25

    Application No.: US12236530

    Filing date: 2008-09-24

    IPC classification: G10L15/20 G10L15/14

    CPC classification: G10L15/065 G10L15/20

    Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.

    PARAMETER CLUSTERING AND SHARING FOR VARIABLE-PARAMETER HIDDEN MARKOV MODELS
    76.
    Patent application (status: in force)

    Publication No.: US20100070280A1

    Publication date: 2010-03-18

    Application No.: US12211115

    Filing date: 2008-09-16

    IPC classification: G10L15/14

    CPC classification: G10L15/142

    Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.
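
    The sharing scheme can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: the conditioning parameter is taken to be SNR in dB, the piecewise fit is linear rather than a spline, the distance measure is squared Euclidean distance, and a Gaussian mean is modeled as a fixed base mean plus a shared offset function. The names KNOTS, SHARED_SPLINES, assign_class, and gaussian_mean are hypothetical.

```python
from bisect import bisect_right

KNOTS = [0.0, 10.0, 20.0]  # hypothetical SNR knots in dB

# Shared sets of spline parameters: class id -> offset values at the knots.
SHARED_SPLINES = {
    0: [2.0, 1.0, 0.0],
    1: [-1.0, -0.5, 0.0],
}


def evaluate(values, v):
    """Piecewise-linear fit through (KNOTS[k], values[k]), clamped at the ends."""
    v = min(max(v, KNOTS[0]), KNOTS[-1])
    i = min(bisect_right(KNOTS, v) - 1, len(KNOTS) - 2)
    t = (v - KNOTS[i]) / (KNOTS[i + 1] - KNOTS[i])
    return (1.0 - t) * values[i] + t * values[i + 1]


def assign_class(values):
    """Pick the shared class whose parameters are closest to `values`."""
    def distance(class_id):
        shared = SHARED_SPLINES[class_id]
        return sum((a - b) ** 2 for a, b in zip(values, shared))
    return min(SHARED_SPLINES, key=distance)


def gaussian_mean(base_mean, spline_class, snr):
    """Environment-dependent mean: base mean plus the shared class offset."""
    return base_mean + evaluate(SHARED_SPLINES[spline_class], snr)


# A Gaussian stores only a class reference instead of its own spline parameters.
spline_class = assign_class([1.8, 1.1, 0.1])
mean_at_15_db = gaussian_mean(5.0, spline_class, 15.0)
```

    Because each Gaussian keeps a class id rather than a full parameter set, storage grows with the number of classes instead of the number of Gaussians.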

    Method of automatically ranking speech dialog states and transitions to aid in performance analysis in speech applications
    78.
    Granted patent (status: in force)

    Publication No.: US07643995B2

    Publication date: 2010-01-05

    Application No.: US11054096

    Filing date: 2005-02-09

    IPC classification: G10L15/00

    CPC classification: G10L15/01 G10L15/083

    Abstract: A method of identifying problems in a speech recognition application is provided and includes the step of obtaining a speech application call log containing log data on question-answer (QA) states and transitions. Then, in accordance with the method, for each of multiple transitions between states, a parameter is generated which is indicative of a gain in a success rate of the speech recognition application if all calls passing through the transition passed instead through other transitions. In exemplary embodiments, the parameter is an Arc Cut Gain in Success Rate (ACGSR) parameter. Methods of generating the ACGSR, as well as systems and tools for aiding developers, are also disclosed.
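
    One way to picture a ranking of this kind is the sketch below. It is not the patent's ACGSR formula, which the abstract does not give. It assumes a simplified call log in which each call is a list of transitions plus a success flag, and it assumes that calls rerouted away from a transition would succeed at the same rate as calls that never used it. The names arc_cut_gain and rank_arcs are hypothetical.

```python
def arc_cut_gain(calls, arc):
    """Estimated gain in overall success rate if `arc` were cut.

    calls: list of (arcs_taken, succeeded) pairs.
    Assumption: rerouted calls succeed at the rate of calls that avoid `arc`.
    """
    overall = sum(ok for _, ok in calls) / len(calls)
    others = [ok for arcs, ok in calls if arc not in arcs]
    if not others:
        return 0.0  # every call uses this arc; no basis for an estimate
    rerouted_rate = sum(others) / len(others)
    return rerouted_rate - overall


def rank_arcs(calls):
    """Transitions sorted from largest to smallest estimated gain."""
    arcs = set().union(*(taken for taken, _ in calls))
    return sorted(arcs, key=lambda a: arc_cut_gain(calls, a), reverse=True)


# Hypothetical log: transitions are (from_state, to_state) pairs.
calls = [
    ([("Q1", "Q2")], True),
    ([("Q1", "Q2")], True),
    ([("Q1", "Q3")], False),
    ([("Q1", "Q3")], True),
]
ranking = rank_arcs(calls)
```

    In this toy log the transition from Q1 to Q3 ranks first, since the calls that avoid it all succeed, which flags it as the transition most worth a developer's attention.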

    Method and apparatus for indexing speech
    79.
    Granted patent (status: in force)

    Publication No.: US07634407B2

    Publication date: 2009-12-15

    Application No.: US11133515

    Filing date: 2005-05-20

    IPC classification: G10L15/00

    CPC classification: G10L15/26

    Abstract: A method of indexing a speech segment includes identifying at least two alternative word sequences based on the speech segment. For each word in the alternative sequences, information is placed in an entry for the word in the index. The information indicates the position of the word in at least one of the alternative sequences.
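
    The indexing step can be sketched as a positional inverted index. This is a minimal illustration that assumes the alternative word sequences are already available as lists of words (for example, competing recognizer hypotheses); the function name index_segment, the segment id, and the example words are hypothetical.

```python
from collections import defaultdict


def index_segment(index, segment_id, alternatives):
    """Add every word of every alternative sequence to the index.

    index:        dict mapping word -> set of (segment_id, position)
    alternatives: list of word sequences hypothesized for one segment
    """
    for words in alternatives:
        for position, word in enumerate(words):
            # A set keeps one entry when alternatives agree on word and position.
            index[word].add((segment_id, position))


index = defaultdict(set)
index_segment(
    index,
    "seg1",
    [
        ["recognize", "speech"],
        ["wreck", "a", "nice", "beach"],
    ],
)
```

    A query for "beach" then finds seg1 even though that word appears only in the second alternative, which is the benefit of indexing more than the single best sequence.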

    Hidden conditional random field models for phonetic classification and speech recognition
    80.
    Granted patent (status: in force)

    Publication No.: US07627473B2

    Publication date: 2009-12-01

    Application No.: US10966047

    Filing date: 2004-10-15

    IPC classification: G10L17/00 G10L15/14

    CPC classification: G10L15/14

    Abstract: A method and apparatus are provided for training and using a hidden conditional random field model for speech recognition and phonetic classification. The hidden conditional random field model uses feature functions, at least one of which is based on a hidden state in a phonetic unit. Values for the feature functions are determined from a segment of speech, and these values are used to identify a phonetic unit for the segment of speech.
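
    The classification step can be sketched with a forward recursion over hidden states. The sketch assumes scalar observations, one state feature per hidden state (a weight times the observation plus a bias), and one transition feature per state pair; these feature choices, the names label_score and classify, and all numeric values are hypothetical. For each candidate phonetic unit, the score is the log of the sum, over all hidden state sequences, of the exponentiated total feature score.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def label_score(obs, emit, trans):
    """Unnormalized log score of one phonetic unit for a speech segment.

    emit[s] = (weight, bias): state feature score is weight * o + bias
    trans[p][s]: transition feature weight from hidden state p to s
    """
    n = len(emit)
    alpha = [emit[s][0] * obs[0] + emit[s][1] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + trans[p][s] for p in range(n)])
            + emit[s][0] * o + emit[s][1]
            for s in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the phonetic unit with the highest score for the segment."""
    return max(models, key=lambda label: label_score(obs, *models[label]))


free = [[0.0, 0.0], [0.0, 0.0]]  # hypothetical transition weights
models = {
    "aa": ([(1.0, 0.0), (2.0, 0.0)], free),
    "iy": ([(-1.0, 0.0), (-2.0, 0.0)], free),
}
best = classify([1.0, 0.5], models)
```

    Normalizing over labels would turn these scores into posterior probabilities, but the highest-scoring unit is the same either way, so the sketch skips that step.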
