Time synchronous decoding for long-span hidden trajectory model
    1.
    Granted Patent
    Time synchronous decoding for long-span hidden trajectory model (In force)

    Publication No.: US07877256B2

    Publication Date: 2011-01-25

    Application No.: US11356905

    Filing Date: 2006-02-17

    IPC Class: G10L15/14

    CPC Class: G10L15/08

    Abstract: A time-synchronous, lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, hypotheses are represented as traces that include an indication of a current frame, previous frames, and future frames. Each frame can include an associated linguistic unit, such as a phone or units derived from a phone. Additionally, pruning strategies can be applied to speed up the search, and word-ending recombination methods are developed to speed up the computation. These methods can effectively deal with an exponentially increased search space.

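    The core ideas of the abstract — frame-by-frame expansion of trace hypotheses, beam pruning, and recombination of traces that reach the same linguistic state — can be sketched as follows. This is a minimal illustration, not the patent's algorithm: the `expand(state, frame)` interface and the tuple-based trace states are assumptions invented for the example.

```python
import math
from collections import defaultdict

def decode(frames, expand, beam_width=8.0):
    """Time-synchronous beam search over trace hypotheses (sketch).

    `frames` is a list of acoustic frames; `expand(state, frame)` yields
    (next_state, log_prob) successors. Both are hypothetical interfaces,
    standing in for the lattice-constrained expansion in the patent.
    """
    # A trace pairs a linguistic state (e.g. a phone history) with a score.
    traces = {("<s>",): 0.0}
    for frame in frames:
        nxt = defaultdict(lambda: -math.inf)
        for state, score in traces.items():
            for succ, logp in expand(state, frame):
                # Recombination: traces reaching the same state are
                # merged, keeping only the best-scoring one.
                nxt[succ] = max(nxt[succ], score + logp)
        best = max(nxt.values())
        # Beam pruning: drop traces scoring far below the current best.
        traces = {s: v for s, v in nxt.items() if v >= best - beam_width}
    return max(traces.items(), key=lambda kv: kv[1])
```

    Pruning and recombination together keep the number of live traces bounded per frame, which is what tames the exponentially growing search space.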

    Time synchronous decoding for long-span hidden trajectory model
    2.
    Patent Application
    Time synchronous decoding for long-span hidden trajectory model (In force)

    Publication No.: US20070198266A1

    Publication Date: 2007-08-23

    Application No.: US11356905

    Filing Date: 2006-02-17

    IPC Class: G10L15/28

    CPC Class: G10L15/08

    Abstract: A time-synchronous, lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, hypotheses are represented as traces that include an indication of a current frame, previous frames, and future frames. Each frame can include an associated linguistic unit, such as a phone or units derived from a phone. Additionally, pruning strategies can be applied to speed up the search, and word-ending recombination methods are developed to speed up the computation. These methods can effectively deal with an exponentially increased search space.


    Parameter learning in a hidden trajectory model
    3.
    Patent Application
    Parameter learning in a hidden trajectory model (In force)

    Publication No.: US20070198260A1

    Publication Date: 2007-08-23

    Application No.: US11356898

    Filing Date: 2006-02-17

    IPC Class: G10L15/00

    CPC Class: G10L15/063 G10L2015/025

    Abstract: Parameters of the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as the objective function for optimization. The estimation uses only acoustic data and does not rely on any intermediate estimates of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.

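    The gradient-ascent idea can be illustrated on a toy problem: fitting the mean and variance of a single 1-D Gaussian by ascending its log-likelihood directly. This stands in for the patent's far richer acoustic likelihood over hidden-trajectory-model parameters; the function name, learning rate, and log-variance parameterization are all choices made for the example.

```python
import math

def fit_gaussian_by_gradient_ascent(xs, steps=5000, lr=0.02):
    """Gradient ascent on the log-likelihood of a 1-D Gaussian (sketch).

    Estimates the mean `mu` and variance exp(`rho`); parameterizing the
    variance through its log keeps it positive during optimization.
    """
    n = len(xs)
    mu, rho = 0.0, 0.0
    for _ in range(steps):
        var = math.exp(rho)
        # d/d mu  of sum log N(x; mu, var) is sum (x - mu) / var
        g_mu = sum(x - mu for x in xs) / var
        # d/d rho of sum log N(x; mu, var) is sum ((x-mu)^2/var - 1) / 2
        g_rho = sum((x - mu) ** 2 / var - 1.0 for x in xs) / 2.0
        mu += lr * g_mu / n
        rho += lr * g_rho / n
    return mu, math.exp(rho)
```

    At convergence the gradients vanish at the maximum-likelihood estimates (the sample mean and the population variance), mirroring how the patent's estimation needs only the observed acoustic data, with no intermediate hidden-variable estimates.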

    Parameter learning in a hidden trajectory model
    4.
    Granted Patent
    Parameter learning in a hidden trajectory model (In force)

    Publication No.: US08942978B2

    Publication Date: 2015-01-27

    Application No.: US13182971

    Filing Date: 2011-07-14

    IPC Class: G10L15/00 G10L15/06

    CPC Class: G10L15/063 G10L2015/025

    Abstract: Parameters of the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as the objective function for optimization. The estimation uses only acoustic data and does not rely on any intermediate estimates of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.


    PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL
    5.
    Patent Application
    PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL (In force)

    Publication No.: US20110270610A1

    Publication Date: 2011-11-03

    Application No.: US13182971

    Filing Date: 2011-07-14

    IPC Class: G10L15/00

    CPC Class: G10L15/063 G10L2015/025

    Abstract: Parameters of the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as the objective function for optimization. The estimation uses only acoustic data and does not rely on any intermediate estimates of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.


    Parameter learning in a hidden trajectory model
    6.
    Granted Patent
    Parameter learning in a hidden trajectory model (In force)

    Publication No.: US08010356B2

    Publication Date: 2011-08-30

    Application No.: US11356898

    Filing Date: 2006-02-17

    IPC Class: G10L15/00

    CPC Class: G10L15/063 G10L2015/025

    Abstract: Parameters of the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for observation vectors as the objective function for optimization. The estimation uses only acoustic data and does not rely on any intermediate estimates of hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.


    Automatic reading tutoring with parallel polarized language modeling
    7.
    Patent Application
    Automatic reading tutoring with parallel polarized language modeling (In force)

    Publication No.: US20080177545A1

    Publication Date: 2008-07-24

    Application No.: US11655702

    Filing Date: 2007-01-19

    IPC Class: G10L15/28

    Abstract: A novel system for automatic reading tutoring provides effective error detection and reduced false alarms, combined with low processing-time burdens and response times short enough to maintain a natural, engaging flow of interaction. According to one illustrative embodiment, an automatic reading tutoring method includes displaying a text output and receiving an acoustic input. The acoustic input is modeled with a domain-specific target language model specific to the text output and with a general-domain garbage language model, both of which may be efficiently constructed as context-free grammars. The domain-specific target language model may be built dynamically, or "on the fly," from the currently displayed text (e.g., the story to be read by the user), while the general-domain garbage language model is shared among all text outputs. User-perceptible tutoring feedback is provided based on the target language model and the garbage language model.

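    The parallel-model idea — score the recognized speech against a target model built on the fly from the displayed text and against a shared garbage model, and give feedback based on which wins — can be sketched as below. This is a deliberately tiny stand-in: the unigram target model, the flat garbage score, and the function name are all invented for the example; the patent uses context-free grammars for both models.

```python
import math

def tutoring_feedback(displayed_text, recognized_words, garbage_logp=-8.0):
    """Parallel scoring against a target and a garbage model (sketch).

    The target model is a uniform unigram over the displayed story's
    words, rebuilt per story; the garbage model is a flat log-score
    assigned to anything off-script.
    """
    vocab = set(displayed_text.lower().split())
    target_logp = -math.log(len(vocab))  # uniform over the story's words
    feedback = []
    for w in recognized_words:
        # The better-scoring model "wins": target -> correct reading,
        # garbage -> likely misreading or off-script speech.
        if w.lower() in vocab and target_logp > garbage_logp:
            feedback.append((w, "ok"))
        else:
            feedback.append((w, "check"))
    return feedback
```

    Because only the small target model depends on the displayed text, rebuilding it per story is cheap, while the large garbage model is constructed once and shared — which is what keeps response times short.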

    Automatic reading tutoring with parallel polarized language modeling
    8.
    Granted Patent
    Automatic reading tutoring with parallel polarized language modeling (In force)

    Publication No.: US08433576B2

    Publication Date: 2013-04-30

    Application No.: US11655702

    Filing Date: 2007-01-19

    IPC Class: G10L15/22

    Abstract: A novel system for automatic reading tutoring provides effective error detection and reduced false alarms, combined with low processing-time burdens and response times short enough to maintain a natural, engaging flow of interaction. According to one illustrative embodiment, an automatic reading tutoring method includes displaying a text output and receiving an acoustic input. The acoustic input is modeled with a domain-specific target language model specific to the text output and with a general-domain garbage language model, both of which may be efficiently constructed as context-free grammars. The domain-specific target language model may be built dynamically, or "on the fly," from the currently displayed text (e.g., the story to be read by the user), while the general-domain garbage language model is shared among all text outputs. User-perceptible tutoring feedback is provided based on the target language model and the garbage language model.


    DEEP CONVEX NETWORK WITH JOINT USE OF NONLINEAR RANDOM PROJECTION, RESTRICTED BOLTZMANN MACHINE AND BATCH-BASED PARALLELIZABLE OPTIMIZATION
    9.
    Patent Application
    DEEP CONVEX NETWORK WITH JOINT USE OF NONLINEAR RANDOM PROJECTION, RESTRICTED BOLTZMANN MACHINE AND BATCH-BASED PARALLELIZABLE OPTIMIZATION (In force)

    Publication No.: US20120254086A1

    Publication Date: 2012-10-04

    Application No.: US13077978

    Filing Date: 2011-03-31

    IPC Class: G06N3/08

    Abstract: A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned to them. This layered model produces outputs that serve as scores to combine with transition probabilities between states in a hidden Markov model and with language model scores to form a full speech recognizer. The method makes joint use of nonlinear random projections and RBM weights, and it stacks a lower module's output with the raw data to establish the immediately higher module. Batch-based convex optimization is performed to learn a portion of the deep convex network's weights, rendering the training appropriate for parallel computation. The method can further include the act of jointly optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using an optimization criterion based on a sequence rather than a set of unrelated frames.

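    The stacking scheme in the abstract — each module takes the raw data concatenated with the previous module's output, and only the upper-layer weights are learned via a convex (least-squares) problem — can be sketched as follows. This is an illustrative simplification, assuming fixed random projections for the hidden weights (where the patent may use RBM-initialized weights as well) and a made-up function name and layer sizes.

```python
import numpy as np

def train_stacked_modules(X, Y, n_modules=2, hidden=16, seed=0):
    """Sketch of deep-convex-network-style stacking.

    Each module: a fixed nonlinear random projection into a hidden
    layer, then upper weights fit by least squares (a convex problem,
    so each module's fit parallelizes over data batches). The next
    module's input stacks the raw data with this module's output.
    """
    rng = np.random.default_rng(seed)
    inp = X
    modules = []
    for _ in range(n_modules):
        W = rng.standard_normal((inp.shape[1], hidden)) * 0.5
        H = np.tanh(inp @ W)                       # nonlinear hidden layer
        U, *_ = np.linalg.lstsq(H, Y, rcond=None)  # convex upper-weight fit
        out = H @ U
        modules.append((W, U))
        inp = np.hstack([X, out])                  # stack output with raw data
    return modules, out
```

    Because each module's trainable weights solve an independent convex problem, no end-to-end backpropagation is needed, which is what makes the training batch-based and parallelizable.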

    Noise suppressor for robust speech recognition
    10.
    Granted Patent
    Noise suppressor for robust speech recognition (In force)

    Publication No.: US08185389B2

    Publication Date: 2012-05-22

    Application No.: US12335558

    Filing Date: 2008-12-16

    IPC Class: G10L15/20

    CPC Class: G10L21/0208 G10L15/20

    Abstract: Described is noise reduction technology, generally for speech input, in which a noise-suppression-related gain value for a frame is determined based upon the noise level associated with that frame in addition to the signal-to-noise ratios (SNRs). In one implementation, the noise reduction mechanism is based upon minimum-mean-square-error, Mel-frequency-cepstra noise reduction technology. A high gain value (e.g., one) is set to apply little or no noise suppression when the noise level is below a low threshold, and a low gain value is set or computed to apply strong noise suppression above a high threshold. A noise-power-dependent function, e.g., a log-linear interpolation, is used to compute the gain between the thresholds. Smoothing may be performed by modifying the gain value based upon the prior frame's gain value. Also described is learning the parameters used in noise reduction via a step-adaptive discriminative learning algorithm.

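    The noise-power-dependent gain described in the abstract — unity gain below a low threshold, a minimum gain above a high threshold, log-linear interpolation in between, and optional smoothing with the prior frame — can be sketched as below. The thresholds, the minimum gain, and the smoothing factor are illustrative values, not the patent's.

```python
import math

def suppression_gain(noise_db, low_db=40.0, high_db=70.0,
                     min_gain=0.1, prev_gain=None, smooth=0.5):
    """Noise-power-dependent suppression gain (sketch).

    Below `low_db` the gain is 1 (no suppression); above `high_db` it
    is `min_gain` (strong suppression); between the thresholds it is
    interpolated linearly in the log-gain domain.
    """
    if noise_db <= low_db:
        gain = 1.0
    elif noise_db >= high_db:
        gain = min_gain
    else:
        t = (noise_db - low_db) / (high_db - low_db)
        # Log-linear interpolation between log(1) = 0 and log(min_gain).
        gain = math.exp((1 - t) * math.log(1.0) + t * math.log(min_gain))
    if prev_gain is not None:
        gain = smooth * prev_gain + (1 - smooth) * gain  # temporal smoothing
    return gain
```

    Interpolating in the log domain rather than linearly keeps the gain transition perceptually gradual across the threshold region, and the prior-frame smoothing suppresses frame-to-frame gain jumps that would otherwise produce audible artifacts.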