Incrementally regulated discriminative margins in MCE training for speech recognition
    1.
    Invention application
    Incrementally regulated discriminative margins in MCE training for speech recognition (in force)

    Publication No.: US20080052075A1

    Publication date: 2008-02-28

    Application No.: US11509980

    Filing date: 2006-08-25

    IPC classes: G10L15/14

    CPC classes: G10L15/063 G10L15/144

    Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Given the acoustic model, scores are calculated for the correct class and for competing classes for each token. From these scores a misclassification measure is calculated, and a loss function is then calculated from the misclassification measure. The loss function also includes a margin value that varies over the training iterations. Based on the calculated loss function, the acoustic model is updated so that the loss function with the margin value is minimized. This process repeats until empirical convergence is reached.

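Read literally, the loss construction in the abstract can be sketched as follows. This is an illustrative interpretation, not the patented implementation: the function names, the sigmoid slope `alpha`, and the linear margin schedule are all assumptions.

```python
import math

def mce_loss(score_correct, score_competitor, margin, alpha=1.0):
    """Smoothed MCE loss: a sigmoid of the misclassification measure
    d = (competitor score) - (correct score) + margin.
    A positive margin penalizes correct tokens that sit too close
    to the decision boundary."""
    d = score_competitor - score_correct + margin
    return 1.0 / (1.0 + math.exp(-alpha * d))

def incremental_margin(iteration, step=0.2, max_margin=2.0):
    """A margin value that varies over training iterations: regulated
    incrementally, growing each iteration up to a cap (the schedule
    shown here is illustrative)."""
    return min(max_margin, step * iteration)

# Training would minimize the summed loss over all tokens, update the
# acoustic model, enlarge the margin, and repeat until the loss stops
# improving (empirical convergence).
```

A correctly classified token (`score_correct` well above `score_competitor`) yields a loss near 0, a misclassified one a loss near 1; the growing margin gradually counts near-boundary correct tokens as errors too.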

    Incrementally regulated discriminative margins in MCE training for speech recognition
    2.
    Invention grant
    Incrementally regulated discriminative margins in MCE training for speech recognition (in force)

    Publication No.: US07617103B2

    Publication date: 2009-11-10

    Application No.: US11509980

    Filing date: 2006-08-25

    IPC classes: G10L15/14

    CPC classes: G10L15/063 G10L15/144

    Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Given the acoustic model, scores are calculated for the correct class and for competing classes for each token. From these scores a misclassification measure is calculated, and a loss function is then calculated from the misclassification measure. The loss function also includes a margin value that varies over the training iterations. Based on the calculated loss function, the acoustic model is updated so that the loss function with the margin value is minimized. This process repeats until empirical convergence is reached.


    Generic framework for large-margin MCE training in speech recognition
    3.
    Invention application
    Generic framework for large-margin MCE training in speech recognition (in force)

    Publication No.: US20080201139A1

    Publication date: 2008-08-21

    Application No.: US11708440

    Filing date: 2007-02-20

    IPC classes: G10L15/00

    CPC classes: G10L15/063 G10L2015/0631

    Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Given the initial acoustic model, scores are calculated for the correct class and for competing classes for each token. A sample-adaptive window bandwidth is also calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves the decision boundary such that token-to-boundary distances for correct tokens near the decision boundary are maximized. The margin can either be fixed or vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until empirical convergence is met.

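The interaction between the margin and the per-token bandwidth can be sketched as below. The bandwidth estimator shown is invented for illustration (the abstract does not specify how it is computed); the smoothed loss is one plausible reading of the Bayesian-risk formulation, not the patented one.

```python
import math

def sample_adaptive_bandwidth(token_scores, base=1.0):
    """Illustrative per-token window bandwidth: here, wider when the
    token's class scores are more spread out. The actual estimator is
    an assumption; the abstract only says one is computed per token."""
    spread = max(token_scores) - min(token_scores)
    return base + spread

def smoothed_loss(score_correct, score_competitor, margin, bandwidth):
    """Smoothed 0/1 loss whose margin term shifts the decision boundary,
    pushing near-boundary correct tokens farther from it; the bandwidth
    scales the smoothing window around the boundary."""
    d = (score_competitor - score_correct + margin) / bandwidth
    return 1.0 / (1.0 + math.exp(-d))
```

A larger bandwidth flattens the loss toward 0.5 around the boundary, so each token's influence on the update is tempered by how spread out its scores are.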

    Generic framework for large-margin MCE training in speech recognition
    4.
    Invention grant
    Generic framework for large-margin MCE training in speech recognition (in force)

    Publication No.: US08423364B2

    Publication date: 2013-04-16

    Application No.: US11708440

    Filing date: 2007-02-20

    IPC classes: G10L15/14 G10L15/00 G10L15/06

    CPC classes: G10L15/063 G10L2015/0631

    Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Given the initial acoustic model, scores are calculated for the correct class and for competing classes for each token. A sample-adaptive window bandwidth is also calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves the decision boundary such that token-to-boundary distances for correct tokens near the decision boundary are maximized. The margin can either be fixed or vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until empirical convergence is met.


    Parameter learning in a hidden trajectory model
    5.
    Invention grant
    Parameter learning in a hidden trajectory model (in force)

    Publication No.: US08942978B2

    Publication date: 2015-01-27

    Application No.: US13182971

    Filing date: 2011-07-14

    IPC classes: G10L15/00 G10L15/06

    CPC classes: G10L15/063 G10L2015/025

    Abstract: Parameters for the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for the observation vectors as the objective function for optimization. The estimation uses only acoustic data, without any intermediate estimates of the hidden dynamic variables. Gradient ascent methods can be developed to optimize the acoustic likelihood function.

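As a toy analogue of this estimation scheme, the sketch below uses a single Gaussian in place of the hidden trajectory model's output distribution and recovers its mean by gradient ascent on the log-likelihood of the observations alone. It illustrates the idea of maximizing an acoustic likelihood directly, not the patent's actual model.

```python
import math

def gaussian_log_likelihood(observations, mean, var):
    """Log-likelihood of the observation vectors under a 1-D Gaussian;
    this plays the role of the acoustic likelihood used as the
    objective function for optimization."""
    return sum(-0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)
               for x in observations)

def ascend_mean(observations, mean=0.0, var=1.0, lr=0.1, steps=200):
    """Gradient ascent on the likelihood with respect to the mean:
    d(logL)/d(mean) = sum((x - mean) / var). Only the observed data
    enter; no intermediate hidden-variable estimates are formed."""
    n = len(observations)
    for _ in range(steps):
        grad = sum((x - mean) / var for x in observations)
        mean += lr * grad / n
    return mean
```

For a Gaussian the ascent converges to the sample mean; in the patented setting the same recipe would be applied to the model's means and variances.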

    Exploiting sparseness in training deep neural networks
    6.
    Invention grant
    Exploiting sparseness in training deep neural networks (in force)

    Publication No.: US08700552B2

    Publication date: 2014-04-15

    Application No.: US13305741

    Filing date: 2011-11-28

    IPC classes: G06F15/18 G06N3/08

    CPC classes: G06N3/08

    Abstract: Deep Neural Network (DNN) training technique embodiments are presented that train a DNN while exploiting sparseness in the hidden-layer interconnection weight values. Generally, a fully connected DNN is initially trained by sweeping through a full training set a number of times. Then, for the most part, only the interconnections whose weight magnitudes exceed a minimum weight threshold are considered in further training. This minimum weight threshold can be established as a value that results in only a prescribed maximum number of interconnections being considered when setting interconnection weight values via an error back-propagation procedure during training. It is noted that the continued DNN training tends to converge much faster than the initial training.

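The thresholding step described above can be sketched as follows: pick the magnitude cutoff that admits at most a prescribed number of interconnections, then restrict the back-propagation update to those weights. This is a minimal NumPy sketch of the idea, not the patented procedure; ties at the threshold value could admit a few extra weights.

```python
import numpy as np

def sparseness_mask(weights, max_active):
    """Boolean mask keeping only the max_active largest-magnitude
    interconnection weights. The implied magnitude cutoff is the
    minimum weight threshold the abstract describes."""
    mags = np.abs(weights).ravel()
    if max_active >= mags.size:
        return np.ones(weights.shape, dtype=bool)
    threshold = np.partition(mags, -max_active)[-max_active]
    return np.abs(weights) >= threshold

def masked_backprop_step(weights, grad, mask, lr=0.01):
    """Error back-propagation update restricted to interconnections
    above the threshold; masked-out weights are left untouched."""
    return weights - lr * grad * mask
```

After the initial full sweeps, continued training applies `masked_backprop_step` with a fixed mask, which is where the faster convergence noted in the abstract is observed.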

    COMPUTER-IMPLEMENTED DEEP TENSOR NEURAL NETWORK
    7.
    Invention application
    COMPUTER-IMPLEMENTED DEEP TENSOR NEURAL NETWORK (in force)

    Publication No.: US20140067735A1

    Publication date: 2014-03-06

    Application No.: US13597268

    Filing date: 2012-08-29

    IPC classes: G06N3/08

    Abstract: A deep tensor neural network (DTNN) is described herein, wherein the DTNN is suitable for employment in a computer-implemented recognition/classification system. Hidden layers in the DTNN comprise at least one projection layer, which includes a first subspace of hidden units and a second subspace of hidden units. The first subspace of hidden units receives a first nonlinear projection of input data to the projection layer and generates a first set of output data based at least in part thereon, and the second subspace of hidden units receives a second nonlinear projection of the input data to the projection layer and generates a second set of output data based at least in part thereon. A tensor layer, which can be converted into a conventional layer of a DNN, generates a third set of output data based upon the first set of output data and the second set of output data.

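The projection layer and tensor layer described above can be sketched in a few lines. The shapes, the `tanh` nonlinearity, and the bilinear form of the tensor layer are assumptions read from the abstract, not the patent's exact formulation.

```python
import numpy as np

def projection_layer(x, W1, W2):
    """Two nonlinear projections of the same input into two hidden-unit
    subspaces (W1 and W2 are illustrative weight matrices)."""
    return np.tanh(W1 @ x), np.tanh(W2 @ x)

def tensor_layer(h1, h2, U):
    """Bilinear tensor layer: out[k] = h1 @ U[k] @ h2. Flattening U and
    forming the Kronecker product of h1 and h2 shows it is equivalent
    to a conventional layer acting on vec(outer(h1, h2)), which is how
    a tensor layer can be converted into a conventional DNN layer."""
    return np.einsum('i,kij,j->k', h1, U, h2)
```

The equivalence in the second docstring (`U.reshape(k, n*m) @ np.kron(h1, h2)`) is the conversion the abstract alludes to.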

    Adapting a compressed model for use in speech recognition
    8.
    Invention grant
    Adapting a compressed model for use in speech recognition (in force)

    Publication No.: US08239195B2

    Publication date: 2012-08-07

    Application No.: US12235748

    Filing date: 2008-09-23

    IPC classes: G10L15/20

    CPC classes: G10L15/20 G10L15/065

    Abstract: A speech recognition system includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an adaptor component that selectively adapts parameters of a compressed model used to recognize at least a portion of the distorted speech utterance, wherein the adaptor component selectively adapts the parameters of the compressed model based at least in part upon the received distorted speech utterance.


    PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL
    9.
    Invention application
    PARAMETER LEARNING IN A HIDDEN TRAJECTORY MODEL (in force)

    Publication No.: US20110270610A1

    Publication date: 2011-11-03

    Application No.: US13182971

    Filing date: 2011-07-14

    IPC classes: G10L15/00

    CPC classes: G10L15/063 G10L2015/025

    Abstract: Parameters for the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for the observation vectors as the objective function for optimization. The estimation uses only acoustic data, without any intermediate estimates of the hidden dynamic variables. Gradient ascent methods can be developed to optimize the acoustic likelihood function.


    Parameter learning in a hidden trajectory model
    10.
    Invention grant
    Parameter learning in a hidden trajectory model (in force)

    Publication No.: US08010356B2

    Publication date: 2011-08-30

    Application No.: US11356898

    Filing date: 2006-02-17

    IPC classes: G10L15/00

    CPC classes: G10L15/063 G10L2015/025

    Abstract: Parameters for the distributions of a hidden trajectory model, including means and variances, are estimated using an acoustic likelihood function for the observation vectors as the objective function for optimization. The estimation uses only acoustic data, without any intermediate estimates of the hidden dynamic variables. Gradient ascent methods can be developed to optimize the acoustic likelihood function.
