Patent search: ap:("Li Deng" OR "Yaodong Zhang" OR "Alejandro Acero" OR "Xiaodong He") AND inv:"Alejandro Acero" (page 1)

1.

Granted invention patent
Integrative and discriminative technique for spoken utterance translation (status: in force)

Publication No.: US08407041B2

Publication date: 2013-03-26

Application No.: US12957394

Filing date: 2010-12-01

Applicant: Li Deng, Yaodong Zhang, Alejandro Acero, Xiaodong He

Inventor: Li Deng, Yaodong Zhang, Alejandro Acero, Xiaodong He

IPC classification: G06F17/28

CPC classification: G06F17/2818, G10L15/14, G10L15/183

Abstract: Architecture that provides the integration of automatic speech recognition (ASR) and machine translation (MT) components of a full speech translation system. The architecture is an integrative and discriminative approach that employs an end-to-end objective function (the conditional probability of the translated sentence (target) given the source language's acoustic signal), as well as the associated BLEU score in the translation, as a goal in the integrated system. This goal defines the theoretically correct variables to determine the speech translation system output using a Bayesian decision rule. These theoretically correct variables are modified in practical use due to known imperfections of the various models used in building the full speech translation system. The disclosed approach also employs automatic training of these variables using the minimum classification error (MCE) criterion. The measurable BLEU scores are used to facilitate the implementation of the MCE training procedure in a step that defines the class-specific discriminant function.

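The decision rule described in this abstract can be illustrated with a minimal sketch. It assumes an N-best list in which each hypothesis pairs a target translation with a source transcript and carries log-scores from an acoustic model, a source language model, and a translation model. The feature names, the `decode` function, and the example numbers are hypothetical and are not taken from the patent. With all weights set to 1 the rule reduces to the Bayes decision rule over the joint probability; other weights stand in for the practical adjustments the abstract mentions.

```python
import math
from collections import defaultdict

def decode(hypotheses, weights):
    """Pick the translation with the highest weighted, marginalised score.

    hypotheses: list of (translation, source_transcript, features), where
    features maps a model name to a log-score (all names are assumptions).
    weights: exponent applied to each model score in the log domain.
    """
    totals = defaultdict(float)
    for translation, _source, feats in hypotheses:
        log_score = sum(weights[name] * value for name, value in feats.items())
        # Sum over the source transcripts that lead to the same translation.
        totals[translation] += math.exp(log_score)
    return max(totals, key=totals.get)

# Hypothetical three-entry N-best list.
hyps = [
    ("hello world", "hola mundo", {"acoustic": -2.0, "source_lm": -1.0, "translation": -0.5}),
    ("hello word", "hola mundo", {"acoustic": -2.0, "source_lm": -1.0, "translation": -3.0}),
    ("hello world", "ola mundo", {"acoustic": -2.5, "source_lm": -4.0, "translation": -1.0}),
]
unit_weights = {"acoustic": 1.0, "source_lm": 1.0, "translation": 1.0}
best = decode(hyps, unit_weights)
```

In a discriminatively trained system the weights would be tuned rather than fixed; the abstract names MCE training guided by BLEU for that step, which this sketch does not implement.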

2.

Invention patent application
INTEGRATIVE AND DISCRIMINATIVE TECHNIQUE FOR SPOKEN UTTERANCE TRANSLATION (status: in force)

Publication No.: US20120143591A1

Publication date: 2012-06-07

Application No.: US12957394

Filing date: 2010-12-01

Applicant: Li Deng, Yaodong Zhang, Alejandro Acero, Xiaodong He

Inventor: Li Deng, Yaodong Zhang, Alejandro Acero, Xiaodong He

IPC classification: G06F17/28

CPC classification: G06F17/2818, G10L15/14, G10L15/183

Abstract: Architecture that provides the integration of automatic speech recognition (ASR) and machine translation (MT) components of a full speech translation system. The architecture is an integrative and discriminative approach that employs an end-to-end objective function (the conditional probability of the translated sentence (target) given the source language's acoustic signal), as well as the associated BLEU score in the translation, as a goal in the integrated system. This goal defines the theoretically correct variables to determine the speech translation system output using a Bayesian decision rule. These theoretically correct variables are modified in practical use due to known imperfections of the various models used in building the full speech translation system. The disclosed approach also employs automatic training of these variables using the minimum classification error (MCE) criterion. The measurable BLEU scores are used to facilitate the implementation of the MCE training procedure in a step that defines the class-specific discriminant function.


3.

Granted invention patent
Generic framework for large-margin MCE training in speech recognition (status: in force)

Publication No.: US08423364B2

Publication date: 2013-04-16

Application No.: US11708440

Filing date: 2007-02-20

Applicant: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He

Inventor: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He

IPC classification: G10L15/14, G10L15/00, G10L15/06

CPC classification: G10L15/063, G10L2015/0631

Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence is met.

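The loss computation outlined in this abstract can be sketched as follows. The sketch assumes the common sigmoid form of the MCE loss, a misclassification measure equal to the best competitor score minus the correct-class score, and a margin that simply shifts the sigmoid. The function names, the `slope` parameter (used here as a stand-in for a per-token bandwidth), and the linear margin schedule are assumptions for illustration, not the patent's formulas.

```python
import math

def large_margin_loss(correct_score, competitor_scores, slope=1.0, margin=0.0):
    """Sigmoid MCE-style loss with a margin shift (illustrative form only).

    d > 0 means the best competitor outscores the correct class. A positive
    margin penalises correct tokens that sit too close to the boundary.
    """
    d = max(competitor_scores) - correct_score
    return 1.0 / (1.0 + math.exp(-slope * (d + margin)))

def margin_at(iteration, start=0.0, step=0.1):
    """Hypothetical schedule: margin grows monotonically with the iteration."""
    return start + step * iteration

# A correctly classified token two log-units away from the boundary.
loss_no_margin = large_margin_loss(-10.0, [-12.0, -15.0], margin=0.0)
loss_with_margin = large_margin_loss(-10.0, [-12.0, -15.0], margin=margin_at(10))
```

With a positive margin the same token incurs a larger loss, so a gradient-based update would push it further from the decision boundary.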

4.

Invention patent application
Generic framework for large-margin MCE training in speech recognition (status: in force)

Publication No.: US20080201139A1

Publication date: 2008-08-21

Application No.: US11708440

Filing date: 2007-02-20

Applicant: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He

Inventor: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He

IPC classification: G10L15/00

CPC classification: G10L15/063, G10L2015/0631

Abstract: A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence is met.


5.

Invention patent application
USING COMBINED ANSWERS IN MACHINE-BASED EDUCATION (status: under examination, published)

Publication No.: US20100311030A1

Publication date: 2010-12-09

Application No.: US12477138

Filing date: 2009-06-03

Applicant: Xiaodong He, Alejandro Acero, Sebastian de la Chica

Inventor: Xiaodong He, Alejandro Acero, Sebastian de la Chica

IPC classification: G09B3/00

CPC classification: G09B7/02

Abstract: Described is a technology for learning a foreign language or other subject. Answers (e.g., translations) to questions (e.g., sentences to translate) received from learners are combined into a combined answer that serves as a representative model answer for those learners. The questions also may be provided to machine subsystems to generate machine answers, e.g., machine translators, with those machine answers used in the combined answer. The combined answer is used to evaluate each learner's individual answer. The evaluation may be used to compute profile information that is then fed back for use in selecting further questions, e.g., more difficult sentences as the learners progress. Also described is integrating the platform/technology into a web service.


6.

Granted invention patent
Phase sensitive model adaptation for noisy speech recognition (status: in force)

Publication No.: US08214215B2

Publication date: 2012-07-03

Application No.: US12236530

Filing date: 2008-09-24

Applicant: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero

Inventor: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero

IPC classification: G10L15/14

CPC classification: G10L15/065, G10L15/20

Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.


7.

Granted invention patent
Piecewise-based variable-parameter Hidden Markov Models and the training thereof (status: in force)

Publication No.: US08160878B2

Publication date: 2012-04-17

Application No.: US12211114

Filing date: 2008-09-16

Applicant: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

Inventor: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

IPC classification: G10L15/14, G10L15/20

CPC classification: G10L15/144

Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech under many different conditions. Each Gaussian mixture component of the VPHMMs is characterized by a mean parameter μ and a variance parameter Σ. Each of these Gaussian parameters varies as a function of at least one environmental conditioning parameter, such as, but not limited to, instantaneous signal-to-noise-ratio (SNR). The way in which a Gaussian parameter varies with the environmental conditioning parameter(s) can be approximated as a piecewise function, such as a cubic spline function. Further, the recognition system formulates the mean parameter μ and the variance parameter Σ of each Gaussian mixture component in an efficient form that accommodates the use of discriminative training and parameter sharing. Parameter sharing is carried out so that the otherwise very large number of parameters in the VPHMMs can be effectively reduced with practically feasible amounts of training data.

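The idea of a Gaussian whose parameters vary with SNR through a piecewise function can be sketched for a single scalar component. The sketch below swaps in a piecewise-cubic Hermite interpolant with finite-difference tangents as a simple stand-in for the cubic spline the abstract mentions. The knot positions, the values at the knots, and all function names are hypothetical.

```python
import math
from bisect import bisect_right

def piecewise_cubic(knots, values, x):
    """Piecewise-cubic Hermite interpolation, clamped to the knot range."""
    x = min(max(x, knots[0]), knots[-1])
    i = min(bisect_right(knots, x) - 1, len(knots) - 2)

    def slope(j):
        # Finite-difference tangent at knot j (one-sided at the ends).
        lo, hi = max(j - 1, 0), min(j + 1, len(knots) - 1)
        return (values[hi] - values[lo]) / (knots[hi] - knots[lo])

    h = knots[i + 1] - knots[i]
    t = (x - knots[i]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return (h00 * values[i] + h10 * h * slope(i)
            + h01 * values[i + 1] + h11 * h * slope(i + 1))

def log_gaussian(obs, mean, var):
    """Log density of a scalar Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (obs - mean) ** 2 / var)

# Hypothetical knots: how one component's mean and variance drift with SNR (dB).
SNR_KNOTS = [0.0, 10.0, 20.0, 30.0]
MEAN_AT_KNOTS = [1.0, 1.6, 2.0, 2.1]
VAR_AT_KNOTS = [2.0, 1.2, 0.8, 0.7]

def vp_log_likelihood(obs, snr_db):
    """Score an observation with SNR-dependent Gaussian parameters."""
    mean = piecewise_cubic(SNR_KNOTS, MEAN_AT_KNOTS, snr_db)
    # Interpolation does not guarantee a positive variance in general;
    # the example values above happen to stay positive.
    var = piecewise_cubic(SNR_KNOTS, VAR_AT_KNOTS, snr_db)
    return log_gaussian(obs, mean, var)
```

Because every component would need its own set of knot values, the parameter count grows quickly, which is the motivation the abstract gives for parameter sharing.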

8.

Granted invention patent
Parameter clustering and sharing for variable-parameter hidden Markov models (status: in force)

Publication No.: US08145488B2

Publication date: 2012-03-27

Application No.: US12211115

Filing date: 2008-09-16

Applicant: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

Inventor: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

IPC classification: G10L15/14

CPC classification: G10L15/142

Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.


9.

Granted invention patent
Time asynchronous decoding for long-span trajectory model (status: lapsed)

Publication No.: US07734460B2

Publication date: 2010-06-08

Application No.: US11311951

Filing date: 2005-12-20

Applicant: Dong Yu, Li Deng, Alejandro Acero

Inventor: Dong Yu, Li Deng, Alejandro Acero

IPC classification: G06F17/21, G06F17/27, G10L15/00

CPC classification: G10L15/08, G10L15/187

Abstract: A time-asynchronous lattice-constrained search algorithm is developed and used to process a linguistic model of speech that has a long-contextual-span capability. In the algorithm, nodes and links in the lattices developed from the model are expanded via look-ahead. Heuristics as utilized by a search algorithm are estimated. Additionally, pruning strategies can be applied to speed up the search.

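The combination of lattice-constrained search, heuristic estimates, and pruning named in this abstract can be illustrated with a generic best-first (A*-style) search over a small lattice. This is a textbook technique used as a stand-in and is not the patented algorithm. The lattice layout, the `best_path` function, the zero heuristic, and the beam size are all assumptions.

```python
import heapq

def best_path(lattice, start, goal, heuristic, beam=None):
    """Best-first search over a lattice given as node -> [(next_node, cost)].

    Partial paths of different lengths compete in one priority queue, ordered
    by cost so far plus a heuristic estimate of the remaining cost. Returns
    (cost, path), or None if the goal is not reached. The result is optimal
    only with a consistent heuristic and no pruning.
    """
    frontier = [(heuristic(start), 0.0, [start])]
    best_cost = {}
    while frontier:
        _estimate, cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return cost, path
        if best_cost.get(node, float("inf")) <= cost:
            continue  # already expanded this node via a cheaper path
        best_cost[node] = cost
        for nxt, link_cost in lattice.get(node, []):
            new_cost = cost + link_cost
            heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, path + [nxt]))
        if beam is not None and len(frontier) > beam:
            # Beam pruning: keep only the most promising partial paths.
            frontier = heapq.nsmallest(beam, frontier)
            heapq.heapify(frontier)
    return None

# Hypothetical four-node lattice with link costs (e.g. negative log-scores).
lattice = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("e", 5.0)],
    "b": [("e", 1.0)],
}
result = best_path(lattice, "s", "e", heuristic=lambda node: 0.0)
```

A tighter heuristic would reduce the number of expansions, while a small beam trades optimality for speed.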

10.

Granted invention patent
Method and apparatus for constructing a speech filter using estimates of clean speech and noise (status: in force)

Publication No.: US07725314B2

Publication date: 2010-05-25

Application No.: US10780177

Filing date: 2004-02-16

Applicant: Jian Wu, James G. Droppo, Li Deng, Alejandro Acero

Inventor: Jian Wu, James G. Droppo, Li Deng, Alejandro Acero

IPC classification: G10L15/20, G10L21/02, G10L15/00, H04B15/00

CPC classification: G10L21/0208

Abstract: A method and apparatus identify a clean speech signal from a noisy speech signal. To do this, a clean speech value and a noise value are estimated from the noisy speech signal. The clean speech value and the noise value are then used to define a gain on a filter. The noisy speech signal is applied to the filter to produce the clean speech signal. Under some embodiments, the noise value and the clean speech value are used in both the numerator and the denominator of the filter gain, with the numerator being guaranteed to be positive.

