Patent search: ap:("Dong Yu" OR "Alejandro Acero" OR "Yun-Cheng Ju" OR "Ye-Yi Wang") AND inv:"Alejandro Acero", page 7

61.

Patent application
ADAPTING A COMPRESSED MODEL FOR USE IN SPEECH RECOGNITION (Active)

Publication No.: US20100076757A1

Publication date: 2010-03-25

Application No.: US12235748

Filing date: 2008-09-23

Applicants: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero

Inventors: Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alejandro Acero

IPC classification: G10L15/20

CPC classification: G10L15/20, G10L15/065

Abstract: A speech recognition system includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an adaptor component that selectively adapts parameters of a compressed model used to recognize at least a portion of the distorted speech utterance, wherein the adaptor component selectively adapts the parameters of the compressed model based at least in part upon the received distorted speech utterance.

62.

Granted patent
System and method for identifying semantic intent from acoustic information (Active)

Publication No.: US07634406B2

Publication date: 2009-12-15

Application No.: US11009630

Filing date: 2004-12-10

Applicants: Xiao Li, Asela J. Gunawardana, Alejandro Acero, Milind Mahajan, Dong Yu

Inventors: Xiao Li, Asela J. Gunawardana, Alejandro Acero, Milind Mahajan, Dong Yu

IPC classification: G10L15/06

CPC classification: G10L15/19, G10L15/1815

Abstract: In accordance with one embodiment of the present invention, unanticipated semantic intents are discovered in audio data in an unsupervised manner. For instance, the audio acoustics are clustered based on semantic intent and representative acoustics are chosen for each cluster. The human then need only listen to a small number of representative acoustics for each cluster (and possibly only one per cluster) in order to identify the unforeseen semantic intents.

63.

Granted patent
Acoustic models with structured hidden dynamics with integration over many possible hidden trajectories (Active)

Publication No.: US07565284B2

Publication date: 2009-07-21

Application No.: US11071904

Filing date: 2005-03-01

Applicants: Li Deng, Alejandro Acero, Dong Yu, Xiang Li

Inventors: Li Deng, Alejandro Acero, Dong Yu, Xiang Li

IPC classification: G10L15/14

CPC classification: G10L15/02, G10L2015/025

Abstract: A method of producing at least one possible sequence of vocal tract resonance (VTR) for a fixed sequence of phonetic units, and producing the acoustic observation probability by integrating over such distributions is provided. The method includes identifying a sequence of target distributions for a VTR sequence corresponding to a phone sequence with a given segmentation. The sequence of target distributions is applied to a finite impulse response filter to produce distributions for possible VTR trajectories. Then these distributions are applied to a linearized nonlinear function to produce the acoustic observation probability for the given sequence of phonetic units. This acoustic observation probability is used for phonetic recognition.

64.

Granted patent
System and method for user modeling to enhance named entity recognition (Active)

Publication No.: US07289956B2

Publication date: 2007-10-30

Application No.: US10445532

Filing date: 2003-05-27

Applicants: Dong Yu, Peter K. L. Mau, Kuansan Wang, Milind Mahajan, Alejandro Acero

Inventors: Dong Yu, Peter K. L. Mau, Kuansan Wang, Milind Mahajan, Alejandro Acero

IPC classification: G10L15/14

CPC classification: G06F17/278

Abstract: The present invention employs user modeling to model a user's behavior patterns. The user's behavior patterns are then used to influence named entity (NE) recognition.

65.

Patent application
Two-stage implementation for phonetic recognition using a bi-directional target-filtering model of speech coarticulation and reduction (Active)

Publication No.: US20060200351A1

Publication date: 2006-09-07

Application No.: US11069474

Filing date: 2005-03-01

Applicants: Alejandro Acero, Dong Yu, Li Deng

Inventors: Alejandro Acero, Dong Yu, Li Deng

IPC classification: G10L15/04

CPC classification: G10L15/02, G10L25/15, G10L25/24, G10L2015/025

Abstract: A structured generative model of speech coarticulation and reduction is described with a novel two-stage implementation. At the first stage, the dynamics of formants or vocal tract resonance (VTR) are generated using prior information of resonance targets in the phone sequence. Bi-directional temporal filtering with finite impulse response (FIR) is applied to the segmental target sequence as the FIR filter's input. At the second stage, the dynamics of speech cepstra are predicted analytically based on the FIR filtered VTR targets. The combined system of these two stages thus generates correlated and causally related VTR and cepstral dynamics where phonetic reduction is represented explicitly in the hidden resonance space and implicitly in the observed cepstral space. The combined system also gives the acoustic observation probability given a phone sequence. Using this probability, different phone sequences can be compared and ranked in terms of their respective probability values. This then permits the use of the model for phonetic recognition.

66.

Patent application
Interactive clustering method for identifying problems in speech applications (Active)

Publication No.: US20060178884A1

Publication date: 2006-08-10

Application No.: US11054301

Filing date: 2005-02-09

Applicants: Alejandro Acero, Dong Yu

Inventors: Alejandro Acero, Dong Yu

IPC classification: G10L15/06

CPC classification: G06F17/30707, G06F17/2785, G10L15/26

Abstract: A method of aiding a speech recognition program developer by grouping calls passing through an identified question-answer (QA) state or transition into clusters based on causes of problems associated with the calls is provided. The method includes determining a number of clusters into which a plurality of calls will be grouped. Then, the plurality of calls is at least partially randomly assigned to the different clusters. Model parameters are estimated using clustering information based upon the assignment of the plurality of calls to the different clusters. Individual probabilities are calculated for each of the plurality of calls using the estimated model parameters. The individual probabilities are indicative of a likelihood that the corresponding call belongs to a particular cluster. The plurality of calls is then re-assigned to the different clusters based upon the calculated probabilities. These steps are then repeated until the grouping of the plurality of calls achieves a desired stability.
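
The loop described in this abstract (random assignment, parameter estimation, per-call scoring, re-assignment, repeat until stable) can be sketched roughly as below. This is an illustration only, not the patented method: representing each call as a list of tokens, using a smoothed unigram model per cluster, and the names `cluster_calls`, `num_clusters`, and `max_iters` are all assumptions made for this sketch.

```python
import math
import random
from collections import Counter


def cluster_calls(calls, num_clusters, max_iters=50, seed=0):
    """Group calls (each a list of tokens) by hard-assignment clustering."""
    rng = random.Random(seed)
    vocab_size = len({tok for call in calls for tok in call})

    # Randomly assign every call to one of the clusters.
    assignment = [rng.randrange(num_clusters) for _ in calls]

    for _ in range(max_iters):
        # Estimate model parameters from the current assignment:
        # a cluster prior and per-cluster token counts (assumed model).
        counts = [Counter() for _ in range(num_clusters)]
        sizes = [0] * num_clusters
        for call, k in zip(calls, assignment):
            counts[k].update(call)
            sizes[k] += 1
        totals = [sum(c.values()) for c in counts]

        def log_score(call, k):
            # Unnormalized log-probability that `call` belongs to cluster k,
            # with add-one smoothing on both the prior and the token model.
            prior = (sizes[k] + 1) / (len(calls) + num_clusters)
            denom = totals[k] + vocab_size
            return math.log(prior) + sum(
                math.log((counts[k][tok] + 1) / denom) for tok in call
            )

        # Re-assign each call to its highest-scoring cluster.
        new_assignment = [
            max(range(num_clusters), key=lambda k: log_score(call, k))
            for call in calls
        ]

        # Stop once the grouping no longer changes ("desired stability").
        if new_assignment == assignment:
            break
        assignment = new_assignment

    return assignment
```

Calling `cluster_calls(calls, num_clusters=2)` returns one cluster index per call. The stopping rule here is exact equality of successive assignments, which is one simple reading of "desired stability"; a tolerance on the fraction of calls that change cluster would be an equally plausible choice.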

67.

Patent application
Classification filter for processing data for creating a language model (Active)

Publication No.: US20060178869A1

Publication date: 2006-08-10

Application No.: US11054819

Filing date: 2005-02-10

Applicants: Alejandro Acero, Dong Yu, Julian Odell, Milind Mahajan, Peter Mau

Inventors: Alejandro Acero, Dong Yu, Julian Odell, Milind Mahajan, Peter Mau

IPC classification: G06F17/21

CPC classification: G06F17/2715, G06F17/277, G10L15/063, G10L15/18, G10L15/183

Abstract: The method and apparatus utilize a filter to remove a variety of non-dictated words from data based on probability and improve the effectiveness of creating a language model.

68.

Patent application
Quantitative model for formant dynamics and contextually assimilated reduction in fluent speech (Active)

Publication No.: US20060074676A1

Publication date: 2006-04-06

Application No.: US10944262

Filing date: 2004-09-17

Applicants: Li Deng, Alejandro Acero, Dong Yu

Inventors: Li Deng, Alejandro Acero, Dong Yu

IPC classification: G10L13/04

CPC classification: G10L13/02, G10L25/15

Abstract: A method of identifying a sequence of formant trajectory values is provided in which a sequence of target values is identified for a formant as step functions. The target values and the duration for each segment target for the formant are applied to a finite impulse response filter to form a sequence of formant trajectory values. The parameters of this filter, as well as the duration of the targets for each phone, can be modified to produce many kinds of target undershooting effects in a contextually assimilated manner. The procedure for producing the formant trajectory values does not require any acoustic data from speech.
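
The target-filtering idea in this abstract can be illustrated with a small sketch: per-segment targets are expanded into a step function and smoothed with an FIR filter. The abstract does not give the filter coefficients, so the symmetric, exponentially decaying kernel below, the edge clamping, and the names `step_targets`, `fir_trajectory`, `gamma`, and `half_width` are assumptions made for illustration only.

```python
def step_targets(segments):
    """Expand (target_hz, duration_in_frames) pairs into a per-frame step function."""
    frames = []
    for target, duration in segments:
        frames.extend([float(target)] * duration)
    return frames


def fir_trajectory(targets, gamma=0.6, half_width=3):
    """Smooth a step-function target sequence with a symmetric FIR filter.

    The taps are proportional to gamma ** |k| for k in [-half_width, half_width]
    (an assumed kernel shape) and are normalized to sum to one, so a constant
    target sequence passes through unchanged.
    """
    offsets = range(-half_width, half_width + 1)
    taps = [gamma ** abs(k) for k in offsets]
    norm = sum(taps)
    taps = [t / norm for t in taps]

    n = len(targets)
    trajectory = []
    for i in range(n):
        acc = 0.0
        for tap, k in zip(taps, offsets):
            # Clamp indices at the edges, i.e. extend the first/last target.
            idx = min(max(i + k, 0), n - 1)
            acc += tap * targets[idx]
        trajectory.append(acc)
    return trajectory
```

For example, `fir_trajectory(step_targets([(500, 10), (1500, 2), (500, 10)]))` yields a trajectory whose peak stays well below the 1500 Hz target because the middle segment is shorter than the filter span. That is the kind of target undershoot the abstract attributes to the choice of filter parameters and segment durations.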

69.

Granted patent
Parameter learning in a hidden trajectory model (Active)

Publication No.: US08942978B2

Publication date: 2015-01-27

Application No.: US13182971

Filing date: 2011-07-14

Applicants: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero

Inventors: Li Deng, Dong Yu, Xiaolong Li, Alejandro Acero

IPC classification: G10L15/00, G10L15/06

CPC classification: G10L15/063, G10L2015/025

Abstract: Parameters for distributions of a hidden trajectory model including means and variances are estimated using an acoustic likelihood function for observation vectors as an objective function for optimization. The estimation includes only acoustic data and not any intermediate estimate on hidden dynamic variables. Gradient ascent methods can be developed for optimizing the acoustic likelihood function.
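
As a toy illustration of estimating a mean and a variance by gradient ascent on a likelihood, the sketch below fits a single one-dimensional Gaussian to observed values. It is not the hidden trajectory model of the patent: the single-Gaussian setting, the log-variance parameterization, and the names `fit_gaussian_by_gradient_ascent`, `lr`, and `steps` are assumptions chosen to keep the example small.

```python
import math


def fit_gaussian_by_gradient_ascent(xs, lr=0.1, steps=2000):
    """Estimate (mean, variance) by gradient ascent on the average log-likelihood.

    Per-sample log-likelihood:
        -0.5 * log(2 * pi * var) - (x - mean) ** 2 / (2 * var)
    The variance is optimized through log_var so it stays positive.
    """
    mean, log_var = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        var = math.exp(log_var)
        # Gradient of the average log-likelihood with respect to the mean.
        grad_mean = sum(x - mean for x in xs) / (n * var)
        # Gradient with respect to log(var): 0.5 * ((x - mean)^2 / var - 1).
        grad_log_var = sum((x - mean) ** 2 / var - 1.0 for x in xs) / (2 * n)
        mean += lr * grad_mean
        log_var += lr * grad_log_var
    return mean, math.exp(log_var)
```

On the data `[1, 2, 3, 4, 5]` the iteration approaches the maximum-likelihood estimates (mean 3, variance 2). In the setting the abstract describes, the objective would be the acoustic likelihood of the observation vectors rather than this closed-form case.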

70.

Granted patent
Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections (Active)

Publication No.: US08280733B2

Publication date: 2012-10-02

Application No.: US12884434

Filing date: 2010-09-17

Applicants: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

IPC classification: G10L15/00, G10L15/06, G10L15/04, G10L15/14, G10L21/00

CPC classification: G10L15/065, G10L15/063, G10L2015/0631

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

