Patent search: ap:("Alejandro Acero" OR "James Garnet Droppo, III" OR "Xiaoqiang Xiao" OR "Geoffrey G. Zweig") AND inv:"Alejandro Acero" (page 8)

71.

Patent application
MAXIMUM ENTROPY MODEL WITH CONTINUOUS FEATURES
Legal status: Pending (published)

Publication number: US20100256977A1

Publication date: 2010-10-07

Application number: US12416161

Filing date: 2009-04-01

Applicants: Dong Yu, Li Deng, Alejandro Acero

Inventors: Dong Yu, Li Deng, Alejandro Acero

IPC classification: G10L15/00

CPC classification: G10L15/14, G06K9/6277, G06K9/6297

Abstract: Described is a technology by which a maximum entropy (MaxEnt) model, such as one used as a classifier or embedded in a conditional random field or hidden conditional random field, uses continuous features with continuous weights that are continuous functions of the feature values (instead of single-valued weights). The continuous weights may be approximated by a spline-based solution. In general, this converts the optimization problem into a standard log-linear optimization problem without continuous weights in a higher-dimensional space.

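The conversion described in the abstract of record 71 can be illustrated with a minimal sketch. It assumes piecewise-linear interpolation between knots as a stand-in for the spline-based approximation the abstract mentions; the knot positions, the knot weights, and the function names `hat_basis` and `expand_feature` are hypothetical. The point is only that a weight varying with the feature value, lambda(f) * f, equals an ordinary log-linear term over expanded features a_k(f) * f with constant weights lambda_k.

```python
import numpy as np

def hat_basis(f, knots):
    """Linear-interpolation coefficients a_k(f); nonnegative and summing to 1."""
    f = float(np.clip(f, knots[0], knots[-1]))
    coeffs = np.zeros(len(knots))
    # Index of the knot interval [knots[j], knots[j + 1]] that contains f.
    j = int(np.searchsorted(knots, f, side="right")) - 1
    j = min(j, len(knots) - 2)
    t = (f - knots[j]) / (knots[j + 1] - knots[j])
    coeffs[j] = 1.0 - t
    coeffs[j + 1] = t
    return coeffs

def expand_feature(f, knots):
    """Map one continuous feature f to len(knots) features a_k(f) * f."""
    return hat_basis(f, knots) * f

# Hypothetical knots and per-knot weights lambda_k (illustration only).
knots = np.array([0.0, 1.0, 2.0, 4.0])
knot_weights = np.array([0.5, -1.0, 2.0, 0.0])

f = 1.5
continuous_weight = np.interp(f, knots, knot_weights)   # lambda(f)
score_direct = continuous_weight * f                     # lambda(f) * f
score_expanded = float(knot_weights @ expand_feature(f, knots))
```

Both scores agree, so training can treat `knot_weights` as ordinary constant MaxEnt weights over the higher-dimensional expanded feature vector.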

72.

Patent application
AUDIO TRANSFORMS IN CONNECTION WITH MULTIPARTY COMMUNICATION
Legal status: Active

Publication number: US20100195812A1

Publication date: 2010-08-05

Application number: US12365949

Filing date: 2009-02-05

Applicants: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang

Inventors: Dinei A. Florencio, Alejandro Acero, William Buxton, Phillip A. Chou, Ross G. Cutler, Jason Garms, Christian Huitema, Kori M. Quinn, Daniel Allen Rosenfeld, Zhengyou Zhang

IPC classification: H04M3/42, G10L11/00

CPC classification: H04M3/56, G10L2021/0135, H04M3/565

Abstract: The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence.

73.

Granted patent
Method of pattern recognition using noise reduction uncertainty
Legal status: Active

Publication number: US07769582B2

Publication date: 2010-08-03

Application number: US12180260

Filing date: 2008-07-25

Applicants: James G. Droppo, Alejandro Acero, Li Deng

Inventors: James G. Droppo, Alejandro Acero, Li Deng

IPC classification: G10L15/20, G10L21/02, G10L15/14

CPC classification: G10L21/0208, G10L15/20

Abstract: A method and apparatus are provided for using the uncertainty of a noise-removal process during pattern recognition. In particular, noise is removed from a representation of a portion of a noisy signal to produce a representation of a cleaned signal. In the meantime, an uncertainty associated with the noise removal is computed and is used with the representation of the cleaned signal to modify a probability for a phonetic state in the recognition system. In particular embodiments, the uncertainty is used to modify a probability distribution, by increasing the variance in each Gaussian distribution by an amount equal to the estimated variance of the cleaned signal, which is used in decoding the phonetic state sequence in a pattern recognition task.

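The variance adjustment described in the abstract of record 73 can be sketched as follows. This is a one-dimensional illustration under stated assumptions, not the patented implementation: the function names and the numeric values are hypothetical, and a real recognizer would apply the same idea per dimension and per mixture component.

```python
import math

def gaussian_log_pdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def state_log_likelihood(cleaned, cleaned_var, mean, var):
    """Score a cleaned feature against a state's Gaussian.

    The model variance is increased by the estimated variance of the
    cleaned signal, so unreliable estimates are penalized less sharply.
    """
    return gaussian_log_pdf(cleaned, mean, var + cleaned_var)

# Hypothetical values: a cleaned feature far from the state mean.
confident = state_log_likelihood(cleaned=3.0, cleaned_var=0.0, mean=0.0, var=1.0)
uncertain = state_log_likelihood(cleaned=3.0, cleaned_var=4.0, mean=0.0, var=1.0)
```

With zero uncertainty the score reduces to the ordinary Gaussian likelihood; with a large uncertainty the mismatch between the cleaned feature and the state mean costs less.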

74.

Granted patent
Updating hidden conditional random field model parameters after processing individual training samples
Legal status: Active

Publication number: US07689419B2

Publication date: 2010-03-30

Application number: US11233148

Filing date: 2005-09-22

Applicants: Milind V. Mahajan, Alejandro Acero, Asela J. Gunawardana, John C. Platt

Inventors: Milind V. Mahajan, Alejandro Acero, Asela J. Gunawardana, John C. Platt

IPC classification: G10L15/00

CPC classification: G10L15/063

Abstract: A method and apparatus are provided for training parameters in a hidden conditional random field model for use in speech recognition and phonetic classification. The hidden conditional random field model uses parameterized features that are determined from a segment of speech, and those values are used to identify a phonetic unit for the segment of speech. The parameters are updated after processing of individual training samples.

75.

Patent application
PHASE SENSITIVE MODEL ADAPTATION FOR NOISY SPEECH RECOGNITION
Legal status: Active

Publication number: US20100076758A1

Publication date: 2010-03-25

Application number: US12236530

Filing date: 2008-09-24

Applicants: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero

Inventors: Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alejandro Acero

IPC classification: G10L15/20, G10L15/14

CPC classification: G10L15/065, G10L15/20

Abstract: A speech recognition system described herein includes a receiver component that receives a distorted speech utterance. The speech recognition system also includes an updater component that is in communication with a first model and a second model, wherein the updater component automatically updates parameters of the second model based at least in part upon joint estimates of additive and convolutive distortions output by the first model, wherein the joint estimates of additive and convolutive distortions are estimates of distortions based on a phase-sensitive model in the speech utterance received by the receiver component. Further, distortions other than additive and convolutive distortions, including other stationary and nonstationary sources, can also be estimated and used to update the parameters of the second model.

76.

Patent application
PARAMETER CLUSTERING AND SHARING FOR VARIABLE-PARAMETER HIDDEN MARKOV MODELS
Legal status: Active

Publication number: US20100070280A1

Publication date: 2010-03-18

Application number: US12211115

Filing date: 2008-09-16

Applicants: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

Inventors: Dong Yu, Li Deng, Yifan Gong, Alejandro Acero

IPC classification: G10L15/14

CPC classification: G10L15/142

Abstract: A speech recognition system uses Gaussian mixture variable-parameter hidden Markov models (VPHMMs) to recognize speech. The VPHMMs include Gaussian parameters that vary as a function of at least one environmental conditioning parameter. The relationship of each Gaussian parameter to the environmental conditioning parameter(s) is modeled using a piecewise fitting approach, such as by using spline functions. In a training phase, the recognition system can use clustering to identify classes of spline functions, each class grouping together spline functions which are similar to each other based on some distance measure. The recognition system can then store sets of spline parameters that represent respective classes of spline functions. An instance of a spline function that belongs to a class can make reference to an associated shared set of spline parameters. The Gaussian parameters can be represented in an efficient form that accommodates the use of sharing in the above-summarized manner.

77.

Granted patent
Multi-sensory speech enhancement using a speech-state model
Legal status: Active

Publication number: US07680656B2

Publication date: 2010-03-16

Application number: US11168770

Filing date: 2005-06-28

Applicants: Zhengyou Zhang, Zicheng Liu, Alejandro Acero, Amarnag Subramanya, James G. Droppo

Inventors: Zhengyou Zhang, Zicheng Liu, Alejandro Acero, Amarnag Subramanya, James G. Droppo

IPC classification: G10L15/00, G10L15/20

CPC classification: G10L21/0208, G10L2021/02165

Abstract: A method and apparatus determine a likelihood of a speech state based on an alternative sensor signal and an air conduction microphone signal. The likelihood of the speech state is used, together with the alternative sensor signal and the air conduction microphone signal, to estimate a clean speech value for a clean speech signal.

78.

Granted patent
Method of automatically ranking speech dialog states and transitions to aid in performance analysis in speech applications
Legal status: Active

Publication number: US07643995B2

Publication date: 2010-01-05

Application number: US11054096

Filing date: 2005-02-09

Applicants: Alejandro Acero, Dong Yu

Inventors: Alejandro Acero, Dong Yu

IPC classification: G10L15/00

CPC classification: G10L15/01, G10L15/083

Abstract: A method of identifying problems in a speech recognition application is provided and includes the step of obtaining a speech application call log containing log data on question-answer (QA) states and transitions. Then, in accordance with the method, for each of multiple transitions between states, a parameter is generated which is indicative of a gain in a success rate of the speech recognition application if all calls passing through the transition passed instead through other transitions. In exemplary embodiments, the parameter is an Arc Cut Gain in Success Rate (ACGSR) parameter. Methods of generating the ACGSR, as well as systems and tools for aiding developers, are also disclosed.

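The ranking idea in the abstract of record 78 can be sketched in a few lines. The abstract does not give the ACGSR formula, so this is a rough illustration under an explicit assumption: calls rerouted away from a transition are assumed to succeed at the same rate as calls that already avoid it. The log format, the state names, and the functions `arc_cut_gains` and `rank_arcs` are all hypothetical.

```python
from collections import defaultdict

def arc_cut_gains(calls):
    """Estimate the success-rate gain from cutting each transition (arc).

    calls: list of (transitions, succeeded), where transitions is an
    iterable of (source_state, destination_state) pairs for one call.
    Assumption: rerouted calls succeed as often as calls avoiding the arc.
    """
    total = len(calls)
    if total == 0:
        return {}
    total_ok = sum(1 for _, ok in calls if ok)
    overall = total_ok / total
    through = defaultdict(lambda: [0, 0])  # arc -> [calls, successes]
    for transitions, ok in calls:
        for arc in set(transitions):       # count each call once per arc
            through[arc][0] += 1
            through[arc][1] += int(ok)
    gains = {}
    for arc, (n, s) in through.items():
        rest = total - n
        if rest == 0:
            continue  # every call uses this arc; nothing to compare against
        gains[arc] = (total_ok - s) / rest - overall
    return gains

def rank_arcs(calls):
    """Arcs ordered from largest to smallest estimated gain."""
    gains = arc_cut_gains(calls)
    return sorted(gains, key=gains.get, reverse=True)

# Hypothetical call log with two successful and two failed calls.
calls = [
    ([("start", "city"), ("city", "confirm")], True),
    ([("start", "city"), ("city", "confirm")], True),
    ([("start", "city"), ("city", "retry"), ("retry", "confirm")], False),
    ([("start", "city"), ("city", "retry")], False),
]
ranking = rank_arcs(calls)
```

In this toy log the transition into the `retry` state ranks first, which is the kind of hint a developer would use to decide where to look for a dialog design problem.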

79.

Granted patent
Method and apparatus for indexing speech
Legal status: Active

Publication number: US07634407B2

Publication date: 2009-12-15

Application number: US11133515

Filing date: 2005-05-20

Applicants: Ciprian I. Chelba, Alejandro Acero

Inventors: Ciprian I. Chelba, Alejandro Acero

IPC classification: G10L15/00

CPC classification: G10L15/26

Abstract: A method of indexing a speech segment includes identifying at least two alternative word sequences based on the speech segment. For each word in the alternative sequences, information is placed in an entry for the word in the index. The information indicates the position of the word in at least one of the alternative sequences.

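The indexing scheme in the abstract of record 79 resembles a positional inverted index built over several recognition hypotheses. The following is a minimal sketch under that reading; the segment ids, the example hypotheses, and the functions `build_index` and `find_phrase` are hypothetical, and any scores a real system might store alongside positions are left out.

```python
from collections import defaultdict

def build_index(segments):
    """Build word -> set of (segment_id, position) entries.

    segments: dict mapping a segment id to a list of alternative word
    sequences (for example, several recognition hypotheses).
    """
    index = defaultdict(set)
    for seg_id, alternatives in segments.items():
        for words in alternatives:
            for pos, word in enumerate(words):
                index[word].add((seg_id, pos))
    return index

def find_phrase(index, phrase):
    """Segment ids in which the phrase words occupy consecutive positions."""
    if not phrase:
        return set()
    hits = set()
    for seg_id, pos in index.get(phrase[0], ()):
        if all((seg_id, pos + k) in index.get(word, ())
               for k, word in enumerate(phrase)):
            hits.add(seg_id)
    return hits

# Hypothetical segments, each with its alternative word sequences.
segments = {
    "seg1": [["recognize", "speech"], ["wreck", "a", "nice", "beach"]],
    "seg2": [["nice", "speech"]],
}
index = build_index(segments)
matches = find_phrase(index, ["nice", "beach"])
```

Because entries from all alternatives share one position space, a phrase query can match words drawn from different hypotheses of the same segment, which trades some precision for recall.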

80.

Granted patent
Hidden conditional random field models for phonetic classification and speech recognition
Legal status: Active

Publication number: US07627473B2

Publication date: 2009-12-01

Application number: US10966047

Filing date: 2004-10-15

Applicants: Asela J. Gunawardana, Milind Mahajan, Alejandro Acero

Inventors: Asela J. Gunawardana, Milind Mahajan, Alejandro Acero

IPC classification: G10L17/00, G10L15/14

CPC classification: G10L15/14

Abstract: A method and apparatus are provided for training and using a hidden conditional random field model for speech recognition and phonetic classification. The hidden conditional random field model uses feature functions, at least one of which is based on a hidden state in a phonetic unit. Values for the feature functions are determined from a segment of speech, and these values are used to identify a phonetic unit for the segment of speech.
