Patent search: ap:("Milind Mahajan" OR "Patrick Nguyen" OR "Alejandro Acero") AND inv:"Milind Mahajan", page 1

1.

Granted invention patent
Using automated content analysis for audio/video content consumption (legal status: in force)

Publication number: US07640272B2

Publication date: 2009-12-29

Application number: US11635153

Filing date: 2006-12-07

Applicants: Milind Mahajan, Patrick Nguyen, Alejandro Acero

Inventors: Milind Mahajan, Patrick Nguyen, Alejandro Acero

IPC classification: G06F7/00, G06F17/00

CPC classification: G10L15/18, G06F17/30787, G06F17/30796, G06F17/30852, Y10S707/99945, Y10S707/99948

Abstract: Audio/video (A/V) content is analyzed using speech and language analysis components. Metadata is automatically generated based upon the analysis. The metadata is used in generating user interface interaction components which allow a user to view subject matter in various segments of the A/V content and to interact with the A/V content based on the automatically generated metadata.

2.

Invention patent application
Using automated content analysis for audio/video content consumption (legal status: in force)

Publication number: US20080140385A1

Publication date: 2008-06-12

Application number: US11635153

Filing date: 2006-12-07

Applicants: Milind Mahajan, Patrick Nguyen, Alejandro Acero

Inventors: Milind Mahajan, Patrick Nguyen, Alejandro Acero

IPC classification: G06F17/27, G10L15/00

CPC classification: G10L15/18, G06F17/30787, G06F17/30796, G06F17/30852, Y10S707/99945, Y10S707/99948

Abstract: Audio/video (A/V) content is analyzed using speech and language analysis components. Metadata is automatically generated based upon the analysis. The metadata is used in generating user interface interaction components which allow a user to view subject matter in various segments of the A/V content and to interact with the A/V content based on the automatically generated metadata.

3.

Granted invention patent
Hidden conditional random field models for phonetic classification and speech recognition (legal status: in force)

Publication number: US07627473B2

Publication date: 2009-12-01

Application number: US10966047

Filing date: 2004-10-15

Applicants: Asela J. Gunawardana, Milind Mahajan, Alejandro Acero

Inventors: Asela J. Gunawardana, Milind Mahajan, Alejandro Acero

IPC classification: G10L17/00, G10L15/14

CPC classification: G10L15/14

Abstract: A method and apparatus are provided for training and using a hidden conditional random field model for speech recognition and phonetic classification. The hidden conditional random field model uses feature functions, at least one of which is based on a hidden state in a phonetic unit. Values for the feature functions are determined from a segment of speech, and these values are used to identify a phonetic unit for the segment of speech.

4.

Granted invention patent
Language model adaptation using semantic supervision (legal status: lapsed)

Publication number: US07478038B2

Publication date: 2009-01-13

Application number: US10814906

Filing date: 2004-03-31

Applicants: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam

Inventors: Ciprian Chelba, Milind Mahajan, Alejandro Acero, Yik-Cheung Tam

IPC classification: G06F17/21

CPC classification: G06F17/27, G10L15/1815

Abstract: A method and apparatus are provided for adapting a language model. The method and apparatus provide supervised class-based adaptation of the language model utilizing in-domain semantic information.

5.

Granted invention patent
Method and apparatus for predicting word error rates from text (legal status: in force)

Publication number: US07103544B2

Publication date: 2006-09-05

Application number: US11146324

Filing date: 2005-06-06

Applicants: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba

Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba

IPC classification: G10L15/06, G10L15/14

CPC classification: G10L15/197, G10L15/183

Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.

6.

Invention patent application
Hidden conditional random field models for phonetic classification and speech recognition (legal status: in force)

Publication number: US20060085190A1

Publication date: 2006-04-20

Application number: US10966047

Filing date: 2004-10-15

Applicants: Asela Gunawardana, Milind Mahajan, Alejandro Acero

Inventors: Asela Gunawardana, Milind Mahajan, Alejandro Acero

IPC classification: G10L15/14

CPC classification: G10L15/14

Abstract: A method and apparatus are provided for training and using a hidden conditional random field model for speech recognition and phonetic classification. The hidden conditional random field model uses features, at least one of which is based on a hidden state in a phonetic unit. Values for the features are determined from a segment of speech, and these values are used to identify a phonetic unit for the segment of speech.

7.

Granted invention patent
Method for entering text (legal status: lapsed)

Publication number: US07363224B2

Publication date: 2008-04-22

Application number: US10748404

Filing date: 2003-12-30

Applicants: Xuendong D. Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan

Inventors: Xuendong D. Huang, Alejandro Acero, Kuansan Wang, Milind Mahajan

IPC classification: G10L15/04

CPC classification: G06F3/0237, G06F3/16, G10L15/1815, G10L15/22, G10L2015/228

Abstract: In a method of entering text into a device a first character input is provided that is indicative of a first character of a text entry. Next, a vocalization of the text entry is captured. A probable word candidate is then identified for a first word of the vocalization based upon the first character input and an analysis of the vocalization. Finally, the probable word candidate is displayed for a user.

8.

Granted invention patent
Method and apparatus for predicting word error rates from text (legal status: in force)

Publication number: US07117153B2

Publication date: 2006-10-03

Application number: US10365850

Filing date: 2003-02-13

Applicants: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba

Inventors: Milind Mahajan, Yonggang Deng, Alejandro Acero, Asela J. R. Gunawardana, Ciprian Chelba

IPC classification: G10L15/06, G10L15/14

CPC classification: G10L15/197, G10L15/183

Abstract: A method of modeling a speech recognition system includes decoding a speech signal produced from a training text to produce a sequence of predicted speech units. The training text comprises a sequence of actual speech units that is used with the sequence of predicted speech units to form a confusion model. In further embodiments, the confusion model is used to decode a text to identify an error rate that would be expected if the speech recognition system decoded speech based on the text.

9.

Granted invention patent
System and method for identifying semantic intent from acoustic information (legal status: in force)

Publication number: US07634406B2

Publication date: 2009-12-15

Application number: US11009630

Filing date: 2004-12-10

Applicants: Xiao Li, Asela J. Gunawardana, Alejandro Acero, Milind Mahajan, Dong Yu

Inventors: Xiao Li, Asela J. Gunawardana, Alejandro Acero, Milind Mahajan, Dong Yu

IPC classification: G10L15/06

CPC classification: G10L15/19, G10L15/1815

Abstract: In accordance with one embodiment of the present invention, unanticipated semantic intents are discovered in audio data in an unsupervised manner. For instance, the audio acoustics are clustered based on semantic intent and representative acoustics are chosen for each cluster. The human then need only listen to a small number of representative acoustics for each cluster (and possibly only one per cluster) in order to identify the unforeseen semantic intents.

10.

Invention patent application
DISCRIMINATIVE TRAINING OF LANGUAGE MODELS FOR TEXT AND SPEECH CLASSIFICATION (legal status: in force)

Publication number: US20080215311A1

Publication date: 2008-09-04

Application number: US12103035

Filing date: 2008-04-15

Applicants: Ciprian Chelba, Alejandro Acero, Milind Mahajan

Inventors: Ciprian Chelba, Alejandro Acero, Milind Mahajan

IPC classification: G06F17/27

CPC classification: G06F17/2715, G06F17/2818, G10L15/183, G10L15/197

Abstract: Methods are disclosed for estimating language models such that the conditional likelihood of a class given a word string, which is very well correlated with classification accuracy, is maximized. The methods comprise tuning statistical language model parameters jointly for all classes such that a classifier discriminates between the correct class and the incorrect ones for a given training sentence or utterance. Specific embodiments of the present invention pertain to implementation of the rational function growth transform in the context of a discriminative training technique for n-gram classifiers.
