Patent search: ap:("Biing-Hwang Juang" OR "Chin-Hui Lee" OR "Aaron Edward Rosenberg" OR "Frank Kao-Ping Soong") AND inv:"Aaron Edward Rosenberg" (page 1)

1.

Granted invention patent
Speaker verification with cohort normalized scoring (status: expired)

Publication number: US5675704A

Publication date: 1997-10-07

Application number: US638401

Filing date: 1996-04-26

Applicants: Biing-Hwang Juang, Chin-Hui Lee, Aaron Edward Rosenberg, Frank Kao-Ping Soong

Inventors: Biing-Hwang Juang, Chin-Hui Lee, Aaron Edward Rosenberg, Frank Kao-Ping Soong

IPC classification: G10L15/10, G10L15/00, G10L15/14, G10L17/00, H04M3/38, H04M3/42, H04M15/00, G10L5/06

CPC classification: G10L17/00, G10L17/12, H04M3/382, H04M3/42204, G10L15/142, H04M15/00, H04M2201/40

Abstract: A facility is provided for allowing a caller to place a telephone call by merely uttering a label identifying a desired called destination and to charge the telephone call to a particular billing account by merely uttering a label identifying that account. Alternatively, the caller may place the call by dialing or uttering the telephone number of the called destination or by entering a speed dial code associated with that telephone number. The facility includes a speaker verification system which employs cohort normalized scoring. Cohort normalized scoring provides a dynamic threshold for the verification process, making the process more robust to variation in training and verification utterances. Such variation may be caused by, e.g., changes in communication channel characteristics or speaker loudness level.
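
The cohort normalized scoring idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each speaker model is a toy diagonal Gaussian, and that normalization subtracts the mean cohort log-likelihood from the claimed speaker's log-likelihood; the names `GaussianModel`, `cohort_normalized_score`, and `verify`, and the default threshold of 0.0, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class GaussianModel:
    """Toy speaker model: a single diagonal Gaussian (an assumption for illustration)."""
    mean: Sequence[float]
    var: Sequence[float]

    def log_likelihood(self, frames: Sequence[Sequence[float]]) -> float:
        """Average per-frame log-likelihood of the utterance under this model."""
        total = 0.0
        for frame in frames:
            for x, m, v in zip(frame, self.mean, self.var):
                total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        return total / len(frames)


def cohort_normalized_score(
    frames: Sequence[Sequence[float]],
    claimed: GaussianModel,
    cohort: List[GaussianModel],
) -> float:
    """Claimed-speaker log-likelihood minus the mean cohort log-likelihood."""
    claimed_ll = claimed.log_likelihood(frames)
    cohort_ll = sum(m.log_likelihood(frames) for m in cohort) / len(cohort)
    return claimed_ll - cohort_ll


def verify(frames, claimed, cohort, threshold: float = 0.0) -> bool:
    """Accept the identity claim when the normalized score exceeds the threshold."""
    return cohort_normalized_score(frames, claimed, cohort) > threshold
```

Because the cohort term is computed on the same utterance, a condition that lowers all likelihoods together (for example a channel change) tends to shift both terms, which is one way to read the "dynamic threshold" described above.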

2.

Granted invention patent
System and method for providing a compensated speech recognition model for speech recognition (status: in force)

Publication number: US07996220B2

Publication date: 2011-08-09

Application number: US12264700

Filing date: 2008-11-04

Applicants: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan

Inventors: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan

IPC classification: G10L15/00

CPC classification: G10L15/04, G10L15/063, G10L15/08, G10L15/197, G10L15/22, G10L2015/0631

Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a communications device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic, and for updating an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that is based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.

3.

Granted invention patent
Unsupervised speaker segmentation of multi-speaker speech data (status: in force)

Publication number: US07930179B1

Publication date: 2011-04-19

Application number: US11866125

Filing date: 2007-10-02

Applicants: Allen Louis Gorin, Zhu Liu, Sarangarajan Parthasarathy, Aaron Edward Rosenberg

Inventors: Allen Louis Gorin, Zhu Liu, Sarangarajan Parthasarathy, Aaron Edward Rosenberg

IPC classification: G10L17/00

CPC classification: G10L17/12

Abstract: Systems and methods for unsupervised segmentation of multi-speaker speech or audio data by speaker. A front-end analysis is applied to input speech data to obtain feature vectors. The speech data is initially segmented and then clustered into groups of segments that correspond to different speakers. The clusters are iteratively modeled and resegmented to obtain stable speaker segmentations. The overlap between segmentation sets is checked to ensure successful speaker segmentation. Overlapping segments are combined and remodeled and resegmented. Optionally, the speech data is processed to produce a segmentation lattice to maximize the overall segmentation likelihood.
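
The "segment, cluster, then iteratively model and resegment" loop in this abstract can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the patented method: features are one-dimensional, the initial segmentation is fixed-length chunking, exactly two speakers are assumed, and each cluster "model" is just the mean of its segments. The overlap check and the segmentation lattice are not covered. The function name `segment_by_speaker` and its parameters are hypothetical.

```python
from typing import List, Sequence


def segment_by_speaker(
    features: Sequence[float], seg_len: int = 2, max_iter: int = 20
) -> List[int]:
    """Return one speaker label (0 or 1) per fixed-length segment."""
    # Initial segmentation: cut the feature stream into fixed-length chunks.
    segments = [features[i:i + seg_len] for i in range(0, len(features), seg_len)]
    seg_means = [sum(s) / len(s) for s in segments]

    # Initial clustering: seed two cluster models at the extreme segment means.
    centers = [min(seg_means), max(seg_means)]
    labels = [-1] * len(segments)

    for _ in range(max_iter):
        # Resegment: assign each segment to the closest cluster model.
        new_labels = [
            min(range(2), key=lambda k: abs(m - centers[k])) for m in seg_means
        ]
        if new_labels == labels:
            break  # segmentation is stable
        labels = new_labels
        # Remodel: re-estimate each cluster model from its current members.
        for k in range(2):
            members = [m for m, lab in zip(seg_means, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels
```

The loop stops as soon as the labels no longer change between iterations, which corresponds to the "stable speaker segmentations" the abstract refers to.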

4.

Granted invention patent
System and method of performing speech recognition based on a user identifier (status: in force)

Publication number: US07451081B1

Publication date: 2008-11-11

Application number: US11685456

Filing date: 2007-03-13

Applicants: Bojana Gajic, Shrikanth Sambasivan Narayanan, Sarangarajan Parthasarathy, Richard Cameron Rose, Aaron Edward Rosenberg

Inventors: Bojana Gajic, Shrikanth Sambasivan Narayanan, Sarangarajan Parthasarathy, Richard Cameron Rose, Aaron Edward Rosenberg

IPC classification: G10L15/00

CPC classification: G10L15/07, G10L15/20

Abstract: Speech recognition models are dynamically re-configurable based on user information, application information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. Word recognition lattices are generated for each data field of an application and dynamically concatenated into a single word recognition lattice. A language model is applied to the concatenated word recognition lattice to determine the relationships between the word recognition lattices and repeated until the generated word recognition lattices are acceptable or differ from a predetermined value only by a threshold amount. These techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants, as well as environments such as office, home or vehicle, while maintaining the accuracy of the speech recognition.

5.

Granted invention patent
System and method for indexing voice mail messages by speaker (status: in force)

Publication number: US08346539B2

Publication date: 2013-01-01

Application number: US12648909

Filing date: 2009-12-29

Applicants: Julia Hirschberg, Sarangarajan Parthasarathy, Aaron Edward Rosenberg, Stephen Whittaker

Inventors: Julia Hirschberg, Sarangarajan Parthasarathy, Aaron Edward Rosenberg, Stephen Whittaker

IPC classification: G06F17/27

CPC classification: H04M3/533, G10L17/00, G10L17/04

Abstract: The invention provides a system and method for indexing and organizing voice mail messages by the speaker of the message. One or more speaker models are created from voice mail messages received. As additional messages are left, each of the new messages is compared with existing speaker models to determine the identity of the caller of each new message. The voice mail messages are organized within a user's mailbox by caller. Unknown callers may be identified and tagged by the user and then used to create new speaker models and/or update existing speaker models.
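
The "compare each new message with existing speaker models" step in this abstract can be sketched as a nearest-model lookup with a rejection threshold. This is a simplified illustration, not the patented system: it assumes each message and each speaker model has already been reduced to a fixed-length feature vector, and that Euclidean distance is an adequate similarity measure. The names `identify_caller`, `index_messages`, and `max_distance` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def identify_caller(
    message: Vector, models: Dict[str, Vector], max_distance: float
) -> Optional[str]:
    """Return the closest known caller, or None if no model is close enough."""
    if not models:
        return None
    best = min(models, key=lambda name: distance(message, models[name]))
    return best if distance(message, models[best]) <= max_distance else None


def index_messages(
    messages: List[Vector], models: Dict[str, Vector], max_distance: float = 1.0
) -> Dict[str, List[int]]:
    """Group message indices by identified caller; unmatched ones go to 'unknown'."""
    mailbox: Dict[str, List[int]] = {}
    for i, msg in enumerate(messages):
        caller = identify_caller(msg, models, max_distance) or "unknown"
        mailbox.setdefault(caller, []).append(i)
    return mailbox
```

Messages filed under "unknown" are the ones a user would tag by hand; a tagged message could then seed a new entry in `models`, which mirrors the model-creation step the abstract describes.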

6.

Invention patent application
SYSTEM AND METHOD FOR PROVIDING A COMPENSATED SPEECH RECOGNITION MODEL FOR SPEECH RECOGNITION (status: in force)

Publication number: US20090063144A1

Publication date: 2009-03-05

Application number: US12264700

Filing date: 2008-11-04

Applicants: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan

Inventors: Richard C. Rose, Sarangarajan Pathasarathy, Aaron Edward Rosenberg, Shrikanth Sambasivan Narayanan

IPC classification: G10L15/00

CPC classification: G10L15/04, G10L15/063, G10L15/08, G10L15/197, G10L15/22, G10L2015/0631

Abstract: An automatic speech recognition (ASR) system and method is provided for controlling the recognition of speech utterances generated by an end user operating a communications device. The ASR system and method can be used with a communications device that is used in a communications network. The ASR system can be used for ASR of speech utterances input into a mobile device, to perform compensating techniques using at least one characteristic, and for updating an ASR speech recognizer associated with the ASR system by determining and using a background noise value and a distortion value that is based on the features of the mobile device. The ASR system can be used to augment a limited data input capability of a mobile device, for example, caused by limited input devices physically located on the mobile device.

7.

Granted invention patent
Unsupervised speaker segmentation of multi-speaker speech data (status: in force)

Publication number: US07295970B1

Publication date: 2007-11-13

Application number: US10350727

Filing date: 2003-01-24

Applicants: Allen Louis Gorin, Zhu Liu, Sarangarajan Parthasarathy, Aaron Edward Rosenberg

Inventors: Allen Louis Gorin, Zhu Liu, Sarangarajan Parthasarathy, Aaron Edward Rosenberg

IPC classification: G10L19/12

CPC classification: G10L17/12

Abstract: Systems and methods for unsupervised segmentation of multi-speaker speech or audio data by speaker. A front-end analysis is applied to input speech data to obtain feature vectors. The speech data is initially segmented and then clustered into groups of segments that correspond to different speakers. The clusters are iteratively modeled and resegmented to obtain stable speaker segmentations. The overlap between segmentation sets is checked to ensure successful speaker segmentation. Overlapping segments are combined and remodeled and resegmented. Optionally, the speech data is processed to produce a segmentation lattice to maximize the overall segmentation likelihood.

8.

Invention patent application
SYSTEM AND METHOD OF PERFORMING USER-SPECIFIC AUTOMATIC SPEECH RECOGNITION (status: in force)

Publication number: US20120185237A1

Publication date: 2012-07-19

Application number: US13429946

Filing date: 2012-03-26

Applicants: Bojana GAJIC, Shrikanth Sambasivan Narayanan, Sarangarajan Parthasarathy, Richard Cameron Rose, Aaron Edward Rosenberg

Inventors: Bojana GAJIC, Shrikanth Sambasivan Narayanan, Sarangarajan Parthasarathy, Richard Cameron Rose, Aaron Edward Rosenberg

IPC classification: G06F17/20, G10L17/00

CPC classification: G10L15/07, G10L15/20

Abstract: Speech recognition models are dynamically re-configurable based on user information, application information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. Word recognition lattices are generated for each data field of an application and dynamically concatenated into a single word recognition lattice. A language model is applied to the concatenated word recognition lattice to determine the relationships between the word recognition lattices and repeated until the generated word recognition lattices are acceptable or differ from a predetermined value only by a threshold amount. These techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants, as well as environments such as office, home or vehicle, while maintaining the accuracy of the speech recognition.

9.

Granted invention patent
System and method for automated multimedia content indexing and retrieval (status: in force)

Publication number: US08131552B1

Publication date: 2012-03-06

Application number: US11623955

Filing date: 2007-01-17

Applicants: David Crawford Gibbon, Qian Huang, Zhu Liu, Aaron Edward Rosenberg, Behzad Shahraray

Inventors: David Crawford Gibbon, Qian Huang, Zhu Liu, Aaron Edward Rosenberg, Behzad Shahraray

IPC classification: G06F19/26, G10L17/00

CPC classification: G06F17/30787, G06F17/30796, G06F17/30843, G10L17/00, Y10S707/99933, Y10S707/99943

Abstract: The invention provides a system and method for automatically indexing and retrieving multimedia content. The method may include separating a multimedia data stream into audio, visual and text components, segmenting the audio, visual and text components based on semantic differences, identifying at least one target speaker using the audio and visual components, identifying a topic of the multimedia event using the segmented text and topic category models, generating a summary of the multimedia event based on the audio, visual and text components, the identified topic and the identified target speaker, and generating a multimedia description of the multimedia event based on the identified target speaker, the identified topic, and the generated summary.

10.

Granted invention patent
System and method for indexing voice mail messages by speaker (status: in force)

Publication number: US07664636B1

Publication date: 2010-02-16

Application number: US09550686

Filing date: 2000-04-17

Applicants: Julia Hirschberg, Sarangarajan Parthasarathy, Aaron Edward Rosenberg, Stephen Whittaker

Inventors: Julia Hirschberg, Sarangarajan Parthasarathy, Aaron Edward Rosenberg, Stephen Whittaker

IPC classification: G10L15/00

CPC classification: H04M3/533, G10L17/00, G10L17/04

Abstract: The invention provides a system and method for indexing and organizing voice mail messages by the speaker of the message. One or more speaker models are created from voice mail messages received. As additional messages are left, each of the new messages is compared with existing speaker models to determine the identity of the caller of each new message. The voice mail messages are organized within a user's mailbox by caller. Unknown callers may be identified and tagged by the user and then used to create new speaker models and/or update existing speaker models.
