Patent search: ap:("Harinath Garudadri" OR "Sunil Sivadas" OR "Hynek Hermansky" OR "Nelson H. Morgan" OR "Charles C. Wooters" OR "Andre Gustavo Adami" OR "Maria Carmen Benitez Ortuzar" OR "Lukas Burget" OR "Stephane N. Dupont" OR "Frantisek Grezl" OR "Pratibha Jain" OR "Sachin Kajarekar" OR "Petr Motlicek") AND inv:"Sachin Kajarekar" (page 1)

1.

Granted patent
Multistream network feature processing for a distributed speech recognition system [In force]

Publication No.: US07089178B2

Publication date: 2006-08-08

Application No.: US10137633

Filing date: 2002-04-30

Applicants: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek

Inventors: Harinath Garudadri, Sunil Sivadas, Hynek Hermansky, Nelson H. Morgan, Charles C. Wooters, Andre Gustavo Adami, Maria Carmen Benitez Ortuzar, Lukas Burget, Stephane N. Dupont, Frantisek Grezl, Pratibha Jain, Sachin Kajarekar, Petr Motlicek

IPC classification: G10L15/02, G10L15/04, G10L15/16

CPC classification: G10L15/32, G10L15/02, G10L15/30

Abstract: A distributed voice recognition system and method for obtaining acoustic features and speech activity at multiple frequencies by extracting high frequency components thereof on a device, such as a subscriber station and transmitting them to a network server having multiple stream processing capability, including cepstral feature processing, MLP nonlinear transformation processing, and multiband temporal pattern architecture processing. The features received at the network server are processed using all three streams, wherein each of the three streams provides benefits not available in the other two, thereby enhancing feature interpretation. Feature extraction and feature interpretation may operate at multiple frequencies, including but not limited to 8 kHz, 11 kHz, and 16 kHz.


2.

Granted patent
System and method for distributing meeting recordings in a network environment [In force]

Publication No.: US08902274B2

Publication date: 2014-12-02

Application No.: US13693848

Filing date: 2012-12-04

Applicants: Ashutosh A. Malegaonkar, Paul Quinn, Sachin Kajarekar

Inventors: Ashutosh A. Malegaonkar, Paul Quinn, Sachin Kajarekar

IPC classification: H04N7/14

CPC classification: H04N7/147, H04L67/22, H04N7/155

Abstract: A method is provided and includes discovering active participants and passive participants from a meeting recording, generating an active notification that includes an option to manipulate the meeting recording, and a passive notification without the option to manipulate the meeting recording, and sending the active notification and the passive notification to the active participants and the passive participants, respectively. The method can also include discovering followers from the meeting recording, generating a followers notification without the option to manipulate the meeting recording, and which includes access to a portion of meeting recording, and sending the followers notification to the followers. Discovering the active participants and the passive participants includes running speaker segmentation and recognition algorithms on the meeting recording, discovering attendees including speakers and non-speakers, and categorizing the speakers as the active participants, and the non-speakers as the passive participants.
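
The participant categorization step in this abstract can be sketched roughly as follows. This is an illustration only: the function names, the dictionary layout of a notification, and the assumption that a speaker segmentation and recognition stage has already produced a list of recognized speakers are all hypothetical and are not taken from the patent.

```python
def categorize_attendees(attendees, recognized_speakers):
    """Split attendees into active (spoke) and passive (did not speak).

    `recognized_speakers` stands in for the output of a speaker
    segmentation and recognition stage, which is not implemented here.
    """
    speaker_set = set(recognized_speakers)
    active = [a for a in attendees if a in speaker_set]
    passive = [a for a in attendees if a not in speaker_set]
    return active, passive


def build_notifications(active, passive):
    """Build one notification per attendee (hypothetical record layout).

    Only active participants get the option to manipulate the recording.
    """
    notifications = {}
    for person in active:
        notifications[person] = {"kind": "active", "can_manipulate": True}
    for person in passive:
        notifications[person] = {"kind": "passive", "can_manipulate": False}
    return notifications


# Hypothetical attendee list and recognizer output.
active, passive = categorize_attendees(["ana", "bo", "chen"], ["bo"])
print(build_notifications(active, passive))
```

Keeping categorization separate from notification building means the recognizer output can be swapped without touching the notification logic.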


3.

Patent application
SYSTEM AND METHOD FOR IMPROVING SPEAKER SEGMENTATION AND RECOGNITION ACCURACY IN A MEDIA PROCESSING ENVIRONMENT [In force]

Publication No.: US20140074471A1

Publication date: 2014-03-13

Application No.: US13608420

Filing date: 2012-09-10

Applicants: Ananth Sankar, Sachin Kajarekar, Satish K. Gannu

Inventors: Ananth Sankar, Sachin Kajarekar, Satish K. Gannu

IPC classification: G10L17/00

CPC classification: G10L17/02

Abstract: A method is provided and includes estimating an approximate list of potential speakers in a file from one or more applications. The file (e.g., an audio file, video file, or any suitable combination thereof) includes a recording of a plurality of speakers. The method also includes segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and recognizing particular speakers in the file based on the approximate list of potential speakers.


4.

Patent application
METHOD AND APPARATUS FOR DISCOVERING AND LABELING SPEAKERS IN A LARGE AND GROWING COLLECTION OF VIDEOS WITH MINIMAL USER EFFORT [Pending: published]

Publication No.: US20130144414A1

Publication date: 2013-06-06

Application No.: US13312800

Filing date: 2011-12-06

Applicants: Sachin Kajarekar, Ananth Sankar, Sattish Gannu, Aparna Khare

Inventors: Sachin Kajarekar, Ananth Sankar, Sattish Gannu, Aparna Khare

IPC classification: G06F17/00

CPC classification: G10L17/02

Abstract: In one embodiment, an audio stream is partitioned into a plurality of segments such that the plurality of segments are clustered into one or more clusters, each of the one or more clusters identifying a subset of the plurality of segments in the audio stream and corresponding to one of a first set of one or more speaker models, each speaker model in the first set of speaker models representing one of a first set of hypothetical speakers. The speaker models in the first set of speaker models are compared with a second set of one or more speaker models, where each speaker model in the second set of speaker models represents one of a second set of hypothetical speakers. Labels associated with one or more speaker models in the second set of speaker models are propagated to one or more speaker models in the first set of speaker models according to a result of the comparing step.
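
The compare-and-propagate step in this abstract can be sketched as below. The abstract does not say how speaker models are represented or compared, so this sketch assumes each model is a plain numeric vector, uses cosine similarity as the comparison, and applies a hypothetical threshold of 0.9; every name here is illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def propagate_labels(first_set, second_set, labels, threshold=0.9):
    """Copy labels from second-set models to sufficiently similar first-set models.

    first_set, second_set: dict of model name -> vector (assumed representation).
    labels: dict of second-set model name -> label; unlabeled models are skipped.
    Returns a dict of first-set model name -> propagated label.
    """
    propagated = {}
    for name, vec in first_set.items():
        best_label, best_sim = None, threshold
        for other, other_vec in second_set.items():
            if other not in labels:
                continue
            sim = cosine(vec, other_vec)
            if sim >= best_sim:
                best_label, best_sim = labels[other], sim
        if best_label is not None:
            propagated[name] = best_label
    return propagated


# Hypothetical models: two new clusters, two stored models (one labeled).
first = {"cluster_0": [1.0, 0.0], "cluster_1": [0.0, 1.0]}
second = {"model_a": [0.9, 0.1], "model_b": [-1.0, 0.0]}
print(propagate_labels(first, second, {"model_a": "Alice"}))
```

First-set models with no close labeled match stay unlabeled, which leaves them available for manual labeling later.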


5.

Patent application
SYSTEM AND METHOD FOR JOINT SPEAKER AND SCENE RECOGNITION IN A VIDEO/AUDIO PROCESSING ENVIRONMENT [Pending: published]

Publication No.: US20130300939A1

Publication date: 2013-11-14

Application No.: US13469886

Filing date: 2012-05-11

Applicants: Jim Chen Chou, Sachin Kajarekar, Jason J. Catchpole, Ananth Sankar

Inventors: Jim Chen Chou, Sachin Kajarekar, Jason J. Catchpole, Ananth Sankar

IPC classification: H04N5/14

CPC classification: G06K9/00765, G06K9/6297, H04N7/147

Abstract: An example method is provided and includes receiving a media file that includes video data and audio data; determining an initial scene sequence in the media file; determining an initial speaker sequence in the media file; and updating a selected one of the initial scene sequence and the initial speaker sequence in order to generate an updated scene sequence and an updated speaker sequence respectively. The initial scene sequence is updated based on the initial speaker sequence, and wherein the initial speaker sequence is updated based on the initial scene sequence.


6.

Patent application
METHOD AND APPARATUS FOR SPEAKER RECOGNITION [Pending: published]

Publication No.: US20080010065A1

Publication date: 2008-01-10

Application No.: US11758650

Filing date: 2007-06-05

Applicants: Harry Bratt, Luciana Ferrer, Martin Graciarena, Sachin Kajarekar, Elizabeth Shriberg, Mustafa Sonmez, Andreas Stolcke, Gokhan Tur, Anand Venkataraman

Inventors: Harry Bratt, Luciana Ferrer, Martin Graciarena, Sachin Kajarekar, Elizabeth Shriberg, Mustafa Sonmez, Andreas Stolcke, Gokhan Tur, Anand Venkataraman

IPC classification: G10L17/00

CPC classification: G06K9/6222, G10L17/10

Abstract: A method and apparatus for speaker recognition is provided. One embodiment of a method for determining whether a given speech signal is produced by an alleged speaker, where a plurality of statistical models (including at least one support vector machine) have been produced for the alleged speaker based on a previous speech signal received from the alleged speaker, includes receiving the given speech signal, the speech signal representing an utterance made by a speaker claiming to be the alleged speaker, scoring the given speech signal using at least two modeling systems, where at least one of the modeling systems is a support vector machine, combining scores produced by the modeling systems, with equal weights, to produce a final score, and determining, in accordance with the final score, whether the speaker is likely the alleged speaker.
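
The equal-weight score combination described in this abstract can be illustrated with a minimal sketch. The function names, the example scores, and the decision threshold of 0.0 are assumptions for illustration; the abstract does not specify how scores are normalized or which threshold is applied, and the modeling systems themselves (including the support vector machine) are not implemented here.

```python
from statistics import fmean


def fuse_scores(system_scores):
    """Combine per-system scores with equal weights (a plain average)."""
    if len(system_scores) < 2:
        raise ValueError("the abstract calls for at least two modeling systems")
    return fmean(system_scores)


def accept_claim(system_scores, threshold=0.0):
    """Return True when the fused score meets the (hypothetical) threshold."""
    return fuse_scores(system_scores) >= threshold


# Hypothetical scores from an SVM-based system and one other system.
scores = [1.5, -0.5]
print(fuse_scores(scores), accept_claim(scores))
```

Equal weighting only makes sense when the systems produce scores on comparable scales, so a real pipeline would likely normalize each system's output first; that step is outside what the abstract states.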


7.

Patent application
SYSTEM AND METHOD FOR DISTRIBUTING MEETING RECORDINGS IN A NETWORK ENVIRONMENT [In force]

Publication No.: US20140152757A1

Publication date: 2014-06-05

Application No.: US13693848

Filing date: 2012-12-04

Applicants: Ashutosh A. Malegaonkar, Paul Quinn, Sachin Kajarekar

Inventors: Ashutosh A. Malegaonkar, Paul Quinn, Sachin Kajarekar

IPC classification: H04N7/14

CPC classification: H04N7/147, H04L67/22, H04N7/155

Abstract: A method is provided and includes discovering active participants and passive participants from a meeting recording, generating an active notification that includes an option to manipulate the meeting recording, and a passive notification without the option to manipulate the meeting recording, and sending the active notification and the passive notification to the active participants and the passive participants, respectively. The method can also include discovering followers from the meeting recording, generating a followers notification without the option to manipulate the meeting recording, and which includes access to a portion of meeting recording, and sending the followers notification to the followers. Discovering the active participants and the passive participants includes running speaker segmentation and recognition algorithms on the meeting recording, discovering attendees including speakers and non-speakers, and categorizing the speakers as the active participants, and the non-speakers as the passive participants.


8.

Granted patent
Speaker segmentation and recognition based on list of speakers [In force]

Publication No.: US09058806B2

Publication date: 2015-06-16

Application No.: US13608420

Filing date: 2012-09-10

Applicants: Ananth Sankar, Sachin Kajarekar, Satish K. Gannu

Inventors: Ananth Sankar, Sachin Kajarekar, Satish K. Gannu

IPC classification: G10L17/06, G10L17/22, G10L17/02

CPC classification: G10L17/02

Abstract: A method is provided and includes estimating an approximate list of potential speakers in a file from one or more applications. The file (e.g., an audio file, video file, or any suitable combination thereof) includes a recording of a plurality of speakers. The method also includes segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and recognizing particular speakers in the file based on the approximate list of potential speakers.

