Automatic collection of speaker name pronunciations

Invention Grant

US09240181B2 Automatic collection of speaker name pronunciations (Active)

Patent Title: Automatic collection of speaker name pronunciations
Application No.: US13970850

Application Date: 2013-08-20
Publication No.: US09240181B2

Publication Date: 2016-01-19
Inventor: Aparna Khare, Neha Agrawal, Sachin S. Kajarekar, Matthias Paulik
Applicant: Cisco Technology, Inc.
Applicant Address: US CA San Jose
Assignee: Cisco Technology, Inc.
Current Assignee: Cisco Technology, Inc.
Current Assignee Address: US CA San Jose
Agency: Edell, Shapiro & Finnan, LLC
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/26; G10L15/06; G10L15/187; G10L15/04; G10L17/00; G10L15/02

Abstract:

An audio stream is segmented into a plurality of time segments using speaker segmentation and recognition (SSR), with each time segment associated with the name of its speaker, producing an SSR transcript. The audio stream is transcribed into a plurality of word regions using automatic speech recognition (ASR), with each of the word regions having a measure of the confidence in the accuracy of the transcription, producing an ASR transcript. Word regions with a relatively low confidence in the accuracy of the transcription are identified. The low confidence regions are filtered using named entity recognition (NER) rules to identify low confidence regions that are likely names. The NER rules associate a region that is identified as a likely name with the name of the speaker corresponding to the current, the previous, or the next time segment. All of the likely name regions associated with a given speaker's name are selected.
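
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the data classes, the 0.5 confidence threshold, and the cue-phrase rules in CUE_RULES are all hypothetical stand-ins chosen for illustration, since the abstract does not specify the actual NER rules. The sketch takes an SSR transcript (speaker-labeled time segments) and an ASR transcript (word regions with confidence scores), keeps the low confidence regions that follow a name-like cue phrase, and groups them under the speaker of the current, previous, or next segment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SpeakerSegment:
    """One SSR time segment, labeled with a speaker name (assumed structure)."""
    start: float
    end: float
    speaker: str


@dataclass
class WordRegion:
    """One ASR word region with a transcription confidence (assumed structure)."""
    start: float
    end: float
    text: str
    confidence: float


# Hypothetical NER rules: a cue phrase immediately before a low confidence
# region suggests a name, and the offset says which segment's speaker it names
# (0 = current segment, -1 = previous segment, +1 = next segment).
CUE_RULES = [
    (("my", "name", "is"), 0),
    (("thank", "you"), -1),
    (("over", "to"), +1),
]


def match_rule(words_before):
    """Return the segment offset of the first cue that ends the context, or None."""
    for cue, offset in CUE_RULES:
        if tuple(words_before[-len(cue):]) == cue:
            return offset
    return None


def segment_index_at(segments, t):
    """Return the index of the segment containing time t, or None."""
    for i, seg in enumerate(segments):
        if seg.start <= t < seg.end:
            return i
    return None


def collect_name_regions(segments, regions, threshold=0.5):
    """Group likely-name regions (low confidence plus cue match) by speaker name."""
    by_speaker = defaultdict(list)
    for i, region in enumerate(regions):
        if region.confidence >= threshold:
            continue  # confidently transcribed, not a candidate
        offset = match_rule([r.text.lower() for r in regions[:i]])
        if offset is None:
            continue  # low confidence, but no rule marks it as a likely name
        idx = segment_index_at(segments, region.start)
        if idx is None:
            continue
        target = idx + offset
        if 0 <= target < len(segments):
            by_speaker[segments[target].speaker].append(region)
    return dict(by_speaker)


# Toy data, invented for this example.
segments = [
    SpeakerSegment(0.0, 5.0, "Alice Smith"),
    SpeakerSegment(5.0, 10.0, "Bob Jones"),
]
regions = [
    WordRegion(0.0, 0.3, "my", 0.95),
    WordRegion(0.3, 0.6, "name", 0.90),
    WordRegion(0.6, 0.8, "is", 0.92),
    WordRegion(0.8, 1.5, "a lease myth", 0.20),  # misrecognized name
    WordRegion(4.0, 4.3, "over", 0.90),
    WordRegion(4.3, 4.5, "to", 0.90),
    WordRegion(4.5, 4.9, "bob owns", 0.30),      # misrecognized name
    WordRegion(5.0, 5.4, "thank", 0.90),
    WordRegion(5.4, 5.6, "you", 0.90),
    WordRegion(5.6, 6.2, "alley", 0.25),         # misrecognized name
    WordRegion(6.5, 7.0, "umm", 0.10),           # low confidence, not a name
]
result = collect_name_regions(segments, regions)
```

In this toy run, the regions collected for each speaker name are the audio spans from which pronunciations of that name could later be extracted; the low confidence filler word is dropped because no rule marks it as a likely name.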

Public/Granted literature

US20150058005A1 Automatic Collection of Speaker Name Pronunciations, published 2015-02-26

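IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)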