Automatic collection of speaker name pronunciations
    1.
    Granted invention patent
    Automatic collection of speaker name pronunciations (status: in force)

    Publication number: US09240181B2

    Publication date: 2016-01-19

    Application number: US13970850

    Filing date: 2013-08-20

    Abstract: An audio stream is segmented into a plurality of time segments using speaker segmentation and recognition (SSR), with each time segment corresponding to the speaker's name, producing an SSR transcript. The audio stream is transcribed into a plurality of word regions using automatic speech recognition (ASR), with each of the word regions having a measure of the confidence in the accuracy of the translation, producing an ASR transcript. Word regions with a relatively low confidence in the accuracy of the translation are identified. The low-confidence regions are filtered using named entity recognition (NER) rules to identify low-confidence regions that are likely names. The NER rules associate a region that is identified as a likely name with the name of the speaker corresponding to the current, the previous, or the next time segment. All of the likely name regions associated with that speaker's name are selected.
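
    The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the data classes, the cue phrases in RULES, the 0.5 confidence threshold, and the sample transcript are all hypothetical and chosen only to show how low-confidence word regions might be filtered by simple rules and attached to the speaker of the current, previous, or next time segment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:          # one SSR time segment (hypothetical structure)
    start: float
    end: float
    speaker: str


@dataclass
class WordRegion:       # one ASR word region (hypothetical structure)
    start: float
    end: float
    text: str
    confidence: float


# Hypothetical NER-style rules: a cue phrase immediately before a
# low-confidence region, plus the offset of the segment whose speaker
# is being named (0 = current, -1 = previous, +1 = next).
RULES = [
    (("my", "name", "is"), 0),
    (("this", "is"), 0),
    (("thank", "you"), -1),
    (("over", "to"), +1),
]


def segment_index(segments, t):
    """Return the index of the segment containing time t, or None."""
    for i, seg in enumerate(segments):
        if seg.start <= t < seg.end:
            return i
    return None


def collect_name_regions(segments, words, threshold=0.5):
    """Group likely-name regions by the speaker name they refer to."""
    collected = defaultdict(list)
    for i, word in enumerate(words):
        if word.confidence >= threshold:
            continue  # only low-confidence regions are candidates
        for cue, offset in RULES:
            if i < len(cue):
                continue
            preceding = tuple(w.text.lower() for w in words[i - len(cue):i])
            if preceding != cue:
                continue
            seg = segment_index(segments, word.start)
            if seg is not None and 0 <= seg + offset < len(segments):
                collected[segments[seg + offset].speaker].append(word)
            break  # first matching rule wins
    return dict(collected)


# Made-up sample data, for illustration only.
segments = [Segment(0.0, 5.0, "Alice Zhou"), Segment(5.0, 10.0, "Bob Nowak")]
words = [
    WordRegion(0.3, 0.5, "my", 0.90),
    WordRegion(0.5, 0.8, "name", 0.90),
    WordRegion(0.8, 1.0, "is", 0.90),
    WordRegion(1.0, 1.4, "alice", 0.30),
    WordRegion(4.0, 4.3, "over", 0.90),
    WordRegion(4.3, 4.5, "to", 0.90),
    WordRegion(4.5, 4.8, "bop", 0.20),
    WordRegion(5.0, 5.3, "thank", 0.90),
    WordRegion(5.3, 5.6, "you", 0.90),
    WordRegion(5.6, 5.9, "alas", 0.25),
    WordRegion(6.0, 6.2, "the", 0.40),
]
result = collect_name_regions(segments, words)
names = {speaker: [w.text for w in regions] for speaker, regions in result.items()}
```

    In this sample, "alice" follows "my name is" and is attached to the current segment's speaker, "bop" follows "over to" and is attached to the next segment's speaker, and "alas" follows "thank you" and is attached to the previous segment's speaker. The low-confidence word "the" matches no rule and is discarded.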


    Automatic Collection of Speaker Name Pronunciations
    2.
    Invention patent application
    Automatic Collection of Speaker Name Pronunciations (status: in force)

    Publication number: US20150058005A1

    Publication date: 2015-02-26

    Application number: US13970850

    Filing date: 2013-08-20

    Abstract: An audio stream is segmented into a plurality of time segments using speaker segmentation and recognition (SSR), with each time segment corresponding to the speaker's name, producing an SSR transcript. The audio stream is transcribed into a plurality of word regions using automatic speech recognition (ASR), with each of the word regions having a measure of the confidence in the accuracy of the translation, producing an ASR transcript. Word regions with a relatively low confidence in the accuracy of the translation are identified. The low-confidence regions are filtered using named entity recognition (NER) rules to identify low-confidence regions that are likely names. The NER rules associate a region that is identified as a likely name with the name of the speaker corresponding to the current, the previous, or the next time segment. All of the likely name regions associated with that speaker's name are selected.

