Method of active learning for automatic speech recognition
    Type: Granted invention patent
    Status: Active

    Publication number: US08990084B2

    Publication date: 2015-03-24

    Application number: US14176439

    Filing date: 2014-02-10

    CPC classification number: G10L15/063

    Abstract: State-of-the-art speech recognition systems are trained using transcribed utterances, preparation of which is labor-intensive and time-consuming. The present invention is an iterative method for reducing the transcription effort for training in automatic speech recognition (ASR). Active learning aims at reducing the number of training examples to be labeled by automatically processing the unlabeled examples and then selecting the most informative ones with respect to a given cost function for a human to label. The method comprises automatically estimating a confidence score for each word of the utterance and exploiting the lattice output of a speech recognizer, which was trained on a small set of transcribed data. An utterance confidence score is computed based on these word confidence scores; then the utterances are selectively sampled to be transcribed using the utterance confidence scores.
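
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that per-word confidence scores have already been estimated from the recognizer's lattice output and are supplied as plain floats in [0, 1]. The abstract does not say how word scores are combined or which utterances are picked, so two choices here are assumptions made only for illustration: the utterance score is the arithmetic mean of its word scores, and the least confident utterances are sent for transcription first. The function names `utterance_confidence` and `select_for_transcription` are hypothetical.

```python
from statistics import mean


def utterance_confidence(word_confidences):
    """Combine word confidence scores into one utterance score.

    Assumption: arithmetic mean. An utterance with no recognized
    words gets 0.0, so it is treated as least confident.
    """
    return mean(word_confidences) if word_confidences else 0.0


def select_for_transcription(utterances, k):
    """Return the ids of the k utterances with the lowest confidence.

    `utterances` maps an utterance id to its list of word confidences.
    Assumption: low confidence is used as the measure of informativeness.
    """
    ranked = sorted(
        utterances.items(),
        key=lambda item: utterance_confidence(item[1]),
    )
    return [utt_id for utt_id, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical word confidences for three untranscribed utterances.
    pool = {
        "u1": [0.90, 0.95],
        "u2": [0.40, 0.60],
        "u3": [0.70, 0.20],
    }
    print(select_for_transcription(pool, k=2))
```

In an iterative setup, the selected utterances would be transcribed by a human, added to the training set, and the recognizer retrained before the remaining pool is scored again.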

