METHOD AND SYSTEM FOR REAL-TIME SPEECH RECOGNITION
    5.
    Invention publication
    METHOD AND SYSTEM FOR REAL-TIME SPEECH RECOGNITION (Granted)

    Publication No.: EP1449203A1

    Publication date: 2004-08-25

    Application No.: EP02801823.2

    Filing date: 2002-10-22

    Applicant: DSPFactory Ltd.

    IPC classification: G10L15/28

    CPC classification: G10L15/34

    Abstract: A method and system for real-time speech recognition are provided. The speech recognition algorithm runs on a platform having an input-output processor and a plurality of processor units. The processor units operate substantially in parallel or sequentially to perform feature extraction and pattern matching. While the input-output processor creates a frame, the processor units execute the feature extraction and the pattern matching. Shared memory is provided to support the parallel operation.

    ONLINE MAXIMUM-LIKELIHOOD MEAN AND VARIANCE NORMALIZATION FOR SPEECH RECOGNITION
    7.
    Invention publication
    ONLINE MAXIMUM-LIKELIHOOD MEAN AND VARIANCE NORMALIZATION FOR SPEECH RECOGNITION (Pending, published)

    Publication No.: EP2903003A1

    Publication date: 2015-08-05

    Application No.: EP15160261.2

    Filing date: 2010-02-22

    Inventor: Willett, David

    Abstract: A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A single time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
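
The online normalization idea in this abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the patented method: it replaces the maximum-likelihood estimate (which uses partial decoding results) with plain running sample statistics of the preceding vectors. The function name online_mvn and the parameters min_frames and eps are hypothetical.

```python
import math


def online_mvn(frames, min_frames=3, eps=1e-8):
    """Normalize each frame with the mean/variance of the frames before it.

    Simplifying assumption: plain sample statistics stand in for the
    maximum-likelihood transform estimated from partial decoding results.
    """
    dim = len(frames[0]) if frames else 0
    count = 0
    total = [0.0] * dim      # running sum per dimension
    total_sq = [0.0] * dim   # running sum of squares per dimension
    out = []
    for frame in frames:
        if count >= min_frames:
            # Enough history: estimate the transform from preceding frames only.
            mean = [s / count for s in total]
            var = [max(q / count - m * m, 0.0) for q, m in zip(total_sq, mean)]
            out.append([(x - m) / math.sqrt(v + eps)
                        for x, m, v in zip(frame, mean, var)])
        else:
            # Below the threshold: pass the frame through unchanged.
            out.append(list(frame))
        # Update the statistics after the current frame has been emitted.
        for i, x in enumerate(frame):
            total[i] += x
            total_sq[i] += x * x
        count += 1
    return out
```

Because each frame is adjusted using only earlier frames, the sketch fits a single time-synchronous pass: no look-ahead over the utterance is needed.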

    Speech recognition using variable-length context
    8.
    Invention publication
    Speech recognition using variable-length context (Pending, published)

    Publication No.: EP2851895A3

    Publication date: 2015-05-06

    Application No.: EP14197702.5

    Filing date: 2012-06-29

    Applicant: Google Inc.

    IPC classification: G10L15/18 G10L15/06

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech using a variable length of context. Speech data and data identifying a candidate transcription for the speech data are received. A phonetic representation for the candidate transcription is accessed. Multiple test sequences are extracted for a particular phone in the phonetic representation. Each of the multiple test sequences includes a different set of contextual phones surrounding the particular phone. Data indicating that an acoustic model includes data corresponding to one or more of the multiple test sequences is received. From among the one or more test sequences, the test sequence that includes the highest number of contextual phones is selected. A score for the candidate transcription is generated based on the data from the acoustic model that corresponds to the selected test sequence.
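
The selection step described in this abstract can be sketched in Python as follows. This is an illustration under assumptions: the acoustic model is represented as a plain dict from context tuples to scores, context is taken symmetrically around the phone, and the names extract_sequences, select_longest_known, and max_context are hypothetical.

```python
def extract_sequences(phones, index, max_context=3):
    """Build (left, center, right) sequences with 0..max_context phones per side."""
    seqs = []
    for n in range(max_context + 1):
        left = tuple(phones[max(0, index - n):index])
        right = tuple(phones[index + 1:index + 1 + n])
        seqs.append((left, phones[index], right))
    return seqs


def select_longest_known(seqs, model):
    """Return the sequence with the most contextual phones that the model covers."""
    known = [s for s in seqs if s in model]
    if not known:
        return None
    return max(known, key=lambda s: len(s[0]) + len(s[2]))


def score_phone(phones, index, model, max_context=3):
    """Score one phone using the longest context available in the model."""
    best = select_longest_known(extract_sequences(phones, index, max_context), model)
    return None if best is None else model[best]
```

For example, with phones ["sil", "k", "ae", "t", "sil"] and a model that only covers "ae" with zero or one phone of context on each side, the one-phone context is selected because it is the longest sequence the model has data for.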

    SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES
    9.
    Invention publication
    SPEECH SYLLABLE/VOWEL/PHONE BOUNDARY DETECTION USING AUDITORY ATTENTION CUES (Pending, published)

    Publication No.: EP2695160A4

    Publication date: 2015-03-18

    Application No.: EP11862334

    Filing date: 2011-11-02

    Abstract: In syllable or vowel or phone boundary detection during speech, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more syllable or vowel or phone boundaries in the input window of sound can be detected by mapping the cumulative gist vector to one or more syllable or vowel or phone boundary characteristics using a machine learning algorithm.
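
The gist-vector steps in this abstract can be illustrated with a short Python sketch. The abstract does not say how a gist vector is computed from a feature map, so the sketch assumes one common choice: split the map into a rows-by-cols grid and take the mean of each cell. It also reads "augmentation" as concatenation. The names gist_vector and cumulative_gist and the grid parameters are hypothetical.

```python
def gist_vector(feature_map, rows=2, cols=2):
    """Summarize a 2-D feature map as the mean of each cell in a rows x cols grid.

    Assumes the map has at least `rows` rows and `cols` columns, so that
    no grid cell is empty.
    """
    height, width = len(feature_map), len(feature_map[0])
    gist = []
    for r in range(rows):
        for c in range(cols):
            r0, r1 = r * height // rows, (r + 1) * height // rows
            c0, c1 = c * width // cols, (c + 1) * width // cols
            cell = [feature_map[i][j]
                    for i in range(r0, r1) for j in range(c0, c1)]
            gist.append(sum(cell) / len(cell))
    return gist


def cumulative_gist(feature_maps, rows=2, cols=2):
    """Concatenate the gist vectors of all feature maps into one vector."""
    combined = []
    for feature_map in feature_maps:
        combined.extend(gist_vector(feature_map, rows, cols))
    return combined
```

The resulting cumulative vector would then be the input to whatever classifier maps it to boundary characteristics; that classifier is outside the scope of this sketch.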

    SPEECH RECOGNITION USING VARIABLE-LENGTH CONTEXT
    10.
    Invention publication
    SPEECH RECOGNITION USING VARIABLE-LENGTH CONTEXT (Granted)

    Publication No.: EP2727103A2

    Publication date: 2014-05-07

    Application No.: EP12733579.2

    Filing date: 2012-06-29

    Applicant: Google Inc.

    IPC classification: G10L15/18

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing speech using a variable length of context. Speech data and data identifying a candidate transcription for the speech data are received. A phonetic representation for the candidate transcription is accessed. Multiple test sequences are extracted for a particular phone in the phonetic representation. Each of the multiple test sequences includes a different set of contextual phones surrounding the particular phone. Data indicating that an acoustic model includes data corresponding to one or more of the multiple test sequences is received. From among the one or more test sequences, the test sequence that includes the highest number of contextual phones is selected. A score for the candidate transcription is generated based on the data from the acoustic model that corresponds to the selected test sequence.
