Speech recognizing performing noise adaptation
    1.
    Type: published invention application
    Legal status: lapsed

    Publication number: EP0865032A2

    Publication date: 1998-09-16

    Application number: EP98301880.5

    Filing date: 1998-03-13

    IPC classification: G10L5/06

    CPC classification: G10L15/20 G10L21/0216

    Abstract: The invention aims to remove additive noise, caused for example by the ambient environment, in real time so as to improve the precision of real-time speech recognition. To this end, conversion into a noise-superimposed speech model is performed selectively during the search, only on the speech models of phonemes that the current search state and the recognition grammar identify as possible search targets. The likelihood calculation for recognizing the input speech then uses the noise-superimposed speech models produced by this conversion.
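
The selective conversion described in this abstract can be sketched as lazy, per-phoneme adaptation. This is a minimal illustration under stated assumptions, not the patented procedure: models are reduced to log-spectral mean vectors, noise is superimposed by adding the means in the linear domain (a simple log-add stand-in for whatever conversion the patent uses), and the names `SelectiveAdapter`, `superimpose_noise`, and `active_phonemes` are hypothetical.

```python
import math

def superimpose_noise(clean_mean, noise_mean):
    """Combine log-spectral means by adding them in the linear domain (assumed log-add rule)."""
    return [math.log(math.exp(s) + math.exp(n)) for s, n in zip(clean_mean, noise_mean)]

def log_likelihood(frame, mean, var=1.0):
    """Log density of a diagonal Gaussian with one shared variance (a simplification)."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
               for x, m in zip(frame, mean))

class SelectiveAdapter:
    """Converts a clean phoneme model only when the search asks for that phoneme."""

    def __init__(self, clean_models, noise_mean):
        self.clean_models = clean_models
        self.noise_mean = noise_mean
        self.adapted = {}  # phoneme -> noise-superimposed mean, filled on demand

    def score(self, frame, active_phonemes):
        scores = {}
        for p in active_phonemes:
            if p not in self.adapted:  # convert once, reuse for later frames
                self.adapted[p] = superimpose_noise(self.clean_models[p], self.noise_mean)
            scores[p] = log_likelihood(frame, self.adapted[p])
        return scores

# Hypothetical data: three phoneme models, of which the grammar currently allows two.
clean_models = {"a": [1.0, 2.0], "k": [0.5, 0.1], "s": [2.0, 2.5]}
adapter = SelectiveAdapter(clean_models, noise_mean=[0.0, 0.0])
scores = adapter.score([1.3, 2.1], active_phonemes=["a", "k"])
```

In this sketch the model for "s" is never converted, which is the source of the real-time saving: adaptation cost scales with the phonemes the search can actually reach rather than with the full model inventory.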

    Speech recognition method and apparatus therefor
    3.
    Type: published invention application
    Legal status: lapsed

    Publication number: EP0831456A2

    Publication date: 1998-03-25

    Application number: EP97307276.2

    Filing date: 1997-09-18

    IPC classification: G10L3/00

    Abstract: The invention aims to provide a fast speech recognition method with a high recognition rate by using speaker models.
    To this end, the method performs acoustic processing on the input speech and calculates a coarse output probability using an unspecified-speaker (speaker-independent) model. For the states that the coarse calculation indicates will contribute to the recognition result, it then calculates a fine output probability using both the unspecified-speaker model and clustered speaker models.
    Recognition candidates are extracted by a common language search based on the result obtained, and a fine language search over the extracted candidates determines the recognition result.
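
The coarse-to-fine output probability calculation can be sketched as follows. This is an illustrative simplification, not the patented method: each state is a one-dimensional Gaussian, the promising states are taken to be the `top_k` best under the coarse score, and the fine score is the maximum over the speaker-independent and speaker-cluster models. All names, the `top_k` rule, and the max combination are assumptions.

```python
import math

def log_gauss(x, mean, var=1.0):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def coarse_to_fine(x, si_means, cluster_means, top_k):
    # Coarse pass: score every state with the speaker-independent model only.
    coarse = {s: log_gauss(x, m) for s, m in si_means.items()}
    # Keep the states most likely to matter for the recognition result.
    promising = sorted(coarse, key=coarse.get, reverse=True)[:top_k]
    scores = dict(coarse)
    # Fine pass: re-score only those states, also trying the clustered speaker models.
    for s in promising:
        candidates = [coarse[s]] + [log_gauss(x, m) for m in cluster_means[s]]
        scores[s] = max(candidates)
    return scores, promising

# Hypothetical state means for the speaker-independent and clustered models.
si_means = {"s1": 0.0, "s2": 1.0, "s3": 5.0}
cluster_means = {"s1": [-0.5, 0.4], "s2": [0.8, 1.3], "s3": [4.0, 6.0]}
scores, promising = coarse_to_fine(0.4, si_means, cluster_means, top_k=2)
```

State "s3" keeps its coarse score because it never enters the fine pass, so the more expensive clustered models are evaluated only where they can change the outcome.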

    Speech recognizing method and apparatus
    4.
    Type: published invention application
    Legal status: lapsed

    Publication number: EP0798695A2

    Publication date: 1997-10-01

    Application number: EP97301980.5

    Filing date: 1997-03-24

    IPC classification: G10L3/00

    CPC classification: G10L15/20 G10L25/24

    Abstract: Speech containing a speech portion and a non-speech portion is input. A cepstrum long-time mean is obtained separately for the speech portion and for the non-speech portion. Each mean is converted from the cepstrum domain to the linear spectrum domain, the non-speech mean is subtracted from the speech mean there, and the difference is converted back to the cepstrum domain. The cepstrum long-time mean of the speech portion of the training speech database is then subtracted from the converted result, and the remainder is added to a speech model expressed in the cepstrum domain. As a result, even when the noise is large, the line fluctuation is estimated more precisely and the recognition rate can improve.
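
The sequence of steps in this abstract can be sketched as below. It is a rough illustration under assumptions the abstract does not state: the cepstrum is taken to be an orthonormal DCT of the log spectrum, the subtraction is speech mean minus non-speech mean, and a small floor (the hypothetical `floor` parameter) keeps the difference positive before the logarithm. Function names are illustrative.

```python
import math

def dct(v):
    """Orthonormal DCT-II: log spectrum -> cepstrum (assumed cepstrum definition)."""
    n = len(v)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct(c):
    """Inverse of dct (DCT-III): cepstrum -> log spectrum."""
    n = len(c)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c[k]
                * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]

def mean(frames):
    """Long-time mean of a list of equal-length cepstral vectors."""
    return [sum(f[d] for f in frames) / len(frames) for d in range(len(frames[0]))]

def to_linear(cep):
    return [math.exp(v) for v in idct(cep)]

def to_cepstrum(lin):
    return dct([math.log(v) for v in lin])

def model_offset(speech_frames, noise_frames, train_speech_mean, floor=1e-3):
    speech_lin = to_linear(mean(speech_frames))
    noise_lin = to_linear(mean(noise_frames))
    # Subtract in the linear spectrum domain; the floor is an added safeguard.
    diff = [max(s - n, floor * s) for s, n in zip(speech_lin, noise_lin)]
    # Back to the cepstrum domain, then remove the training-data speech mean.
    return [c - t for c, t in zip(to_cepstrum(diff), train_speech_mean)]

def adapt_model(model_means, offset):
    """Add the estimated offset to every cepstral mean of the speech model."""
    return [[m + o for m, o in zip(mu, offset)] for mu in model_means]

# Hypothetical one-frame example built from known linear spectra.
speech = [to_cepstrum([4.0, 6.0, 8.0, 10.0])]
noise = [to_cepstrum([1.0, 1.0, 1.0, 1.0])]
train_mean = to_cepstrum([3.0, 5.0, 7.0, 9.0])
offset = model_offset(speech, noise, train_mean)
adapted = adapt_model([[1.0, 2.0, 3.0, 4.0]], offset)
```

In this example the noise-subtracted input spectrum equals the training spectrum, so the offset is essentially zero and the model is left unchanged. A non-zero offset would represent the line (channel) difference between the input and the training data.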

    Speech processing method and apparatus, and recording medium
    8.
    Type: published invention application
    Legal status: in force

    Publication number: EP0977176A2

    Publication date: 2000-02-02

    Application number: EP99305952.6

    Filing date: 1999-07-27

    IPC classification: G10L15/20

    Abstract: The invention aims to successively extract the proper speech zone from input in which noise is mixed with the speech to be recognized, and to remove the noise from the detected speech zone. To this end, a noise position is estimated from the input waveform; a speech zone is detected in subsequently input speech using the power information of the signal at the estimated noise position; and noise is removed from the detected speech zone using the spectrum information of the signal at the estimated noise position. In addition, the estimated noise zone is updated as appropriate, based on a comparison between the power of the input speech and the power of the signal in the estimated noise zone, so that the noise position is always properly estimated.
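
The detect, subtract, and update loop described here can be sketched frame by frame. This is a minimal sketch under assumptions of its own: a frame counts as speech when its mean power exceeds a fixed multiple (`ratio`) of the noise estimate, noise removal is plain spectral subtraction floored at zero, and the noise estimate is updated by exponential smoothing. The abstract specifies none of these details, and `NoiseTracker` is a hypothetical name.

```python
def mean_power(spectrum):
    """Average power over the bins of one power-spectrum frame."""
    return sum(spectrum) / len(spectrum)

class NoiseTracker:
    def __init__(self, noise_spectrum, ratio=2.0, smoothing=0.9):
        self.noise_spectrum = list(noise_spectrum)  # current noise estimate
        self.ratio = ratio                          # assumed speech/noise power threshold
        self.smoothing = smoothing                  # assumed update rate

    def process(self, spectrum):
        """Return the denoised frame if it is judged speech, else None."""
        if mean_power(spectrum) < self.ratio * mean_power(self.noise_spectrum):
            # Judged to be noise: refresh the noise estimate with this frame.
            a = self.smoothing
            self.noise_spectrum = [a * n + (1 - a) * s
                                   for n, s in zip(self.noise_spectrum, spectrum)]
            return None
        # Judged to be speech: subtract the noise spectrum, flooring at zero.
        return [max(s - n, 0.0) for s, n in zip(spectrum, self.noise_spectrum)]

# Hypothetical two-bin frames: one quiet frame, then one loud frame.
tracker = NoiseTracker([1.0, 1.0])
quiet = tracker.process([1.2, 0.8])
loud = tracker.process([5.0, 7.0])
```

The quiet frame nudges the noise estimate toward itself, and the loud frame is then cleaned with the refreshed estimate, which is how the noise position stays current as conditions drift.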

    Normalization of speech signals
    9.
    Type: published invention application
    Legal status: lapsed

    Publication number: EP0865033A2

    Publication date: 1998-09-16

    Application number: EP98301919.1

    Filing date: 1998-03-13

    IPC classification: G10L5/06 G10L3/02

    CPC classification: G10L15/20

    Abstract: The invention aims to eliminate the influence of line characteristics in real time, both to raise the recognition precision for input speech and to allow the speech to be recognized in real time. To this end, as speech feature parameters are input sequentially, an estimate of the long-time mean of a parameter is obtained from the parameters input so far, and the parameter input at that time point is normalized with this estimate. Each time a speech feature parameter is input, the estimate is updated from all parameters input so far, including the new one, and the newest parameter is normalized with the updated estimate. Because the estimate becomes more reliable as more speech feature parameters contribute to it, the normalization applies a weight to the estimate according to that reliability.
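
The incremental normalization can be sketched as a running mean with a count-dependent weight. This is an illustrative sketch only: the weight n / (n + tau) is an assumed form that grows toward 1 as more frames arrive, and `tau` and `RunningNormalizer` are hypothetical names not taken from the patent.

```python
class RunningNormalizer:
    """Subtracts a reliability-weighted running mean from each incoming frame."""

    def __init__(self, dim, tau=10.0):
        self.total = [0.0] * dim  # running sum of all frames seen so far
        self.count = 0            # number of frames seen so far
        self.tau = tau            # assumed constant controlling how fast trust grows

    def normalize(self, frame):
        # Update the long-time mean estimate with the newest frame included.
        self.count += 1
        self.total = [t + x for t, x in zip(self.total, frame)]
        mean = [t / self.count for t in self.total]
        # Few frames -> small weight, so an unreliable mean is only partly removed.
        weight = self.count / (self.count + self.tau)
        return [x - weight * m for x, m in zip(frame, mean)]

# Hypothetical one-dimensional feature stream.
normalizer = RunningNormalizer(dim=1, tau=1.0)
first = normalizer.normalize([2.0])
second = normalizer.normalize([4.0])
```

With `tau=1.0`, the first frame has half of its own value removed and the second has two thirds of the running mean removed, so the normalization strengthens as the mean estimate firms up.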
