Voice activity detector, voice activity detection program, and parameter adjusting method
    21.
    Type: granted invention patent
    Status: in force

    Publication No.: US08812313B2

    Publication date: 2014-08-19

    Application No.: US13140364

    Filing date: 2009-12-07

    IPC classification: G10L21/00

    CPC classification: G10L25/78 G10L2021/02082

    Abstract: Judgment result deriving means 74 judges, for every unit time, whether voice data is active voice or non-active voice. The judgment is applied to a time series of voice data for which the numbers of active voice segments and non-active voice segments are already known (the number of labeled active voice segments and the number of labeled non-active voice segments). It then shapes the resulting active and non-active voice segments by comparing a duration threshold with the length of each segment consecutively judged to be active voice, or with the length of each segment consecutively judged to be non-active voice. Segments number calculating means 75 calculates the number of active voice segments and the number of non-active voice segments. Duration threshold updating means 76 updates the duration threshold so that the difference between the calculated number of active voice segments and the number of labeled active voice segments decreases, or the difference between the calculated number of non-active voice segments and the number of labeled non-active voice segments decreases.
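
The shaping and threshold-update loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names (`shape`, `count_segments`, `tune_min_active`), the two-pass shaping order, and the fixed-step update rule are all assumptions, since the abstract only states that the threshold is updated so that the segment-count difference decreases.

```python
from itertools import groupby


def shape(frames, min_active, min_inactive):
    """Shape per-unit-time judgments using two duration thresholds (in frames)."""
    frames = [bool(f) for f in frames]
    # Pass 1: active runs shorter than min_active are relabeled non-active.
    pass1 = []
    for label, group in groupby(frames):
        length = len(list(group))
        if label and length < min_active:
            label = False
        pass1.extend([label] * length)
    # Pass 2: non-active runs shorter than min_inactive are relabeled active.
    shaped = []
    for label, group in groupby(pass1):
        length = len(list(group))
        if not label and length < min_inactive:
            label = True
        shaped.extend([label] * length)
    return shaped


def count_segments(frames):
    """Return (number of active segments, number of non-active segments)."""
    labels = [label for label, _ in groupby(frames)]
    n_active = sum(1 for label in labels if label)
    return n_active, len(labels) - n_active


def tune_min_active(frames, labeled_active, min_active=1, min_inactive=1,
                    step=1, max_iters=20):
    """Assumed update rule: move min_active so the segment-count error shrinks."""
    for _ in range(max_iters):
        n_active, _unused = count_segments(shape(frames, min_active, min_inactive))
        diff = n_active - labeled_active
        if diff == 0:
            break
        # Too many segments: raise the threshold. Too few: lower it.
        min_active = max(1, min_active + (step if diff > 0 else -step))
    return min_active


# Raw judgments contain three active runs; the labeled count is two.
frames = [0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0]
print(tune_min_active(frames, labeled_active=2))  # prints 2
```

In this toy input the one-frame active run is treated as spurious once the active-duration threshold reaches two frames, which brings the calculated count in line with the labeled count.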

    SPEECH PROCESSING APPARATUS, CONTROL METHOD THEREOF, STORAGE MEDIUM STORING CONTROL PROGRAM THEREOF, AND VEHICLE, INFORMATION PROCESSING APPARATUS, AND INFORMATION PROCESSING SYSTEM INCLUDING THE SPEECH PROCESSING APPARATUS
    22.
    Type: invention application
    Status: in force

    Publication No.: US20130297303A1

    Publication date: 2013-11-07

    Application No.: US13979596

    Filing date: 2011-12-03

    IPC classification: G10L21/0208

    Abstract: An apparatus of this invention is a speech processing apparatus that acquires pseudo speech from a mixture sound including desired speech and noise. The speech processing apparatus includes a first microphone, a second microphone, and a noise suppression circuit. The first microphone inputs a first mixture sound including desired speech and noise and outputs a first mixture signal. The second microphone is opened to the same sound space as the first microphone and is disposed at a focus position of an interface that is part of a boundary of the sound space and has either a quadratic surface shape or a pseudo surface shape approximating a quadratic surface. It inputs a second mixture sound, which includes the desired speech reflected by the interface and the noise reflected by the interface at a ratio different from that of the first mixture sound, and outputs a second mixture signal. The noise suppression circuit suppresses an estimated noise signal based on the first mixture signal and the second mixture signal and outputs a pseudo speech signal.

    SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND COMPUTER READABLE MEDIUM
    23.
    Type: invention application
    Status: in force

    Publication No.: US20130231929A1

    Publication date: 2013-09-05

    Application No.: US13883716

    Filing date: 2011-11-10

    IPC classification: G10L15/20

    Abstract: The present invention can increase the types of noise that can be dealt with sufficiently to enable speech recognition with a high recognition rate. A speech recognition device of the present invention performs processes of: storing, in relation to each other, a suppression coefficient representing a noise suppression amount and an adaptation coefficient representing an adaptation amount of a noise model, where the noise model is generated on the basis of a predetermined noise and is to be compounded (synthesized) with a clean acoustic model generated on the basis of a voice including no noise; estimating noise from an input signal; suppressing, from the input signal, a portion of the estimated noise in the amount specified on the basis of the suppression coefficient; generating a noise-adapted acoustic model by compounding (synthesizing) the clean acoustic model with a noise model generated on the basis of the estimated noise, in accordance with an adaptation amount specified on the basis of the adaptation coefficient; and recognizing voice on the basis of the noise-suppressed input signal and the generated adapted acoustic model.
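
To make the roles of the two coefficients concrete, here is a small sketch under strong simplifying assumptions. It treats the signal, the noise estimate, and the model mean as per-bin power values and combines them additively; the abstract does not specify the domain or the combination rule. The table `COEFFICIENT_PAIRS`, its values, and the functions `suppress` and `adapt` are hypothetical.

```python
# Hypothetical stored pairs: (suppression coefficient, adaptation coefficient).
# The values are illustrative only; the abstract says the two are stored in
# relation to each other but does not give numbers.
COEFFICIENT_PAIRS = {
    "mild": (0.25, 0.75),
    "strong": (0.75, 0.25),
}


def suppress(input_power, noise_power, alpha, floor=0.0):
    """Subtract a fraction alpha of the estimated noise from each bin."""
    return [max(x - alpha * n, floor) for x, n in zip(input_power, noise_power)]


def adapt(clean_mean, noise_power, beta):
    """Add a fraction beta of the estimated noise to the clean model mean."""
    return [c + beta * n for c, n in zip(clean_mean, noise_power)]


alpha, beta = COEFFICIENT_PAIRS["strong"]
observed = [4.0, 2.0]      # per-bin power of the noisy input (toy values)
noise = [2.0, 4.0]         # per-bin estimated noise power (toy values)
clean_model = [1.0, 1.0]   # per-bin mean of the clean acoustic model (toy values)

print(suppress(observed, noise, alpha))   # prints [2.5, 0.0]
print(adapt(clean_model, noise, beta))    # prints [1.5, 2.0]
```

The point of pairing the coefficients is that the recognizer sees an input with part of the noise removed and a model with part of the noise added, so the two sides can be matched to each other.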

    VOICE RECOGNITION SYSTEM AND VOICE RECOGNITION METHOD
    24.
    Type: invention application
    Status: in force

    Publication No.: US20120239401A1

    Publication date: 2012-09-20

    Application No.: US13514894

    Filing date: 2010-11-26

    Applicant: Takayuki Arakawa

    Inventor: Takayuki Arakawa

    IPC classification: G10L17/00

    CPC classification: G10L25/87 G10L15/04

    Abstract: Provided is a voice recognition system capable of correctly estimating the utterance sections to be recognized while suppressing the negative influence of sounds that are not to be recognized. A voice segmenting means calculates voice feature values and segments voice sections or non-voice sections by comparing the voice feature values with a threshold value. The voice segmenting means then determines, as first voice sections, either those segmented sections or the sections obtained by adding a margin to the front and rear of each of them. On the basis of voice and non-voice likelihoods, a search means determines, as second voice sections, the sections to which voice recognition is to be applied. A parameter updating means updates the threshold value and the margin. The voice segmenting means determines the first voice sections by using whichever of the threshold value and the margin has been updated by the parameter updating means.
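
The first-stage segmentation (threshold comparison followed by a margin on both sides) can be illustrated with the sketch below. It covers only that step; the likelihood-based search and the parameter update are not shown because the abstract does not describe how they are computed. The function name `segment`, the half-open frame ranges, and the merging of overlapping sections are assumptions.

```python
def segment(features, threshold, margin):
    """Return first voice sections as (start, end) frame ranges, end exclusive."""
    # Step 1: frames whose feature value exceeds the threshold form raw sections.
    raw = []
    start = None
    for i, value in enumerate(features):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            raw.append((start, i))
            start = None
    if start is not None:
        raw.append((start, len(features)))

    # Step 2: widen each section by `margin` frames at the front and rear,
    # clipping to the signal and merging sections that now touch or overlap.
    widened = []
    for s, e in raw:
        s = max(0, s - margin)
        e = min(len(features), e + margin)
        if widened and s <= widened[-1][1]:
            widened[-1] = (widened[-1][0], max(widened[-1][1], e))
        else:
            widened.append((s, e))
    return widened


features = [0.1, 0.2, 0.9, 0.8, 0.1, 0.1, 0.1, 0.7, 0.1, 0.1]
print(segment(features, threshold=0.5, margin=0))  # prints [(2, 4), (7, 8)]
print(segment(features, threshold=0.5, margin=1))  # prints [(1, 5), (6, 9)]
print(segment(features, threshold=0.5, margin=2))  # prints [(0, 10)]
```

A larger margin protects word onsets and endings from being clipped, at the cost of passing more non-voice frames to the recognizer, which is why both the threshold and the margin are worth updating.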

    SYSTEM, METHOD AND PROGRAM FOR VOICE DETECTION
    25.
    Type: invention application
    Status: in force

    Publication No.: US20100268532A1

    Publication date: 2010-10-21

    Application No.: US12744671

    Filing date: 2008-11-26

    IPC classification: G10L11/06

    CPC classification: G10L25/93

    Abstract: A system for voice detection includes a feature value calculation unit, a provisional voice/non-voice decision unit, and a voice/non-voice decision unit. The feature value calculation unit calculates a feature value from an input signal sliced on a per-frame basis. The provisional voice/non-voice decision unit provisionally decides a voiced interval and a non-voiced interval from the per-frame feature value. The voice/non-voice decision unit determines a voiced interval duration threshold value or a non-voiced interval duration threshold value using the ratio of the per-frame feature value to a threshold value for the feature value, and re-decides the voiced interval and the non-voiced interval using the duration threshold values so determined. Because the duration threshold values are determined from the per-frame feature value and the threshold value for the feature value, the constraint of the shaping rule can be made weaker when the per-frame feature value can be regarded as reliable and stronger when it cannot, allowing voice detection that does not depend on the noise environment.
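
One way to picture a duration threshold that depends on the ratio of the feature value to its threshold is sketched below. The mapping is an assumption made for illustration: the abstract says only that the ratio is used, not how. Here a ratio far from 1 (in either direction) is treated as a reliable frame decision and shortens the required duration, while a ratio near 1 keeps the full base duration. The function name `duration_threshold` and its parameters are hypothetical, and both inputs are assumed to be positive.

```python
import math


def duration_threshold(feature, feature_threshold, base_frames, min_frames=1):
    """Map the feature-to-threshold ratio to a duration threshold in frames."""
    # Assumed reliability measure: distance of the ratio from 1 on a log scale.
    reliability = abs(math.log(feature / feature_threshold))
    # Reliable frames relax the shaping constraint; ambiguous frames keep it.
    return max(min_frames, round(base_frames / (1.0 + reliability)))


print(duration_threshold(1.0, 1.0, base_frames=10))   # prints 10
print(duration_threshold(4.0, 1.0, base_frames=10))   # prints 4
print(duration_threshold(0.25, 1.0, base_frames=10))  # prints 4
```

With a fixed duration threshold, the same shaping rule is applied in quiet and noisy conditions alike; tying it to the ratio lets the rule loosen automatically when the per-frame evidence is clear.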

    Speech recognition device, speech recognition method, and computer readable medium
    27.
    Type: granted invention patent
    Status: in force

    Publication No.: US09245524B2

    Publication date: 2016-01-26

    Application No.: US13883716

    Filing date: 2011-11-10

    Abstract: The present invention can increase the types of noise that can be dealt with sufficiently to enable speech recognition with a high recognition rate. A speech recognition device of the present invention performs processes of: storing, in relation to each other, a suppression coefficient representing a noise suppression amount and an adaptation coefficient representing an adaptation amount of a noise model, where the noise model is generated on the basis of a predetermined noise and is to be compounded (synthesized) with a clean acoustic model generated on the basis of a voice including no noise; estimating noise from an input signal; suppressing, from the input signal, a portion of the estimated noise in the amount specified on the basis of the suppression coefficient; generating a noise-adapted acoustic model by compounding (synthesizing) the clean acoustic model with a noise model generated on the basis of the estimated noise, in accordance with an adaptation amount specified on the basis of the adaptation coefficient; and recognizing voice on the basis of the noise-suppressed input signal and the generated adapted acoustic model.

    Voice recognition system and voice recognition method
    28.
    Type: granted invention patent
    Status: in force

    Publication No.: US09002709B2

    Publication date: 2015-04-07

    Application No.: US13514894

    Filing date: 2010-11-26

    Applicant: Takayuki Arakawa

    Inventor: Takayuki Arakawa

    IPC classification: G10L17/00 G10L25/87 G10L15/04

    CPC classification: G10L25/87 G10L15/04

    Abstract: Provided is a voice recognition system capable of correctly estimating the utterance sections to be recognized while suppressing the negative influence of sounds that are not to be recognized. A voice segmenting means calculates voice feature values and segments voice sections or non-voice sections by comparing the voice feature values with a threshold value. The voice segmenting means then determines, as first voice sections, either those segmented sections or the sections obtained by adding a margin to the front and rear of each of them. On the basis of voice and non-voice likelihoods, a search means determines, as second voice sections, the sections to which voice recognition is to be applied. A parameter updating means updates the threshold value and the margin. The voice segmenting means determines the first voice sections by using whichever of the threshold value and the margin has been updated by the parameter updating means.

    Gain control system, gain control method, and gain control program
    29.
    Type: granted invention patent
    Status: in force

    Publication No.: US08401844B2

    Publication date: 2013-03-19

    Application No.: US12227902

    Filing date: 2007-01-16

    IPC classification: G10L19/12 G10L19/14 H03G3/00

    CPC classification: G10L15/065 G10L2015/025

    Abstract: Disclosed is a gain control system in which a speech model consisting of a sound pressure and a feature is stored in a speech model storage unit for each of a plurality of phonemes or for each of the clusters into which speech is divided. When an input signal is given, a feature conversion unit calculates a feature and a sound pressure of the input signal. A sound pressure comparison unit determines a sound pressure ratio between the input signal and each of the speech models. A distance calculation unit calculates a distance between the feature of the input signal and the feature of each of the speech models. A gain calculation unit calculates a gain value from the sound pressure ratio and information on the distance. A sound pressure compensation unit thereby compensates for the sound pressure of the input signal.
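
The gain calculation can be sketched as a distance-weighted combination of per-model sound pressure ratios. This is one plausible reading, not the formula from the patent: the direction of the ratio (model pressure divided by input pressure), the Euclidean distance, and the `exp(-distance)` weighting are all assumptions, as is the function name `gain_value`.

```python
import math


def gain_value(input_feature, input_pressure, models):
    """Combine per-model pressure ratios, weighting nearer models more heavily.

    models: list of (feature_vector, sound_pressure) pairs, one per phoneme
    or cluster (toy stand-ins for the stored speech models).
    """
    weights = []
    ratios = []
    for feature, pressure in models:
        distance = math.dist(input_feature, feature)  # Euclidean distance
        weights.append(math.exp(-distance))           # assumed weighting
        ratios.append(pressure / input_pressure)      # assumed ratio direction
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, ratios)) / total


# The input matches the first model closely, so its ratio dominates the gain.
models = [((0.0, 0.0), 4.0), ((10.0, 0.0), 1.0)]
gain = gain_value((0.0, 0.0), input_pressure=2.0, models=models)
compensated_pressure = gain * 2.0
print(gain, compensated_pressure)
```

Weighting by feature distance means a quiet consonant is compared against models of quiet consonants rather than against loud vowels, so the gain tracks what the sound should be instead of a single global level.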

    VOICE RECOGNITION DEVICE, VOICE RECOGNITION METHOD, AND VOICE RECOGNITION PROGRAM
    30.
    Type: invention application
    Status: in force

    Publication No.: US20100070277A1

    Publication date: 2010-03-18

    Application No.: US12528022

    Filing date: 2008-02-26

    IPC classification: G10L17/00

    CPC classification: G10L15/02

    Abstract: A voice recognition device that recognizes the voice of an input voice signal comprises: a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects, from the detail levels of the voice model stored in the voice model storage unit, the detail level closest to a feature property of the input voice signal; and a parameter setting unit that sets parameters for recognizing the input voice according to the detail level selected by the detail level selection unit.
