Multi-stage speech recognition apparatus and method
    2.
    Granted patent
    Legal status: in force

    Publication No.: US08762142B2

    Publication date: 2014-06-24

    Application No.: US11889665

    Filing date: 2007-08-15

    IPC classification: G10L15/02 G10L15/16 G10L15/32

    CPC classification: G10L15/32 G10L15/02 G10L15/16

    Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.
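
The two-stage flow in the abstract can be illustrated with a minimal sketch. It assumes each candidate word carries a first-pass log score and that the second stage supplies its own score per word; the `rescore` function, the `weight` interpolation factor, and the toy scores are all hypothetical and are not taken from the patent, which does not state how the two scores are combined.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (word, first-pass log score)


def rescore(
    candidates: List[Candidate],
    second_pass_score: Callable[[str], float],
    weight: float = 0.5,
) -> List[Candidate]:
    """Re-rank first-pass candidates by an interpolated score.

    `weight` is a hypothetical interpolation factor, not a value from the patent.
    """
    rescored = [
        (word, (1.0 - weight) * first + weight * second_pass_score(word))
        for word, first in candidates
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy numbers: the first pass prefers "ship", the second pass prefers "sheep".
first_pass = [("ship", -10.0), ("sheep", -10.5), ("chip", -12.0)]
second_pass = {"ship": -9.0, "sheep": -6.0, "chip": -11.0}
ranked = rescore(first_pass, second_pass.__getitem__)
```

In this toy run the second stage overturns the first-pass ranking, which is the point of rescoring a candidate list rather than trusting the initial decoder alone.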

    Multi-stage speech recognition apparatus and method
    4.
    Patent application
    Legal status: in force

    Publication No.: US20080208577A1

    Publication date: 2008-08-28

    Application No.: US11889665

    Filing date: 2007-08-15

    IPC classification: G10L15/00

    CPC classification: G10L15/32 G10L15/02 G10L15/16

    Abstract: Provided are a multi-stage speech recognition apparatus and method. The multi-stage speech recognition apparatus includes a first speech recognition unit performing initial speech recognition on a feature vector, which is extracted from an input speech signal, and generating a plurality of candidate words; and a second speech recognition unit rescoring the candidate words, which are provided by the first speech recognition unit, using a temporal posterior feature vector extracted from the speech signal.

    Apparatus and method for detecting voice activity period
    6.
    Granted patent
    Legal status: in force

    Publication No.: US07711558B2

    Publication date: 2010-05-04

    Application No.: US11472304

    Filing date: 2006-06-22

    IPC classification: G10L21/02

    CPC classification: G10L25/78

    Abstract: An apparatus and method for detecting a voice activity period. The apparatus for detecting a voice activity period includes a domain conversion module that converts an input signal into a frequency domain signal in the unit of a frame obtained by dividing the input signal at predetermined intervals, a subtracted-spectrum-generation module that generates a spectral subtraction signal which is obtained by subtracting a predetermined noise spectrum from the converted frequency domain signal, a modeling module that applies the spectral subtraction signal to a predetermined probability distribution model, and a speech-detection module that determines whether a speech signal is present in a current frame through a probability distribution calculated by the modeling module.
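
The four modules named in the abstract map onto a short per-frame pipeline, sketched below. The abstract does not say which probability distribution model is used, so the two exponential densities in `speech_probability`, their means, the decision threshold, and every function name here are assumptions made only for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Domain conversion: naive DFT magnitudes for bins 0..n/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_subtraction(spectrum, noise_spectrum):
    """Subtract the noise estimate per bin, flooring the result at zero."""
    return [max(s - v, 0.0) for s, v in zip(spectrum, noise_spectrum)]


def speech_probability(residual, noise_mean=0.5, speech_mean=5.0):
    """Posterior of 'speech' under two assumed exponential models, equal priors."""
    x = sum(residual) / len(residual)
    p_noise = math.exp(-x / noise_mean) / noise_mean
    p_speech = math.exp(-x / speech_mean) / speech_mean
    return p_speech / (p_speech + p_noise)


def is_speech(frame, noise_spectrum, threshold=0.5):
    """Speech detection for one frame (threshold is a hypothetical value)."""
    residual = spectral_subtraction(magnitude_spectrum(frame), noise_spectrum)
    return speech_probability(residual) > threshold


# Toy frames: a 16-sample tone and a silent frame, with a flat noise estimate.
noise = [0.2] * 9
tone = [2.0 * math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
silence = [0.0] * 16
```

Here `is_speech(tone, noise)` flags the tone frame while `is_speech(silence, noise)` does not, because the silent frame leaves no energy after subtraction.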

    Apparatus and method for detecting voice activity period
    7.
    Patent application
    Legal status: in force

    Publication No.: US20070073537A1

    Publication date: 2007-03-29

    Application No.: US11472304

    Filing date: 2006-06-22

    IPC classification: G10L15/20

    CPC classification: G10L25/78

    Abstract: An apparatus and method for detecting a voice activity period. The apparatus for detecting a voice activity period includes a domain conversion module that converts an input signal into a frequency domain signal in the unit of a frame obtained by dividing the input signal at predetermined intervals, a subtracted-spectrum-generation module that generates a spectral subtraction signal which is obtained by subtracting a predetermined noise spectrum from the converted frequency domain signal, a modeling module that applies the spectral subtraction signal to a predetermined probability distribution model, and a speech-detection module that determines whether a speech signal is present in a current frame through a probability distribution calculated by the modeling module.

    Apparatus for positioning screen sound source, method of generating loudspeaker set information, and method of reproducing positioned screen sound source
    8.
    Granted patent
    Legal status: in force

    Publication No.: US08208663B2

    Publication date: 2012-06-26

    Application No.: US12482883

    Filing date: 2009-06-11

    IPC classification: H04R5/02

    CPC classification: H04R5/04

    Abstract: An apparatus for positioning a screen sound source, a method of generating loudspeaker set information for screen sound source positioning, and a method of reproducing a positioned screen sound source are provided. The apparatus and methods relate to a screen sound source positioning technique. A plurality of loudspeakers, each configured to have approximately the same gain, are each disposed proximate to the edge of a display, and a loudspeaker set including at least two of the loudspeakers is selected to position a virtual sound source substantially synchronized with a visual object displayed at a position on the screen of the display. Accordingly, a virtual sound source may be positioned at a certain specific position on the screen of a display without sound source distortion.
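
One way to picture selecting a loudspeaker set for an on-screen position is sketched below. The abstract does not describe the selection rule or the gain law, so this example swaps in a plain stand-in: pick the loudspeaker pair whose connecting line passes closest to the visual object, then apply constant-power amplitude panning between the two. The loudspeaker layout, normalized screen coordinates, and all function names are hypothetical.

```python
import math
from itertools import combinations


def project(pos_a, pos_b, target):
    """Return (fraction along a->b clamped to [0, 1], distance from target to that point)."""
    ax, ay = pos_a
    bx, by = pos_b
    tx, ty = target
    dx, dy = bx - ax, by - ay
    f = ((tx - ax) * dx + (ty - ay) * dy) / (dx * dx + dy * dy)
    f = min(max(f, 0.0), 1.0)
    return f, math.hypot(tx - (ax + f * dx), ty - (ay + f * dy))


def select_set(speakers, target):
    """Pick the pair whose connecting segment passes closest to the target."""
    return min(
        combinations(sorted(speakers), 2),
        key=lambda pair: project(speakers[pair[0]], speakers[pair[1]], target)[1],
    )


def panning_gains(pos_a, pos_b, target):
    """Constant-power gains (g_a, g_b) that place the source between a and b."""
    f, _ = project(pos_a, pos_b, target)
    return math.cos(f * math.pi / 2), math.sin(f * math.pi / 2)


# Hypothetical layout: one loudspeaker at the middle of each display edge.
speakers = {"left": (0.0, 0.5), "right": (1.0, 0.5), "top": (0.5, 1.0), "bottom": (0.5, 0.0)}
target = (0.25, 0.5)  # visual object position in normalized screen coordinates
pair = select_set(speakers, target)
g_a, g_b = panning_gains(speakers[pair[0]], speakers[pair[1]], target)
```

With this layout the left and right loudspeakers are chosen and the left one receives the larger gain, since the object sits a quarter of the way across the screen.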

    APPARATUS FOR POSITIONING SCREEN SOUND SOURCE, METHOD OF GENERATING LOUDSPEAKER SET INFORMATION, AND METHOD OF REPRODUCING POSITIONED SCREEN SOUND SOURCE
    9.
    Patent application
    Legal status: in force

    Publication No.: US20100111336A1

    Publication date: 2010-05-06

    Application No.: US12482883

    Filing date: 2009-06-11

    IPC classification: H04R5/02

    CPC classification: H04R5/04

    Abstract: An apparatus for positioning a screen sound source, a method of generating loudspeaker set information for screen sound source positioning, and a method of reproducing a positioned screen sound source are provided. The apparatus and methods relate to a screen sound source positioning technique. A plurality of loudspeakers, each configured to have approximately the same gain, are each disposed proximate to the edge of a display, and a loudspeaker set including at least two of the loudspeakers is selected to position a virtual sound source substantially synchronized with a visual object displayed at a position on the screen of the display. Accordingly, a virtual sound source may be positioned at a certain specific position on the screen of a display without sound source distortion.

    Apparatus and method for speech recognition using a plurality of confidence score estimation algorithms
    10.
    Patent application
    Legal status: in force

    Publication No.: US20070136058A1

    Publication date: 2007-06-14

    Application No.: US11517369

    Filing date: 2006-09-08

    IPC classification: G10L15/00

    CPC classification: G10L15/08 G10L2015/088

    Abstract: An apparatus for speech recognition includes: a first confidence score calculator calculating a first confidence score using a ratio between a likelihood of a keyword model for feature vectors per frame of a speech signal and a likelihood of a Filler model for the feature vectors; a second confidence score calculator calculating a second confidence score by comparing a Gaussian distribution trace of the keyword model per frame of the speech signal with a Gaussian distribution trace sample of a stored corresponding keyword of the keyword model; and a determination module determining a confidence of a result using the keyword model in accordance with a position determined by the first and second confidence scores on a confidence coordinate system.
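
The combination of two confidence scores on a coordinate plane can be sketched as follows. The first score is taken as the mean per-frame log-likelihood difference between keyword and filler models (the logarithm of the likelihood ratio named in the abstract). The trace comparison by per-frame match rate, the linear decision boundary, its coefficients, and all names are assumptions for illustration; the abstract does not specify them.

```python
def likelihood_ratio_score(keyword_loglik, filler_loglik):
    """First score: mean per-frame log-likelihood ratio (keyword vs. filler)."""
    diffs = [k - f for k, f in zip(keyword_loglik, filler_loglik)]
    return sum(diffs) / len(diffs)


def trace_similarity(trace, stored_trace):
    """Second score: share of frames whose Gaussian index matches the stored sample."""
    matches = sum(1 for a, b in zip(trace, stored_trace) if a == b)
    return matches / max(len(trace), len(stored_trace))


def accept(first, second, w1=1.0, w2=2.0, bias=-1.5):
    """Accept if the point (first, second) lies above a hypothetical linear boundary."""
    return w1 * first + w2 * second + bias > 0


# Toy per-frame values; Gaussian traces are sequences of component indices.
first = likelihood_ratio_score([-4.0, -3.5, -5.0], [-5.0, -4.5, -5.5])
second = trace_similarity([3, 3, 7, 2], [3, 3, 7, 5])
decision = accept(first, second)
```

Treating the two scores as coordinates lets a single boundary trade one kind of evidence against the other, instead of thresholding each score separately.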
