Voiced/unvoiced speech classifier
    1.
    Granted invention patent
    Voiced/unvoiced speech classifier (in force)

    Publication number: US06640208B1

    Publication date: 2003-10-28

    Application number: US09659318

    Filing date: 2000-09-12

    IPC classification: G10L11/06

    CPC classification: G10L25/93

    Abstract: A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
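
The frame-by-frame flow in this abstract can be illustrated with a rough sketch. It is not the patented method: the band-pass filter is omitted, and the decision rule (a relative-energy floor plus a minimum normalized autocorrelation peak) is a simplified stand-in for the comparison the abstract describes. All function names and the thresholds `energy_floor` and `peak_min` are assumptions made for illustration.

```python
import math

def frames(samples, size, hop):
    """Split a list of samples into overlapping frames of `size` samples."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

def energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def autocorr_peak(frame, min_lag, max_lag):
    """Return (peak, lag) of the energy-normalized autocorrelation in a lag range."""
    e0 = energy(frame)
    if e0 == 0.0:
        return 0.0, min_lag
    best, best_lag = -1.0, min_lag
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag)) / e0
        if r > best:
            best, best_lag = r, lag
    return best, best_lag

def classify(samples, rate=8000, size=240, hop=80,
             energy_floor=0.05, peak_min=0.5):
    """Label each frame True (voiced) or False (unvoiced).

    Assumed rule: a frame is voiced when its energy relative to the loudest
    frame is above `energy_floor` and its autocorrelation peak within a
    60-400 Hz pitch range is above `peak_min`. Both thresholds are hypothetical.
    """
    fs = frames(samples, size, hop)
    if not fs:
        return []
    energies = [energy(f) for f in fs]
    e_max = max(energies) or 1.0  # avoid dividing by zero on pure silence
    labels = []
    for f, e in zip(fs, energies):
        relative = e / e_max
        peak, _lag = autocorr_peak(f, rate // 400, rate // 60)
        labels.append(relative >= energy_floor and peak >= peak_min)
    return labels
```

A steady 200 Hz tone has a strong autocorrelation peak at its pitch period, so every frame is labeled voiced, while white noise has no such peak and is labeled unvoiced.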

    Tone based speech recognition
    2.
    Granted invention patent
    Tone based speech recognition (in force)

    Publication number: US06553342B1

    Publication date: 2003-04-22

    Application number: US09496868

    Filing date: 2000-02-02

    IPC classification: G10L15/02

    CPC classification: G10L15/02 G10L25/15

    Abstract: A method and apparatus for speech recognition involves classifying (38) a digitized speech segment according to whether the speech segment comprises voiced or unvoiced speech and utilizing that classification to generate tonal feature vectors (41) of the speech segment when the speech is voiced. The tonal feature vectors are then combined (42) with other non-tonal feature vectors (40) to provide speech feature vectors. The speech feature vectors are compared (35) with previously stored models of speech feature vectors (37) for different segments of speech to determine which previously stored model is a most likely match for the segment to be recognized.
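
As an illustrative sketch of the combination step, the code below appends tonal features to non-tonal features only for voiced frames and then picks the closest stored model. The choice of tonal features (pitch and its frame-to-frame change), the zero padding for unvoiced frames, and the squared-distance template matching are all assumptions; the abstract does not specify them, and a real recognizer would more likely score statistical models.

```python
def combine_features(non_tonal, pitch_hz, voiced):
    """Append per-frame tonal features [pitch, delta pitch] to non-tonal vectors.

    Unvoiced frames get [0.0, 0.0] (an assumed convention), and the pitch
    delta restarts at 0.0 after every unvoiced frame.
    """
    combined = []
    prev_pitch = None
    for feats, pitch, is_voiced in zip(non_tonal, pitch_hz, voiced):
        if is_voiced:
            delta = 0.0 if prev_pitch is None else pitch - prev_pitch
            tonal = [pitch, delta]
            prev_pitch = pitch
        else:
            tonal = [0.0, 0.0]
            prev_pitch = None
        combined.append(list(feats) + tonal)
    return combined

def best_model(features, models):
    """Return the name of the stored template closest to `features`.

    `models` maps a name to a template sequence of the same shape. Squared
    Euclidean distance is a hypothetical stand-in for model likelihood.
    """
    def distance(a, b):
        return sum((x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    return min(models, key=lambda name: distance(features, models[name]))
```

For example, two segments with identical non-tonal features but rising versus falling pitch map to different models, which is the point of adding tonal features for tonal languages.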

    Method and apparatus of increasing speech intelligibility in noisy environments
    4.
    Invention patent application
    Method and apparatus of increasing speech intelligibility in noisy environments (in force)

    Publication number: US20060270467A1

    Publication date: 2006-11-30

    Application number: US11137182

    Filing date: 2005-05-25

    IPC classification: H04B1/38 H04M1/00

    Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency-dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g., Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high-pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532), and used to reconstruct (532, 534) an audio signal.
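
The gain post-processing chain named in the abstract (combine, clip, normalize, smooth) might look roughly like the sketch below. Every concrete choice is an assumption: gains are treated as per-band values in dB and combined by addition, normalization subtracts the mean gain, frequency smoothing is a 3-point moving average, and time smoothing is a first-order recursive filter. The scaling by total SNR is left out because the abstract does not say how it is done.

```python
def band_gains_db(formant_db, highpass_db, max_db=12.0):
    """Combine, clip and normalize per-band gains (all values in dB).

    Assumptions: the two gain sets add in dB, clipping limits each band to
    the range [0, max_db], and normalization removes the mean gain.
    """
    combined = [f + h for f, h in zip(formant_db, highpass_db)]
    clipped = [min(max(g, 0.0), max_db) for g in combined]
    mean = sum(clipped) / len(clipped)
    return [g - mean for g in clipped]

def smooth_frequency(gains):
    """3-point moving average across neighbouring bands (shorter at the edges)."""
    out = []
    for i in range(len(gains)):
        window = gains[max(0, i - 1):i + 2]
        out.append(sum(window) / len(window))
    return out

def smooth_time(current, previous, alpha=0.5):
    """Blend the current frame's gains with the previous frame's gains."""
    return [alpha * c + (1.0 - alpha) * p for c, p in zip(current, previous)]
```

Smoothing in both directions limits abrupt gain changes between adjacent bands and between consecutive frames, which would otherwise be audible as artifacts.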

    APPARATUS AND METHOD FOR NOISE REMOVAL
    5.
    Invention patent application
    APPARATUS AND METHOD FOR NOISE REMOVAL (in force)

    Publication number: US20130158989A1

    Publication date: 2013-06-20

    Application number: US13330235

    Filing date: 2011-12-19

    IPC classification: G10L21/02

    Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.
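
One common way to realize a continuously updated, smoothed noise spectrum and use it for noise removal is recursive averaging followed by spectral subtraction. The sketch below uses that technique as an assumption; the abstract does not name a specific estimator or removal rule. Inputs are per-bin magnitude spectra that are assumed to have been computed already, and the function names and the `alpha` and `floor` parameters are hypothetical.

```python
def update_noise_estimate(estimate, noise_spectrum, alpha=0.9):
    """Recursively smooth the noise magnitude spectrum, one value per bin.

    A larger `alpha` gives a slower, smoother estimate.
    """
    return [alpha * e + (1.0 - alpha) * n for e, n in zip(estimate, noise_spectrum)]

def remove_noise(signal_spectrum, estimate, floor=0.0):
    """Subtract the noise estimate from the signal spectrum, bin by bin.

    Results are limited to `floor` so magnitudes never go negative.
    """
    return [max(s - e, floor) for s, e in zip(signal_spectrum, estimate)]
```

Calling both functions once per frame, in the same loop iteration, keeps the removal in step with the estimate, in the spirit of the time alignment the abstract describes.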

    Method of refining statistical pattern recognition models and statistical pattern recognizers
    6.
    Invention patent application
    Method of refining statistical pattern recognition models and statistical pattern recognizers (in force)

    Publication number: US20060136205A1

    Publication date: 2006-06-22

    Application number: US11018271

    Filing date: 2004-12-21

    Applicant: Jianming Song

    Inventor: Jianming Song

    IPC classification: G10L15/06

    Abstract: A device (800) performs statistical pattern recognition using model parameters that are refined by optimizing an objective function that includes terms for many items of training data for which recognition errors occur, wherein each term depends on a relative magnitude of a first score for a recognition result for an item of training data and a second score calculated by evaluating a statistical pattern recognition model, identified by a transcribed identity of the training data item, with feature vectors extracted from the item of training data. The objective function does not include terms for items of training data for which there is a gross discrepancy between a transcribed identity and a recognized identity. Gross discrepancies can be detected by probability score or pattern identity comparisons. Terms of the objective function are weighted based on the type of recognition error, and weights can be increased for high-priority patterns.
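
The structure of such an objective function can be sketched as follows. The sigmoid of the score gap is an assumed form for "relative magnitude" (it is common in discriminative training but is not stated in the abstract), gross discrepancies are detected here only by a score-gap threshold, and the item fields, the `weights` mapping, and `gross_gap` are hypothetical names.

```python
import math

def objective(items, weights, gross_gap=50.0):
    """Sum weighted error terms over misrecognized training items.

    Each item is a dict with keys: recognized, transcribed, recognized_score,
    transcribed_score, error_type. Items recognized correctly contribute
    nothing, and items whose score gap exceeds `gross_gap` are treated as
    gross discrepancies and excluded.
    """
    total = 0.0
    for item in items:
        if item["recognized"] == item["transcribed"]:
            continue  # no recognition error, so no term
        gap = item["recognized_score"] - item["transcribed_score"]
        if gap > gross_gap:
            continue  # gross discrepancy (e.g. a likely transcription error)
        weight = weights.get(item["error_type"], 1.0)
        total += weight * (1.0 / (1.0 + math.exp(-gap)))
    return total
```

Raising the weight for a given `error_type` makes the optimizer spend more effort on that kind of error, which mirrors the per-type weighting in the abstract.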

    Cohort model selection apparatus and method
    7.
    Granted invention patent
    Cohort model selection apparatus and method (lapsed)

    Publication number: US06393397B1

    Publication date: 2002-05-21

    Application number: US09332927

    Filing date: 1999-06-14

    IPC classification: G10L15/06

    CPC classification: G10L17/04 G10L17/12

    Abstract: An apparatus for selecting a cohort model for use in a speaker verification system includes a model generator (108) for determining a target speaker model (114) from a speech sample collected from the target speaker (106). A cohort selector (110) determines a similarity value between each of a number of predetermined existing speaker models from a model pool (112) and the target speaker model (114) and a dissimilarity value between each of the existing speaker models and any previously selected cohort models (116). An existing speaker model which is most similar to the target speaker model, but most dissimilar to previously chosen cohort models, is then chosen as another cohort model for the target speaker.
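
The selection rule reads naturally as a greedy loop, sketched below. How the two criteria are combined is an assumption (similarity to the target minus the highest similarity to any cohort already chosen), and `select_cohorts`, the dict-based pool, and the caller-supplied `similarity` function are hypothetical.

```python
def select_cohorts(target, pool, count, similarity):
    """Greedily pick up to `count` cohort model names from `pool`.

    `pool` maps a name to a model, and `similarity(a, b)` returns a larger
    value for more similar models. Each round picks the candidate that is
    close to the target but far from the cohorts selected so far.
    """
    cohorts = []
    candidates = dict(pool)
    while candidates and len(cohorts) < count:
        def score(name):
            to_target = similarity(candidates[name], target)
            to_cohorts = max(
                (similarity(candidates[name], pool[c]) for c in cohorts),
                default=0.0,
            )
            return to_target - to_cohorts
        best = max(candidates, key=score)
        cohorts.append(best)
        del candidates[best]
    return cohorts
```

With one-dimensional toy models and negative absolute difference as the similarity, a pool of `a = 1.0`, `b = 1.1`, `c = -2.0` and a target of `0.0` selects `a` first and then `c`: `b` is closer to the target than `c` but nearly duplicates `a`, so it adds little as a second cohort.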

    Apparatus and method for noise removal by spectral smoothing
    10.
    Granted invention patent
    Apparatus and method for noise removal by spectral smoothing (in force)

    Publication number: US08712769B2

    Publication date: 2014-04-29

    Application number: US13330235

    Filing date: 2011-12-19

    IPC classification: G10L21/0232

    Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.
