Apparatus, method, and medium for distinguishing vocal sound from other sounds
    1.
    Granted invention patent (status: lapsed)

    Publication number: US08078455B2

    Publication date: 2011-12-13

    Application number: US11051475

    Filing date: 2005-02-07

    CPC classification number: G10L25/93

    Abstract: An apparatus, method, and medium for distinguishing a vocal sound. The apparatus includes: a framing unit dividing an input signal into frames, each frame having a predetermined length; a pitch extracting unit determining whether each frame is a voiced frame or an unvoiced frame and extracting a pitch contour from the voiced and unvoiced frames; a zero-cross rate calculator respectively calculating a zero-cross rate for each frame; a parameter calculator calculating parameters including a time length ratio of the voiced frame and the unvoiced frame determined by the pitch extracting unit, statistical information of the pitch contour, and spectral characteristics; and a classifier inputting the zero-cross rates and the parameters output from the parameter calculator and determining whether the input signal is a vocal sound.
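
    Illustration: the framing step, the per-frame zero-cross rate, and the voiced/unvoiced time length ratio named in the abstract can be sketched as follows. This is a minimal sketch, not the patented implementation. The function names, the non-overlapping framing, and the convention that a zero sample counts as non-negative are all assumptions made for the example. The voiced/unvoiced flags are taken as given, since the abstract does not say how the pitch extracting unit decides them.

```python
import math


def split_into_frames(signal, frame_len):
    """Divide a signal into non-overlapping frames of a fixed length.

    Assumption: a trailing partial frame is dropped.
    """
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def zero_cross_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def voiced_unvoiced_ratio(voiced_flags):
    """Ratio of voiced to unvoiced frame counts.

    With equal-length frames this equals the time length ratio.
    Returns math.inf when there are no unvoiced frames.
    """
    voiced = sum(1 for flag in voiced_flags if flag)
    unvoiced = len(voiced_flags) - voiced
    return voiced / unvoiced if unvoiced else math.inf


if __name__ == "__main__":
    # Hypothetical input: one second of a 100 Hz sine sampled at 8 kHz.
    sample_rate = 8000
    signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]
    frames = split_into_frames(signal, 160)
    rates = [zero_cross_rate(frame) for frame in frames]
    print(len(frames), min(rates), max(rates))
```

    A classifier such as the one the abstract describes would take these rates, together with pitch-contour statistics and spectral features, as its input vector; that stage is not sketched here.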


    Formant tracking apparatus and formant tracking method
    2.
    Granted invention patent (status: lapsed)

    Publication number: US07756703B2

    Publication date: 2010-07-13

    Application number: US11247219

    Filing date: 2005-10-12

    CPC classification number: G10L25/48 G10L25/15 G10L2025/906

    Abstract: A formant tracking apparatus and a formant tracking method are provided. The formant tracking apparatus includes: a framing unit dividing an input voice signal into a plurality of frames; a linear prediction analyzing unit obtaining linear prediction coefficients for each frame; a segmentation unit segmenting each of the linear prediction coefficients into a plurality of segments; a formant candidate determining unit obtaining formant candidates by using the linear prediction coefficients, and summing the formant candidates for each segment to determine formant candidates for each segment; a formant number determining unit determining, for each segment, a number of tracking formants among the formant candidates satisfying a predetermined condition; and a tracking unit searching, among the formant candidates belonging to each segment, for as many tracking formants as the number determined by the formant number determining unit.
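
    Illustration: the step of obtaining formant candidates from linear prediction coefficients can be sketched for a single frame as follows. This is a rough sketch under stated assumptions, not the patented method. The abstract does not say how candidates are derived from the coefficients, so this example substitutes a common textbook approach: autocorrelation-method LPC solved with the Levinson-Durbin recursion, followed by peak picking on the LPC spectral envelope. All function names, the default order, and the grid size are hypothetical. Segmentation, the formant number decision, and tracking are not covered.

```python
import cmath
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation of a frame for lags 0..max_lag (unnormalized sums)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return [1, a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # degenerate (e.g. silent) frame: stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def formant_candidates(frame, sample_rate, order=8, grid=512):
    """Frequencies (Hz) of local maxima of the LPC envelope 1/|A(e^jw)|.

    Assumption: envelope peaks stand in for formant candidates.
    """
    a = levinson_durbin(autocorrelation(frame, order), order)
    envelope = []
    for k in range(grid):
        w = math.pi * k / grid  # angular frequency from 0 up to (not including) pi
        a_w = sum(a[j] * cmath.exp(-1j * w * j) for j in range(order + 1))
        envelope.append(1.0 / max(abs(a_w), 1e-12))
    peaks = [k for k in range(1, grid - 1)
             if envelope[k - 1] < envelope[k] >= envelope[k + 1]]
    return [k * sample_rate / (2 * grid) for k in peaks]


if __name__ == "__main__":
    # Hypothetical test frame with spectral peaks near 500 Hz and 1500 Hz.
    fs = 8000
    frame = [math.sin(2 * math.pi * 500 * n / fs) + math.sin(2 * math.pi * 1500 * n / fs)
             for n in range(240)]
    print(formant_candidates(frame, fs, order=4))
```

    Peak picking is used here because it needs only the standard library; solving for the roots of A(z) is another common way to get candidates.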


    Method and apparatus for estimating pitch of signal
    3.
    Granted invention patent (status: in force)

    Publication number: US07672836B2

    Publication date: 2010-03-02

    Application number: US11247277

    Filing date: 2005-10-12

    CPC classification number: G10L25/90

    Abstract: A pitch estimating method and apparatus in which mixture Gaussian distributions based on candidate pitches having high period estimating values are generated, a mixture Gaussian distribution having a high likelihood is selected, and dynamic programming is executed so that the pitch of the speech signal can be accurately estimated. The pitch estimating method comprises: computing a normalized autocorrelation function of a windowed signal obtained by multiplying a frame of a speech signal by a window signal, and determining candidate pitches from a peak value of the normalized autocorrelation function of the windowed signal; interpolating a period of the determined candidate pitches and a period estimating value representing a length of the period; generating Gaussian distributions for the candidate pitches for each frame for which the interpolated period estimating value is greater than a first threshold value; mixing the Gaussian distributions which are located at a distance less than a second threshold value to generate mixture Gaussian distributions, and selecting at least one of the mixture Gaussian distributions having a likelihood exceeding a third threshold value; and executing dynamic programming for the frames to estimate the pitch of each frame, based on the candidate pitches of each of the frames and the selected mixture Gaussian distributions.
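
    Illustration: the first step of the method, picking candidate pitches from peaks of the normalized autocorrelation of a windowed frame, can be sketched as follows. This is a minimal sketch under assumptions, not the patented procedure. The Hann window, the normalization by the lag-0 value, the pitch search range, and the 0.3 peak threshold are all choices made for the example, and the function names are hypothetical. The interpolation, Gaussian mixture, and dynamic programming stages are not covered.

```python
import math


def hann_window(n):
    """Hann window of length n (assumed window shape; n must be >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def normalized_autocorrelation(x, max_lag):
    """Autocorrelation for lags 0..max_lag, divided by the lag-0 value."""
    r0 = sum(v * v for v in x)
    if r0 == 0:
        return [0.0] * (max_lag + 1)
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / r0
            for lag in range(max_lag + 1)]


def candidate_pitches(frame, sample_rate, f_min=60.0, f_max=400.0, threshold=0.3):
    """Return (frequency_hz, strength) pairs, strongest first.

    A candidate is a local maximum of the normalized autocorrelation
    whose value exceeds `threshold` (an assumed, illustrative value).
    """
    if len(frame) < 3:
        return []
    windowed = [s * w for s, w in zip(frame, hann_window(len(frame)))]
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(frame) - 2)
    # One extra lag is computed so the peak test can look at lag + 1.
    nac = normalized_autocorrelation(windowed, max_lag + 1)
    candidates = []
    for lag in range(min_lag, max_lag + 1):
        is_peak = nac[lag - 1] < nac[lag] >= nac[lag + 1]
        if is_peak and nac[lag] > threshold:
            candidates.append((sample_rate / lag, nac[lag]))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical input: 50 ms of a 200 Hz sine sampled at 8 kHz.
    fs = 8000
    frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
    for freq, strength in candidate_pitches(frame, fs):
        print(round(freq, 1), round(strength, 3))
```

    Peaks at multiples of the true period show up as weaker candidates, which is one reason a later stage, such as the dynamic programming the abstract describes, is needed to choose among them.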

