Method and apparatus for estimating pitch of signal
    1.
    Granted invention patent
    Legal status: in force

    Publication No.: US07672836B2

    Publication date: 2010-03-02

    Application No.: US11247277

    Filing date: 2005-10-12

    IPC classification: G10L11/04

    CPC classification: G10L25/90

    Abstract: A pitch estimating method and apparatus in which mixture Gaussian distributions based on candidate pitches having high period estimating values are generated, a mixture Gaussian distribution having a high likelihood is selected, and dynamic programming is executed so that the pitch of the speech signal can be accurately estimated. The pitch estimating method comprises: computing a normalized autocorrelation function of a windowed signal obtained by multiplying a frame of a speech signal by a window signal, and determining candidate pitches from a peak value of the normalized autocorrelation function of the windowed signal; interpolating a period of the determined candidate pitches and a period estimating value representing a length of the period; generating Gaussian distributions for the candidate pitches for each frame for which the interpolated period estimating value is greater than a first threshold value; mixing the Gaussian distributions which are located at a distance less than a second threshold value to generate mixture Gaussian distributions, and selecting at least one of the mixture Gaussian distributions that has a likelihood exceeding a third threshold value; and executing dynamic programming for the frames to estimate the pitch of each frame, based on the candidate pitches of each of the frames and the selected mixture Gaussian distributions.
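
    Illustrative sketch (not taken from the patent): the first two steps of the abstract, namely the normalized autocorrelation of a windowed frame and the interpolation of candidate periods, might look roughly like the following. The Hamming window, the 60 to 400 Hz search range, the 0.5 threshold, the energy-based normalization and the parabolic interpolation are all assumptions made for this example; the abstract does not specify them. The Gaussian mixture and dynamic programming stages are not shown.

```python
import math


def hamming(n):
    """Hamming window of length n (assumed; the abstract only says 'a window signal')."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def normalized_autocorrelation(x, max_lag):
    """Return r[0..max_lag], each lag normalized by the energies of the two overlapping segments."""
    n = len(x)
    r = []
    for k in range(max_lag + 1):
        num = sum(x[i] * x[i + k] for i in range(n - k))
        e0 = sum(x[i] * x[i] for i in range(n - k))
        e1 = sum(x[i + k] * x[i + k] for i in range(n - k))
        denom = math.sqrt(e0 * e1)
        r.append(num / denom if denom > 0 else 0.0)
    return r


def candidate_pitches(frame, sample_rate, f_min=60.0, f_max=400.0, threshold=0.5):
    """Return (frequency_hz, strength) pairs for autocorrelation peaks above the threshold."""
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    min_lag = max(int(sample_rate / f_max), 1)
    max_lag = int(sample_rate / f_min)
    r = normalized_autocorrelation(windowed, max_lag + 1)
    candidates = []
    for k in range(min_lag, max_lag + 1):
        a, b, c = r[k - 1], r[k], r[k + 1]
        if b > threshold and b >= a and b > c:
            # Parabolic interpolation refines both the period and the peak height.
            offset = 0.5 * (a - c) / (a - 2 * b + c)
            period = k + offset
            strength = b - 0.25 * (a - c) * offset
            candidates.append((sample_rate / period, strength))
    # Strongest candidate first.
    return sorted(candidates, key=lambda cand: -cand[1])


if __name__ == "__main__":
    sr = 8000
    tone = [math.sin(2 * math.pi * 100 * i / sr) for i in range(400)]
    for freq, strength in candidate_pitches(tone, sr):
        print(f"{freq:.1f} Hz, strength {strength:.2f}")
```

    Because the window tapers the frame, the peak of the normalized autocorrelation can sit slightly below the true period, so the estimate for a pure tone is close to, but not exactly, its frequency.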

    Apparatus, method, and medium for distinguishing vocal sound from other sounds
    2.
    Invention patent application
    Legal status: lapsed

    Publication No.: US20050187761A1

    Publication date: 2005-08-25

    Application No.: US11051475

    Filing date: 2005-02-07

    CPC classification: G10L25/93

    Abstract: An apparatus, method, and medium for distinguishing a vocal sound. The apparatus includes: a framing unit dividing an input signal into frames, each frame having a predetermined length; a pitch extracting unit determining whether each frame is a voiced frame or an unvoiced frame and extracting a pitch contour from the voiced and unvoiced frames; a zero-cross rate calculator calculating a zero-cross rate for each frame; a parameter calculator calculating parameters including a time length ratio of the voiced frame and the unvoiced frame determined by the pitch extracting unit, statistical information of the pitch contour, and spectral characteristics; and a classifier receiving the zero-cross rates and the parameters output from the parameter calculator and determining whether the input signal is a vocal sound.
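
    Illustrative sketch (not taken from the patent): the framing unit and the zero-cross rate calculator named in the abstract could be approximated as below. The non-overlapping framing and the sign-change definition of a crossing are assumptions; the abstract only states that frames have a predetermined length and that a zero-cross rate is computed per frame.

```python
def split_into_frames(signal, frame_len):
    """Cut the signal into non-overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def zero_cross_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as non-negative)."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


if __name__ == "__main__":
    signal = [1, -1, 1, -1, 1, 2, 3, 4]
    for frame in split_into_frames(signal, 4):
        print(frame, zero_cross_rate(frame))
```

    Noise-like, unvoiced sounds tend to give a high rate and voiced sounds a low one, which is why the rate is a useful input to the classifier alongside the pitch-based parameters.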

    Formant tracking apparatus and formant tracking method
    3.
    Invention patent application
    Legal status: lapsed

    Publication No.: US20060111898A1

    Publication date: 2006-05-25

    Application No.: US11247219

    Filing date: 2005-10-12

    IPC classification: G10L11/04

    Abstract: A formant tracking apparatus and a formant tracking method are provided. The formant tracking apparatus includes: a framing unit dividing an input voice signal into a plurality of frames; a linear prediction analyzing unit obtaining linear prediction coefficients for each frame; a segmentation unit segmenting each of the linear prediction coefficients into a plurality of segments; a formant candidate determining unit obtaining formant candidates by using the linear prediction coefficients, and summing the formant candidates for each segment to determine formant candidates for each segment; a formant number determining unit determining a number of tracking formants for each segment among the formant candidates satisfying a predetermined condition; and a tracking unit searching, among the formant candidates belonging to each segment, for as many tracking formants as the number determined by the formant number determining unit.
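
    Illustrative sketch (not taken from the patent): one common way to obtain linear prediction coefficients for a frame is the autocorrelation method with the Levinson-Durbin recursion, and one simple way to derive formant candidates from them is to pick the peaks of the resulting spectral envelope. Both choices, and the function names and grid size, are assumptions for this example; the abstract does not say how the coefficients or the candidates are computed. Segmentation, formant counting and tracking are not shown.

```python
import cmath
import math


def autocorr(x, order):
    """Autocorrelation values r[0..order] of a frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k)) for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for a[0..order] (a[0] = 1) of A(z) = 1 + sum_k a[k] z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # degenerate frame (for example, silence)
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def formant_candidates(a, sample_rate, n_points=512):
    """Frequencies (Hz) of local maxima of the envelope 1/|A(e^jw)| between 0 and Nyquist."""
    envelope = []
    for m in range(n_points):
        w = math.pi * m / n_points
        value = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        envelope.append(1.0 / max(abs(value), 1e-12))
    peaks = [m for m in range(1, n_points - 1)
             if envelope[m - 1] < envelope[m] >= envelope[m + 1]]
    return [m * sample_rate / (2 * n_points) for m in peaks]


if __name__ == "__main__":
    # A single two-pole resonance near 1000 Hz at an 8 kHz sampling rate.
    theta = 2 * math.pi * 1000 / 8000
    coeffs = [1.0, -2 * 0.95 * math.cos(theta), 0.95 ** 2]
    print(formant_candidates(coeffs, 8000))
```

    In practice the coefficients would come from `levinson_durbin(autocorr(frame, order), order)` for each frame; the demo uses a hand-built resonator so the expected peak location is easy to see.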

    Apparatus, method, and medium for distinguishing vocal sound from other sounds
    4.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US08078455B2

    Publication date: 2011-12-13

    Application No.: US11051475

    Filing date: 2005-02-07

    IPC classification: G10L11/06

    CPC classification: G10L25/93

    Abstract: An apparatus, method, and medium for distinguishing a vocal sound. The apparatus includes: a framing unit dividing an input signal into frames, each frame having a predetermined length; a pitch extracting unit determining whether each frame is a voiced frame or an unvoiced frame and extracting a pitch contour from the voiced and unvoiced frames; a zero-cross rate calculator calculating a zero-cross rate for each frame; a parameter calculator calculating parameters including a time length ratio of the voiced frame and the unvoiced frame determined by the pitch extracting unit, statistical information of the pitch contour, and spectral characteristics; and a classifier receiving the zero-cross rates and the parameters output from the parameter calculator and determining whether the input signal is a vocal sound.

    Formant tracking apparatus and formant tracking method
    5.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US07756703B2

    Publication date: 2010-07-13

    Application No.: US11247219

    Filing date: 2005-10-12

    IPC classification: G10L19/06 G10L19/00

    Abstract: A formant tracking apparatus and a formant tracking method are provided. The formant tracking apparatus includes: a framing unit dividing an input voice signal into a plurality of frames; a linear prediction analyzing unit obtaining linear prediction coefficients for each frame; a segmentation unit segmenting each of the linear prediction coefficients into a plurality of segments; a formant candidate determining unit obtaining formant candidates by using the linear prediction coefficients, and summing the formant candidates for each segment to determine formant candidates for each segment; a formant number determining unit determining a number of tracking formants for each segment among the formant candidates satisfying a predetermined condition; and a tracking unit searching, among the formant candidates belonging to each segment, for as many tracking formants as the number determined by the formant number determining unit.

    Multi-layered speech recognition apparatus and method
    6.
    Granted invention patent
    Legal status: in force

    Publication No.: US08370159B2

    Publication date: 2013-02-05

    Application No.: US11120983

    Filing date: 2005-05-04

    IPC classification: G10L21/00

    CPC classification: G10L15/08 G10L15/30 G10L15/32

    Abstract: A multi-layered speech recognition apparatus and method. The apparatus includes: a client checking whether the client recognizes the speech using a characteristic of speech to be recognized, and recognizing the speech or transmitting the characteristic of the speech according to a checked result; and first through N-th servers, wherein the first server checks whether the first server recognizes the speech using the characteristic of the speech transmitted from the client, and recognizes the speech or transmits the characteristic according to a checked result, and wherein an n-th (2≦n≦N) server checks whether the n-th server recognizes the speech using the characteristic of the speech transmitted from an (n−1)-th server, and recognizes the speech or transmits the characteristic according to a checked result.
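
    Illustrative sketch (not taken from the patent): the layered hand-off described in the abstract, where each layer either recognizes the speech itself or passes the characteristic on to the next layer, can be modeled as a simple chain. The `Layer` class, the `recognize_multilayer` function and the toy capability checks based on feature length are hypothetical names and rules invented for this example; real layers would run on separate machines and exchange the characteristic over a network.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Features = Sequence[float]


@dataclass
class Layer:
    """One recognition layer: the client or one of the N servers."""
    name: str
    can_recognize: Callable[[Features], bool]  # the per-layer check from the abstract
    recognize: Callable[[Features], str]


def recognize_multilayer(features: Features,
                         layers: List[Layer]) -> Optional[Tuple[str, str]]:
    """Try each layer in order; the first layer that can handle the features recognizes them."""
    for layer in layers:
        if layer.can_recognize(features):
            return layer.name, layer.recognize(features)
        # Otherwise the characteristic is forwarded to the next layer.
    return None


# Toy layers: capability grows from the client to the last server (assumed rule).
LAYERS = [
    Layer("client", lambda f: len(f) <= 3, lambda f: "short command"),
    Layer("server 1", lambda f: len(f) <= 10, lambda f: "phrase"),
    Layer("server 2", lambda f: True, lambda f: "free-form sentence"),
]

if __name__ == "__main__":
    for n in (2, 5, 50):
        print(n, recognize_multilayer([0.1] * n, LAYERS))
```

    Only the characteristic is passed along, never the recognition result of an earlier layer, which matches the abstract's wording that each layer "recognizes the speech or transmits the characteristic".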

    Apparatus, method, and medium for detecting and discriminating impact sound
    7.
    Granted invention patent
    Legal status: lapsed

    Publication No.: US07234340B2

    Publication date: 2007-06-26

    Application No.: US11050806

    Filing date: 2005-02-07

    IPC classification: G01M13/00

    CPC classification: G01H1/00

    Abstract: An apparatus, method, and medium for detecting an impact sound, and an apparatus, method, and medium for discriminating the impact sound using the same. The impact sound detecting apparatus includes: an onset detector separating an input signal of a frame unit into a low frequency signal and a high frequency signal, measuring powers of the separated signals, and detecting onsets by detecting changes in the measured powers; an event buffer buffering the powers measured by the onset detector and spectral data of the input signal; and an impact sound verifier determining whether each of the detected onsets is an impulse onset, and if each of the detected onsets is the impulse onset, detecting events starting from the impulse onsets by checking the powers stored in the event buffer and determining each of the detected events to be an impulse event if each of the detected onsets satisfies a predetermined condition.
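
    Illustrative sketch (not taken from the patent): the onset detector in the abstract splits each frame into a low and a high frequency signal, measures their powers and looks for changes. A minimal version is shown below. The one-pole low-pass filter, the residual used as the high band, the power-ratio rule and every constant are assumptions for this example; the event buffer and the impact sound verifier are not shown.

```python
def band_powers(frame, state=0.0, alpha=0.1):
    """Split a frame with a one-pole low-pass filter and return (low_power, high_power, state)."""
    low_power = 0.0
    high_power = 0.0
    for sample in frame:
        state += alpha * (sample - state)   # low-pass output
        residual = sample - state           # what the low-pass removed (high band)
        low_power += state * state
        high_power += residual * residual
    n = max(len(frame), 1)
    return low_power / n, high_power / n, state


def detect_onsets(frames, ratio=4.0, floor=1e-6):
    """Return indices of frames whose low or high band power jumps by more than `ratio`."""
    onsets = []
    prev_low = prev_high = None
    state = 0.0
    for index, frame in enumerate(frames):
        low, high, state = band_powers(frame, state)
        if prev_low is not None:
            if low > ratio * max(prev_low, floor) or high > ratio * max(prev_high, floor):
                onsets.append(index)
        prev_low, prev_high = low, high
    return onsets


if __name__ == "__main__":
    silence = [0.0] * 16
    burst = [1.0, -1.0] * 8
    frames = [silence, silence, silence, burst, silence, silence]
    print(detect_onsets(frames))  # the burst is in frame 3
```

    The `floor` parameter keeps the ratio test from firing on numerical noise when the previous frame was silent.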

    Apparatus, method, and medium for detecting and discriminating impact sound
    8.
    Invention patent application
    Legal status: lapsed

    Publication No.: US20050199064A1

    Publication date: 2005-09-15

    Application No.: US11050806

    Filing date: 2005-02-07

    CPC classification: G01H1/00

    Abstract: An apparatus, method, and medium for detecting an impact sound, and an apparatus, method, and medium for discriminating the impact sound using the same. The impact sound detecting apparatus includes: an onset detector separating an input signal of a frame unit into a low frequency signal and a high frequency signal, measuring powers of the separated signals, and detecting onsets by detecting changes in the measured powers; an event buffer buffering the powers measured by the onset detector and spectral data of the input signal; and an impact sound verifier determining whether each of the detected onsets is an impulse onset, and if each of the detected onsets is the impulse onset, detecting events starting from the impulse onsets by checking the powers stored in the event buffer and determining each of the detected events to be an impulse event if each of the detected onsets satisfies a predetermined condition.

    Multi-layered speech recognition apparatus and method
    9.
    Invention patent application
    Legal status: in force

    Publication No.: US20060080105A1

    Publication date: 2006-04-13

    Application No.: US11120983

    Filing date: 2005-05-04

    IPC classification: G10L21/00

    CPC classification: G10L15/08 G10L15/30 G10L15/32

    Abstract: A multi-layered speech recognition apparatus and method. The apparatus includes: a client checking whether the client recognizes the speech using a characteristic of speech to be recognized, and recognizing the speech or transmitting the characteristic of the speech according to a checked result; and first through N-th servers, wherein the first server checks whether the first server recognizes the speech using the characteristic of the speech transmitted from the client, and recognizes the speech or transmits the characteristic according to a checked result, and wherein an n-th (2≦n≦N) server checks whether the n-th server recognizes the speech using the characteristic of the speech transmitted from an (n−1)-th server, and recognizes the speech or transmits the characteristic according to a checked result.

    Multi-layered speech recognition apparatus and method

    Publication No.: US08380517B2

    Publication date: 2013-02-19

    Application No.: US13478656

    Filing date: 2012-05-23

    IPC classification: G10L21/00

    CPC classification: G10L15/08 G10L15/30 G10L15/32

    Abstract: A multi-layered speech recognition apparatus and method. The apparatus includes: a client checking whether the client recognizes the speech using a characteristic of speech to be recognized, and recognizing the speech or transmitting the characteristic of the speech according to a checked result; and first through N-th servers, wherein the first server checks whether the first server recognizes the speech using the characteristic of the speech transmitted from the client, and recognizes the speech or transmits the characteristic according to a checked result, and wherein an n-th (2≦n≦N) server checks whether the n-th server recognizes the speech using the characteristic of the speech transmitted from an (n−1)-th server, and recognizes the speech or transmits the characteristic according to a checked result.