Pitch determination using speech classification and prior pitch estimation

Granted invention patent

US06507814B1 Pitch determination using speech classification and prior pitch estimation (in force)


Patent title: Pitch determination using speech classification and prior pitch estimation
Application number: US09154654

Filing date: 1998-09-18
Publication number: US06507814B1

Publication date: 2003-01-14
Inventor: Yang Gao
Applicant: Yang Gao
Main classification: G10L1904
IPC classification: G10L1904


Abstract:

A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech is generated through CELP (code excited linear prediction) and other associated modeling parameters for higher quality decoding and reproduction. To achieve high quality in lower bit rate encoding modes, the speech encoder departs from the strict waveform matching criteria of regular CELP coders and strives to identify significant perceptual features of the input signal. To support lower bit rate encoding modes, a variety of techniques are applied, many of which involve the classification of the input signal. For each bit rate mode selected, pluralities of fixed or innovation subcodebooks are selected for use in generating innovation vectors. The speech encoder also utilizes an adaptive weighting factor in the selection of a current pitch lag value from a plurality of pitch lag candidates. For example, if the speech encoder identifies an integer multiple timing relationship between any two pitch lag candidates, the pitch lag candidate with the smallest timing value is favored through adjustment of the weighting factor. Similarly, if a pitch lag candidate exhibits timing that corresponds to that of previous pitch lag values, the weighting factor is adjusted to favor that candidate.
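
The candidate selection described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Candidate`, `is_integer_multiple`, and `select_pitch_lag`, the bonus factors 1.15 and 1.10, and the 2-sample tolerance are all assumptions chosen for illustration. The abstract states only that a weighting factor is adjusted to favor the smaller lag of an integer-multiple pair and to favor lags that match previous pitch lag values.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    lag: int      # pitch lag in samples
    score: float  # e.g. normalized correlation measured at that lag


def is_integer_multiple(longer: int, shorter: int, tol: int = 2) -> bool:
    """True if `longer` is within `tol` samples of k * `shorter` for some k >= 2."""
    if shorter <= 0 or longer <= shorter:
        return False
    k = round(longer / shorter)
    return k >= 2 and abs(longer - k * shorter) <= tol


def select_pitch_lag(
    candidates: List[Candidate],
    prev_lags: Sequence[int],
    multiple_bonus: float = 1.15,    # assumed value, not from the patent
    continuity_bonus: float = 1.10,  # assumed value, not from the patent
    tol: int = 2,                    # assumed tolerance in samples
) -> int:
    """Return the lag of the candidate with the highest weighted score."""
    if not candidates:
        raise ValueError("at least one pitch lag candidate is required")

    best_lag = candidates[0].lag
    best_weighted = float("-inf")
    for cand in candidates:
        weight = 1.0
        # Favor the smaller lag when another candidate is an integer multiple of it.
        if any(
            is_integer_multiple(other.lag, cand.lag, tol)
            for other in candidates
            if other is not cand
        ):
            weight *= multiple_bonus
        # Favor a candidate whose lag is close to a previously selected lag.
        if any(abs(cand.lag - prev) <= tol for prev in prev_lags):
            weight *= continuity_bonus
        weighted = weight * cand.score
        if weighted > best_weighted:
            best_weighted = weighted
            best_lag = cand.lag
    return best_lag
```

For instance, with candidates at lags 40 and 80 and raw scores 0.80 and 0.85, the unweighted choice would be 80. Under the assumed bonus, the shorter lag wins because 80 is twice 40, which is the pitch-doubling case the weighting is meant to guard against.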

