Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking
    1.
    Granted patent
    Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking (status: lapsed)

    Publication No.: US07398204B2

    Publication date: 2008-07-08

    Application No.: US10647320

    Filing date: 2003-08-26

    IPC classification: G10L11/04 G10L21/00

    CPC classification: G10L19/02 G10L19/032

    Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment, a model of temporal masking of sound presented to the human ear is provided. A temporal masking index is determined from the received audio signal and the model, using a forward and a backward masking function. A psychoacoustic model then determines a masking threshold in dependence upon the temporal masking index, and the audio signal is encoded in dependence upon that threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that the method maintained the high subjective quality of the decoded compressed sounds while reducing the bit rate by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled: the relationship between the spectral components of the input audio signal is considered, and an inharmonicity index is defined and incorporated into the MPEG-1 psychoacoustic model 2. Informal listening tests have shown that the bit rate required for transparent coding of inharmonic (multi-tonal) audio material can be reduced by 10% if the modified psychoacoustic model 2 is used in the MPEG-1 Layer II encoder.
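The first embodiment can be illustrated with a small sketch. The abstract does not give the masking functions or how the index enters the threshold, so everything specific here is an assumption made for illustration: the linear-in-dB decay, the slope and offset values, the frame length, and the names `temporal_masking_index` and `combined_threshold`. The sketch only shows the general idea that a loud frame can mask quieter content shortly after it (forward masking) and, more briefly, shortly before it (backward masking).

```python
def temporal_masking_index(levels_db, frame_ms=10.0,
                           fwd_slope_db_per_ms=0.25,
                           bwd_slope_db_per_ms=1.5,
                           masker_offset_db=10.0):
    """Per frame, the highest level (dB) masked by neighbouring frames.

    All parameter values are illustrative assumptions, not values
    taken from the patent.
    """
    n = len(levels_db)
    index = []
    for t in range(n):
        best = float("-inf")  # stays -inf if there is no neighbouring frame
        for k in range(n):
            if k == t:
                continue
            gap_ms = abs(t - k) * frame_ms
            # Earlier masker (k < t): forward masking, slow decay.
            # Later masker (k > t): backward masking, fast decay.
            slope = fwd_slope_db_per_ms if k < t else bwd_slope_db_per_ms
            masked = levels_db[k] - masker_offset_db - slope * gap_ms
            best = max(best, masked)
        index.append(best)
    return index


def combined_threshold(simultaneous_db, temporal_db):
    """Assumed combination rule: take the larger threshold in each frame."""
    return [max(s, t) for s, t in zip(simultaneous_db, temporal_db)]


# A loud frame surrounded by two quiet ones.
levels = [30.0, 80.0, 30.0]
index = temporal_masking_index(levels)
threshold = combined_threshold([40.0, 40.0, 40.0], index)
```

With these assumed parameters the frame after the loud one receives a higher temporal masking index than the frame before it, reflecting the slower decay assumed for forward masking. A higher threshold allows coarser quantization in that frame, which is where a bit rate saving would come from.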


    Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking
    2.
    Patent application
    Bit rate reduction in audio encoders by exploiting inharmonicity effects and auditory temporal masking (status: under examination, published)

    Publication No.: US20080221875A1

    Publication date: 2008-09-11

    Application No.: US12153408

    Filing date: 2008-05-19

    IPC classification: G10L19/00 G10L21/00

    CPC classification: G10L19/02 G10L19/032

    Abstract: The present invention relates to a method for encoding an audio signal. In a first embodiment, a model of temporal masking of sound presented to the human ear is provided. A temporal masking index is determined from the received audio signal and the model, using a forward and a backward masking function. A psychoacoustic model then determines a masking threshold in dependence upon the temporal masking index, and the audio signal is encoded in dependence upon that threshold. The method has been implemented using the MPEG-1 psychoacoustic model 2. Semiformal listening tests showed that the method maintained the high subjective quality of the decoded compressed sounds while reducing the bit rate by approximately 10%. In a second embodiment, the inharmonic structure of audio signals is modeled: the relationship between the spectral components of the input audio signal is considered, and an inharmonicity index is defined and incorporated into the MPEG-1 psychoacoustic model 2. Informal listening tests have shown that the bit rate required for transparent coding of inharmonic (multi-tonal) audio material can be reduced by 10% if the modified psychoacoustic model 2 is used in the MPEG-1 Layer II encoder.
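For the second embodiment, the abstract says only that an inharmonicity index is defined from the relationship between spectral components; it does not give the definition. The sketch below is therefore a hypothetical stand-in, not the patent's formula: it scores how far a set of spectral peaks lies from integer multiples of an assumed fundamental. The function name `inharmonicity_index`, its inputs, and the normalization to the range 0 to 1 are all assumptions.

```python
def inharmonicity_index(peaks_hz, f0_hz):
    """Hypothetical index in [0, 1]: 0 for perfectly harmonic peaks,
    1 when every peak lies halfway between two harmonics of f0_hz.

    This is an illustrative measure, not the definition used in the patent.
    """
    if not peaks_hz or f0_hz <= 0:
        raise ValueError("need at least one peak and a positive fundamental")
    total = 0.0
    for f in peaks_hz:
        ratio = f / f0_hz
        nearest = max(1, round(ratio))  # nearest harmonic number, at least 1
        # Half a harmonic spacing is the largest deviation counted;
        # anything beyond it is capped at 1.
        total += min(1.0, 2.0 * abs(ratio - nearest))
    return total / len(peaks_hz)


harmonic = inharmonicity_index([100.0, 200.0, 300.0], 100.0)
mixed = inharmonicity_index([100.0, 250.0], 100.0)
```

Here `harmonic` describes peaks at exact multiples of the fundamental, while `mixed` contains one peak midway between the second and third harmonics. How such an index would adjust the masking threshold inside psychoacoustic model 2 is not stated in the abstract and is left out of the sketch.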
