-
Publication No.: US5682463A
Publication date: 1997-10-28
Application No.: US384049
Filing date: 1995-02-06
CPC classification: H04B1/665
Abstract: A new technique for the determination of the masking effect of an audio signal is employed to provide transparent compression of an audio signal at greatly reduced bit rates. The new technique employs the results of recent research into the psycho-physics of noise masking in the human auditory system. This research suggests that noise masking is a function of the uncertainty in loudness as perceived by the brain. Measures of loudness uncertainty are employed to determine the degree to which audio signals are "tone-like" (or "noise-like"). The degree of tone-likeness, referred to as "tonality," is used to determine masking thresholds for use in the compression of audio signals. Tonality, computed in accordance with the present invention, is used in conventional and new arrangements to achieve compression of audio signals.
-
2.
Publication No.: US5699479A
Publication date: 1997-12-16
Application No.: US384097
Filing date: 1995-02-06
CPC classification: H04B1/665
Abstract: A new technique for the determination of the masking effect of an audio signal is employed to provide transparent compression of an audio signal at greatly reduced bit rates. The new technique employs the results of recent research into the psycho-physics of noise masking in the human auditory system. This research suggests that noise masking is a function of the uncertainty in loudness as perceived by the brain. Measures of loudness uncertainty are employed to form noise masking thresholds for use in the compression of audio signals. These measures are employed in an illustrative subband, analysis-by-synthesis framework. In accordance with the illustrative embodiment, provisional encodings of the audio signal are performed to determine the encoding which achieves a loudness differential, between the original and coded audio signal, which is less than (but not too far below) the loudness uncertainty.
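The analysis-by-synthesis search described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: `toy_loudness` (mean power compressed by a 0.3 exponent), the uniform quantizer, and the list of candidate step sizes are all hypothetical stand-ins chosen for illustration, and the loudness uncertainty is simply passed in as a number. The sketch tries provisional encodings from coarsest to finest and keeps the first one whose loudness differential falls below the uncertainty, so the chosen encoding is below the limit but not far below it.

```python
import math

def toy_loudness(samples):
    """Hypothetical loudness proxy: mean power compressed by a 0.3 exponent."""
    power = sum(s * s for s in samples) / len(samples)
    return power ** 0.3

def quantize(samples, step):
    """Uniform quantizer standing in for one provisional encoding."""
    return [step * round(s / step) for s in samples]

def pick_step(samples, uncertainty, steps):
    """Return the coarsest step whose loudness differential is below `uncertainty`.

    Falls back to the finest step if no candidate qualifies.
    """
    reference = toy_loudness(samples)
    for step in sorted(steps, reverse=True):  # coarsest (cheapest) first
        coded = quantize(samples, step)
        differential = abs(toy_loudness(coded) - reference)
        if differential < uncertainty:
            return step
    return min(steps)

# Example input: four periods of a sine wave (assumed test signal).
samples = [math.sin(2 * math.pi * k / 16) for k in range(64)]
steps = [1.0, 0.5, 0.25, 0.125, 0.0625]
chosen = pick_step(samples, uncertainty=0.01, steps=steps)
```

Searching from coarse to fine is one simple way to stop at the cheapest acceptable encoding; a real coder would run such a search per subband with a perceptually derived uncertainty value.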
-
Publication No.: US6091773A
Publication date: 2000-07-18
Application No.: US968644
Filing date: 1997-11-12
Applicant: Mark R. Sydorenko
Inventor: Mark R. Sydorenko
CPC classification: G06T9/002 , H04N19/124 , H04N19/14 , H04N19/152 , H04N19/154 , H04N19/60 , H04N19/80
Abstract: A method and apparatus for measuring the "perceptual distance" between an approximate, reconstructed representation of a sensory signal (such as an audio or video signal) and the original sensory signal is provided. The perceptual distance in this context is a direct quantitative measure of the likelihood that a human observer can distinguish the original audio or video signal from the reconstructed approximation to the original audio or video signal. The method described herein applies to noisy compression techniques; the method provides the ability to predict the likelihood that the reconstructed noisy representation of the original signal will be distinguishable by a human observer from the original input representation. The method can be used to allocate bits in audio and video compression algorithms such that the signal reconstructed from the compressed representation is perceptually similar to the original input signal when judged by a human observer. The method is based on a theory of the neurophysiological limitations of human sensory perception. Specifically, a "neural encoding model" (NEM) summarizes the manner in which sensory signals are represented in the human brain. The NEM is analyzed in the context of detection theory which provides a mathematical framework for statistically quantifying the detectability of differences in the neural representation arising from differences in sensory input. This NEM approach has been validated by demonstrating its ability to predict a variety of published psychoacoustic data, including masking and many other phenomena.
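To make the detection-theory idea in the abstract above concrete, here is a minimal sketch using the textbook equal-variance Gaussian model rather than the patent's neural encoding model, whose details the abstract does not give. The function names and the choice of a two-alternative forced-choice (2AFC) task are assumptions made for illustration. The sensitivity index d' measures how far apart two internal responses are in units of their shared noise, and an unbiased observer's 2AFC accuracy is Φ(d'/√2), where Φ is the standard normal CDF.

```python
import math

def d_prime(mean_original, mean_reconstructed, sigma):
    """Sensitivity index for two equal-variance Gaussian internal responses."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return abs(mean_original - mean_reconstructed) / sigma

def prob_correct_2afc(dp):
    """Probability of a correct 2AFC choice: Phi(dp / sqrt(2)).

    Uses Phi(x) = 0.5 * (1 + erf(x / sqrt(2))), so Phi(dp / sqrt(2))
    simplifies to 0.5 * (1 + erf(dp / 2)).
    """
    return 0.5 * (1.0 + math.erf(dp / 2.0))

# Identical internal responses are indistinguishable (chance performance),
# while a separation of two standard deviations is detected most of the time.
chance = prob_correct_2afc(d_prime(1.0, 1.0, 1.0))
clear = prob_correct_2afc(d_prime(0.0, 2.0, 1.0))
```

A bit allocator built on such a measure could spend bits until the predicted accuracy for each band stays near chance, which matches the abstract's goal of a reconstruction that observers cannot tell from the original.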
-
-