Abstract:
A two-channel phase-amplitude stereo encoding and decoding scheme enabling flexible and spatially accurate interactive 3-D audio reproduction via standard audio-only two-channel transmission. The encoding scheme associates a 2-D or 3-D positional localization with each of a plurality of sound sources by means of frequency-independent inter-channel phase and amplitude differences. The decoder is based on frequency-domain spatial analysis of 2-D or 3-D directional cues in a two-channel stereo signal and re-synthesis of these cues using any preferred spatialization technique. This allows faithful reproduction of positional audio cues and reverberation or ambient cues over arbitrary multi-channel loudspeaker reproduction formats or over headphones, while preserving source separation despite the intermediate encoding over only two audio channels.
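The encoding idea can be illustrated with a minimal sketch. It assumes a hypothetical parameterization that the abstract does not specify: a panning angle `alpha` sets the inter-channel amplitude difference and an angle `beta` sets the inter-channel phase difference, both applied identically to every frequency bin. The function names `encode_source` and `analyze_cues` are illustrative only.

```python
import numpy as np

def encode_source(S, alpha, beta):
    """Map a source's complex spectrum S to two channels.

    alpha and beta are hypothetical parameters (not from the abstract):
    tan(alpha) is the right/left amplitude ratio, beta the phase
    difference between left and right, both the same in every bin.
    """
    S = np.asarray(S, dtype=complex)
    L = np.cos(alpha) * np.exp(+0.5j * beta) * S
    R = np.sin(alpha) * np.exp(-0.5j * beta) * S
    return L, R

def analyze_cues(L, R):
    """Recover the per-bin amplitude angle and phase difference."""
    alpha = np.arctan2(np.abs(R), np.abs(L))
    beta = np.angle(L * np.conj(R))
    return alpha, beta
```

Several sources could be mixed by summing their encoded channels. A decoder along the lines described would then estimate the cues per time-frequency bin, where one source is assumed to dominate, and re-render them with the chosen spatialization technique.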
Abstract:
A method of ambience extraction includes analyzing an input signal to determine the time-dependent and frequency-dependent amount of ambience in the input signal, wherein the amount of ambience is determined based on a signal model and correlation quantities computed from the input signals and wherein the ambience is extracted using a multiplicative time-frequency mask. Another method of ambience extraction includes compensating a bias in the estimation of a short-term cross-correlation coefficient. In addition, systems having various modules for implementing the above methods are disclosed.
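A rough sketch of a correlation-based multiplicative mask follows. It assumes two-channel STFT inputs of shape (frames, bins), recursive averaging with a hypothetical forgetting factor `lam`, and the mask rule `sqrt(1 - |phi|)`. These choices are illustrative and are not taken from the abstract.

```python
import numpy as np

def ambience_mask(XL, XR, lam=0.9, eps=1e-12):
    """Per-bin ambience mask from a short-term correlation coefficient.

    XL, XR: complex STFT arrays of shape (frames, bins).
    lam:    forgetting factor for recursive averaging (assumed value).
    The mask rule sqrt(1 - |phi|) is an illustrative assumption.
    """
    XL = np.asarray(XL, dtype=complex)
    XR = np.asarray(XR, dtype=complex)
    n_frames, n_bins = XL.shape
    r_ll = np.zeros(n_bins)
    r_rr = np.zeros(n_bins)
    r_lr = np.zeros(n_bins, dtype=complex)
    mask = np.empty((n_frames, n_bins))
    for t in range(n_frames):
        # Recursively smoothed auto- and cross-correlations.
        r_ll = lam * r_ll + (1 - lam) * np.abs(XL[t]) ** 2
        r_rr = lam * r_rr + (1 - lam) * np.abs(XR[t]) ** 2
        r_lr = lam * r_lr + (1 - lam) * XL[t] * np.conj(XR[t])
        # Magnitude of the short-term correlation coefficient.
        phi = np.abs(r_lr) / np.sqrt(r_ll * r_rr + eps)
        mask[t] = np.sqrt(np.clip(1.0 - phi, 0.0, 1.0))
    return mask
```

The ambience estimate for each channel would then be `mask * XL` and `mask * XR`. With this estimator the coefficient equals 1 on the very first frame whatever the input, and it stays inflated over short averaging windows. This may be the kind of estimation bias that the second method compensates for.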
Abstract:
A digital signal is processed by splitting it into at least two frequency subbands and downsampling the subband signals. A filter is applied to at least one of the subband signals. At least one of the phase and magnitude of the filtered subband signals is matched in the transition frequency band between the subbands.
Abstract:
An audio signal is processed in the frequency domain to convert an input signal format to an output signal format. That is, a multichannel audio signal intended for playback over a predefined speaker layout can be formatted to achieve spatial reproduction over a different layout comprising a different number of speakers.
Abstract:
A frequency-domain method for format conversion or reproduction of 2-channel or multi-channel audio signals such as recordings is described. The reproduction is based on spatial analysis of directional cues in the input audio signal and conversion of these cues into audio output signal cues for two or more channels in the frequency domain.
Abstract:
A frequency-domain method for phase-amplitude matrixed surround decoding of 2-channel stereo recordings and soundtracks, based on spatial analysis of 2-D or 3-D directional cues in the recording and re-synthesis of these cues for reproduction on any headphone or loudspeaker playback system.
Abstract:
An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.
Abstract:
A stereo audio signal is processed to determine primary and ambient components by transforming the signal into vectors corresponding to subband signals, and decomposing the left and right channel vectors into ambient and primary components by matrix and vector operations. Principal component analysis is used to determine a primary component unit vector, and ambience components are determined according to a correlation-based cross-fade or an orthogonal basis derivation.
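The following minimal sketch shows the principal-component step for a single subband. It assumes each channel is given as a vector of complex subband samples, and it takes the ambience simply as the residual after projection onto the principal unit vector. That residual is a simplification, not the correlation-based cross-fade or orthogonal basis derivation named in the abstract. The function name `primary_ambient` is hypothetical.

```python
import numpy as np

def primary_ambient(xL, xR):
    """Split one subband of a stereo signal into primary and ambient parts.

    xL, xR: 1-D arrays of subband samples for the left and right channels.
    Returns (pL, pR, aL, aR). Ambience is the projection residual, which
    is a simplifying assumption for illustration.
    """
    X = np.stack([xL, xR], axis=1).astype(complex)  # shape (N, 2)
    # The first left singular vector is the principal component unit vector.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    v = U[:, 0]
    # Project each channel onto v (np.vdot conjugates its first argument).
    pL = v * np.vdot(v, X[:, 0])
    pR = v * np.vdot(v, X[:, 1])
    return pL, pR, X[:, 0] - pL, X[:, 1] - pR
```

By construction each channel equals the sum of its primary and ambient parts, and each ambient part is orthogonal to the primary direction.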