Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein a combined cue code (e.g., a combined inter-channel correlation (ICC) code) is generated by combining two or more estimated cue codes, each estimated cue code estimated from a group of two or more channels. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels. Received cue codes include a combined cue code (e.g., a combined ICC code). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein two or more derived cue codes are derived from the combined cue code, and each derived cue code is applied to generate two or more synthesized channels.
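The idea of a combined ICC code can be illustrated with a minimal sketch. The abstract does not say how the estimated codes are combined or how the decoder derives codes from the combined one, so the power-weighted average in `combine_icc` and the zero-lag correlation in `icc` are assumptions made only for illustration, and both function names are hypothetical.

```python
import math

def icc(a, b):
    """Zero-lag normalized cross-correlation between two channels (assumed ICC estimate)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return abs(num) / den if den > 0.0 else 0.0

def combine_icc(groups):
    """Combine per-group ICC estimates into one code.

    `groups` is a list of (channel_a, channel_b) pairs. The weighting by
    group power is an assumption, not taken from the abstract.
    """
    weights = [sum(x * x for x in a) + sum(y * y for y in b) for a, b in groups]
    total = sum(weights)
    if total == 0.0:
        return 0.0
    return sum(w * icc(a, b) for w, (a, b) in zip(weights, groups)) / total

# Two channel groups: one fully correlated, one uncorrelated.
front = ([1.0, 0.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0])
rear = ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0])
combined = combine_icc([front, rear])
```

In this sketch a decoder would simply reuse `combined` as the derived code for each channel group; a real codec would typically work per subband and per frame.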
Abstract:
The following coding scenario is addressed: A number of audio source signals need to be transmitted or stored for the purpose of mixing wave field synthesis, multi-channel surround, or stereo signals after decoding the source signals. The proposed technique offers significant coding gain when jointly coding the source signals, compared to separately coding them, even when no redundancy is present between the source signals. This is possible by considering statistical properties of the source signals, the properties of mixing techniques, and spatial hearing. The sum of the source signals is transmitted plus the statistical properties of the source signals which mostly determine the perceptually important spatial cues of the final mixed audio channels. Source signals are recovered at the receiver such that their statistical properties approximate the corresponding properties of the original source signals. Subjective evaluations indicate that high audio quality is achieved by the proposed scheme.
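A rough sketch of the "sum signal plus statistics" idea follows. It assumes the only transmitted statistic is each source's full-band power and that the receiver recovers a source by rescaling the sum signal to match that power; the abstract does not specify either choice, and a real scheme would more plausibly work per subband and per time frame. The names `encode`, `decode`, and `power` are hypothetical.

```python
import math

def power(x):
    """Total signal power (sum of squared samples)."""
    return sum(v * v for v in x)

def encode(sources):
    """Return the sum signal and the per-source powers used as side information."""
    n = len(sources[0])
    total = [sum(s[i] for s in sources) for i in range(n)]
    return total, [power(s) for s in sources]

def decode(total, powers):
    """Approximate each source by scaling the sum signal to the transmitted power."""
    p_sum = power(total)
    if p_sum == 0.0:
        return [[0.0] * len(total) for _ in powers]
    return [[math.sqrt(p / p_sum) * v for v in total] for p in powers]

# Two toy sources with no redundancy between them.
total, powers = encode([[3.0, 0.0], [0.0, 4.0]])
recovered = decode(total, powers)
```

The recovered signals are not the original waveforms; only their power matches, which is the sense in which the statistical properties "approximate" those of the originals.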
Abstract:
An apparatus for providing a set of spatial cues associated with an upmix audio signal having more than two channels on the basis of a two-channel microphone signal has a signal analyzer and a spatial side information generator. The signal analyzer is configured to obtain component energy information and direction information on the basis of the two-channel microphone signal, such that the component energy information describes estimates of the energies of a direct sound component and of a diffuse sound component of the two-channel microphone signal, and such that the direction information describes an estimate of the direction from which the direct sound component of the two-channel microphone signal originates. The spatial side information generator is configured to map the component energy information and the direction information onto spatial cue information describing the set of spatial cues associated with the upmix audio signal having more than two channels.
Abstract:
An exemplary embodiment of the invention can generate multiple output audio signals from multiple input audio signals, in which the number of output signals is equal to or higher than the number of input signals. The embodiment includes computing one or more independent sound subbands representing signal components which are independent between the input subbands; computing one or more localized direct sound subbands representing signal components which are contained in more than one of the input subbands and direction factors representing the ratios with which these signal components are contained in two or more input subbands; generating the output subband signals, where each output subband signal is a linear combination of the independent sound subbands and the localized direct sound subbands; and converting the output subband signals to time domain audio signals.
Abstract:
One or more attributes (e.g., pan, gain, etc.) associated with one or more objects (e.g., an instrument) of a stereo or multi-channel audio signal can be modified to provide remix capability. In some implementations, a method can include obtaining a first plural-channel audio signal having one or more objects; obtaining side information, at least some of which represents a relation between the first plural-channel audio signal and the one or more objects; obtaining a set of mix parameters; and generating a second plural-channel audio signal using the side information and the set of mix parameters.
Abstract:
A decoder (115) generates a multi-channel audio signal, such as a surround sound signal, from a received first signal. The multi-channel signal comprises a second set of audio channels and the first signal comprises a first set of audio channels. The decoder (115) comprises a receiver (401) which receives the first signal. The receiver (401) is coupled to an estimate processor (405) which generates estimated parametric data for the second set of audio channels in response to characteristics of the first set of audio channels. The estimated parametric data relates characteristics of the second set of audio channels to characteristics of the first set of audio channels. The decoder (115) furthermore comprises a spatial audio decoder (403) which decodes the first signal in response to the estimated parametric data to generate the multi-channel signal comprising the second set of channels. The invention allows use of spatial audio decoding with signals that are not encoded by a spatial audio encoder.
Abstract:
A method of processing an audio signal is disclosed. The present invention includes receiving, by an audio processing apparatus, an input signal; receiving a user gain input; generating a linear gain factor and a non-linear gain factor using the user gain input; modifying the non-linear gain factor using the absolute threshold of hearing and the power of the input signal to generate a modified non-linear gain factor; and applying the linear gain factor and the modified non-linear gain factor to the audio signal.
Abstract:
Generic and specific C-to-E binaural cue coding (BCC) schemes are described, including those in which one or more of the input channels are transmitted as unmodified channels that are not downmixed at the BCC encoder and not upmixed at the BCC decoder. The specific BCC schemes described include 5-to-2, 6-to-5, 7-to-5, 6.1-to-5.1, 7.1-to-5.1, and 6.2-to-5.1, where “0.1” indicates a single low-frequency effects (LFE) channel and “0.2” indicates two LFE channels.
Abstract:
The directionality of microphones is often not high enough, resulting in compromised music recording. Beamforming for obtaining a signal with a higher directional response is limited by spatial aliasing, the dependence of beamwidth on frequency, and the need for a high number of microphones. Adaptive beamforming is suitable for applications where the only aim is to optimize the signal-to-noise ratio, but not for applications where a time-invariant beamshape is required. The invention addresses these limitations, using adaptive signal processing applied to a plurality of microphone signals or other signals with an associated directionality. A method is therefore proposed to generate an output audio signal y from two or more input audio signals (x1, x2, . . . ), this method comprising the steps of: defining one input signal as the reference signal; for each of the other input signals, computing gain factors related to how much of the input signal is contained in the reference signal; adjusting the gain factors using a limiting function; and computing the output signal by subtracting from the reference signal the other input signals multiplied by the corresponding adjusted gain factors.
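The four steps of the method can be sketched as follows. This is a minimal illustration under stated assumptions: the gain is a full-band least-squares projection estimate, and the limiting function is a simple clamp to the range [0, g_max]. The abstract specifies neither, and a practical system would likely compute time-varying gains per subband. The names `estimate_gain`, `limit`, and `directional_output` are hypothetical.

```python
def estimate_gain(x, ref):
    """Least-squares estimate of how much of x is contained in ref (assumed estimator)."""
    p = sum(v * v for v in x)
    if p == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, ref)) / p

def limit(g, g_max=1.0):
    """Hypothetical limiting function: clamp the gain to [0, g_max]."""
    return max(0.0, min(g, g_max))

def directional_output(ref, others, g_max=1.0):
    """Subtract each other input, scaled by its limited gain, from the reference."""
    gains = [limit(estimate_gain(x, ref), g_max) for x in others]
    return [r - sum(g * x[n] for g, x in zip(gains, others))
            for n, r in enumerate(ref)]

# The reference holds a wanted signal plus half of an interfering input x1.
x1 = [1.0, -1.0, 0.0, 0.0]
wanted = [0.0, 0.0, 1.0, 1.0]
ref = [w + 0.5 * v for w, v in zip(wanted, x1)]
y = directional_output(ref, [x1])
```

Because `wanted` and `x1` are orthogonal in this toy example, the estimated gain is exactly 0.5 and the interfering component is removed; with correlated signals the limiting function would bound how much is subtracted.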