Abstract:
An apparatus for processing an audio signal and method thereof are disclosed, by which a local dynamic range of an audio signal can be adaptively normalized as well as a maximum dynamic range of the audio signal. The present invention includes receiving, by an audio processing apparatus, a signal, and feedback information estimated based on a normalizing gain; generating a noise estimation based on the signal; computing a gain filter for noise canceling, based on the noise estimation and the signal; and, obtaining a restricted gain filter by applying the feedback information to the gain filter.
Abstract:
The directionality of microphones is often not high enough, resulting in compromised music recording. Beamforming for getting a signal with a higher directional response is limited due to spatial aliasing, dependence of beamwidth on frequency, and a requirement of a high number of microphones. The invention proposes a method to generate an output audio signal y from two or more input audio signals (x1, x2, ...), this method comprising the steps of: defining one input signal as a reference signal for each of the other input signals; computing gain factors related to how much of the input signal is contained in the reference signal; adjusting the gain factors using a limiting function; and computing the output signal by subtracting from the reference signal the other input signals multiplied by the corresponding adjusted gain factors.
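The steps of this method can be sketched in a few lines of Python. This is a minimal full-band, time-domain illustration under stated assumptions, not the claimed method: it uses a single reference for all other inputs, estimates each gain factor by a least-squares projection, and uses a simple clamp to [0, 1] as the limiting function. The names `gain_factor`, `limit`, and `directional_output` are hypothetical.

```python
from typing import List, Sequence


def gain_factor(reference: Sequence[float], other: Sequence[float]) -> float:
    """Least-squares estimate of how much of `other` is contained in `reference`.

    Assumption: the abstract does not say how the gain factors are computed.
    """
    energy = sum(o * o for o in other)
    if energy == 0.0:
        return 0.0  # a silent input contributes nothing
    return sum(r * o for r, o in zip(reference, other)) / energy


def limit(gain: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Hypothetical limiting function: clamp the gain factor to [lo, hi]."""
    return max(lo, min(hi, gain))


def directional_output(
    reference: Sequence[float], others: Sequence[Sequence[float]]
) -> List[float]:
    """Subtract each other input, scaled by its limited gain, from the reference.

    Assumes every signal in `others` has the same length as `reference`.
    """
    gains = [limit(gain_factor(reference, other)) for other in others]
    return [
        r - sum(g * other[n] for g, other in zip(gains, others))
        for n, r in enumerate(reference)
    ]


x1 = [1.0, 2.0, 3.0]  # reference signal
x2 = [2.0, 4.0, 6.0]  # other input; x1 contains it with gain 0.5
y = directional_output(x1, [x2])  # x2 is removed completely from x1
```

A practical system would more likely estimate and limit the gain factors per time frame and frequency band; the single full-band gain here only keeps the arithmetic easy to follow.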
Abstract:
Generic and specific C-to-E binaural cue coding (BCC) schemes are described, including those in which one or more of the input channels are transmitted as unmodified channels that are not downmixed at the BCC encoder and not upmixed at the BCC decoder. The specific BCC schemes described include 5-to-2, 6-to-5, 7-to-5, 6.1-to-5.1, 7.1-to-5.1, and 6.2-to-5.1, where “.1” indicates a single low-frequency effects (LFE) channel and “.2” indicates two LFE channels.
Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein a combined cue code (e.g., a combined inter-channel correlation (ICC) code) is generated by combining two or more estimated cue codes, each estimated cue code estimated from a group of two or more channels. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels. Received cue codes include a combined cue code (e.g., a combined ICC code). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein two or more derived cue codes are derived from the combined cue code, and each derived cue code is applied to generate two or more synthesized channels.
Abstract:
A preferred embodiment of an apparatus for computing filter coefficients for an adaptive filter for filtering a microphone signal so as to suppress an echo due to a loudspeaker signal includes an extractor for extracting a stationary component signal or a non-stationary component signal from the loudspeaker signal or from a signal derived from the loudspeaker signal, and a computer for computing the filter coefficients for the adaptive filter on the basis of the extracted stationary component signal or the extracted non-stationary component signal.
Abstract:
A binaural cue coding (BCC) scheme is described in which cue codes are derived from the transmitted audio signal. In one embodiment, an encoder downmixes C input channels to generate E transmitted channels, where C>E>1. A decoder derives cue codes from the transmitted channels and uses those cue codes to synthesize playback channels. For example, in one 5-to-2 BCC embodiment, the encoder downmixes a 5-channel surround signal to generate left and right channels of a stereo signal. The decoder derives stereo cues from the transmitted stereo signal, maps those stereo cues to surround cues, and applies the surround cues to the transmitted stereo channels to generate playback channels of a 5-channel synthesized surround signal.
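The encoder side of the 5-to-2 example, and the kind of stereo cue a decoder could derive from the transmitted channels, can be sketched as follows. This is an illustration under assumptions: the downmix weight of 1/sqrt(2) for the center and surround channels is a common convention, not a value taken from the abstract, and a single full-band level difference stands in for whatever per-band cues an actual decoder would estimate. The mapping from stereo cues to surround cues is not specified in the abstract and is left out. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def downmix_5_to_2(
    left: Sequence[float],
    right: Sequence[float],
    center: Sequence[float],
    left_surround: Sequence[float],
    right_surround: Sequence[float],
    weight: float = 1.0 / math.sqrt(2.0),  # assumed weight, not from the abstract
) -> Tuple[List[float], List[float]]:
    """Fold five channels into a transmitted stereo pair."""
    lt = [l + weight * c + weight * ls
          for l, c, ls in zip(left, center, left_surround)]
    rt = [r + weight * c + weight * rs
          for r, c, rs in zip(right, center, right_surround)]
    return lt, rt


def level_difference_db(a: Sequence[float], b: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Inter-channel level difference in dB, one possible stereo cue."""
    power_a = sum(x * x for x in a)
    power_b = sum(x * x for x in b)
    return 10.0 * math.log10((power_a + eps) / (power_b + eps))


silence = [0.0, 0.0]
lt, rt = downmix_5_to_2([1.0, 1.0], [0.5, 0.5], silence, silence, silence)
cue = level_difference_db(lt, rt)  # left is about 6 dB louder than right
```

Because the cue is computed from the transmitted channels alone, no side information has to accompany the stereo signal, which is the point of deriving cue codes at the decoder.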
Abstract:
A method of processing an audio signal is disclosed. The present invention comprises receiving a downmix signal, object information and preset information, generating downmix processing information using the object information and the preset information, processing the downmix signal using the downmix processing information, and generating multi-channel information using the object information and the preset information, wherein the preset information is extracted from a bitstream. Accordingly, the gain and panning of an object can be easily controlled using preset information set in advance, without the user having to configure each object. The gain and panning of an object can also be controlled using preset information modified based on a selection made by the user.
Abstract:
At an audio encoder, cue codes are generated for one or more audio channels, wherein an envelope cue code is generated by characterizing a temporal envelope in an audio channel. At an audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C>E≧1. Received cue codes include an envelope cue code corresponding to a characterized temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channel(s) are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, wherein the envelope cue code is applied to an upmixed channel or a synthesized signal to adjust a temporal envelope of the synthesized signal based on the characterized temporal envelope such that the adjusted temporal envelope substantially matches the characterized temporal envelope.
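The envelope adjustment described above can be sketched as follows. This is a minimal illustration, assuming the temporal envelope is characterized as one RMS value per fixed-size block and is imposed by rescaling each block of the synthesized signal; the abstract does not specify either choice, and the names `block_envelope` and `impose_envelope` are hypothetical.

```python
import math
from typing import List, Sequence


def block_envelope(signal: Sequence[float], block_size: int) -> List[float]:
    """Characterize the temporal envelope as the RMS of each consecutive block."""
    envelope = []
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        envelope.append(math.sqrt(sum(x * x for x in block) / len(block)))
    return envelope


def impose_envelope(synthesized: Sequence[float], target: Sequence[float],
                    block_size: int, eps: float = 1e-12) -> List[float]:
    """Rescale each block of `synthesized` so its RMS matches `target`.

    Assumes `target` holds one value per block of `synthesized`.
    """
    own = block_envelope(synthesized, block_size)
    shaped: List[float] = []
    for index, (own_rms, target_rms) in enumerate(zip(own, target)):
        scale = target_rms / (own_rms + eps)  # eps avoids division by zero
        start = index * block_size
        shaped.extend(scale * x for x in synthesized[start:start + block_size])
    return shaped


original = [1.0, 1.0, 0.1, 0.1]      # channel seen by the encoder
synthesized = [0.5, 0.5, 0.5, 0.5]   # decoder output before adjustment
cue = block_envelope(original, 2)    # envelope cue: one RMS value per block
adjusted = impose_envelope(synthesized, cue, 2)
```

After the adjustment, the block-wise envelope of `adjusted` matches `cue` up to the small `eps` term, which mirrors the "substantially matches" wording of the abstract.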
Abstract:
A method of processing an audio signal is disclosed. The present invention comprises receiving a downmix signal including object signals, transforming the downmix signal per frequency band, determining a direction of an object signal from the transformed downmix signal, and determining blind information by estimating a level of the object signal corresponding to the direction. Accordingly, the present invention generates blind information when an encoder incapable of generating object information is used, thereby enabling the gain and panning of an object to be controlled using the blind information.
Abstract:
An apparatus for processing an audio signal and method thereof are disclosed, by which a local dynamic range of an audio signal can be adaptively normalized as well as a maximum dynamic range of the audio signal. The present invention includes receiving a signal, by an audio processing apparatus; computing a long-term power and a short-term power by estimating power of the signal; generating a slow gain based on the long-term power; generating a fast gain based on the short-term power; obtaining a final gain by combining the slow gain and the fast gain; and, modifying the signal using the final gain.
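The slow-gain and fast-gain steps can be sketched as follows. This is a minimal illustration under assumptions that the abstract does not state: power is estimated with a one-pole recursive average, the slow gain steers the long-term power toward a target, the fast gain only attenuates when the short-term power would still exceed that target, and the two gains are combined by multiplication. All parameter values and function names are hypothetical.

```python
import math
from typing import List, Sequence


def smooth_power(signal: Sequence[float], alpha: float) -> List[float]:
    """Recursive power estimate; a smaller alpha gives a longer memory."""
    power = 0.0
    estimates = []
    for x in signal:
        power = (1.0 - alpha) * power + alpha * x * x
        estimates.append(power)
    return estimates


def normalize(signal: Sequence[float], target_power: float = 0.1,
              slow_alpha: float = 0.001, fast_alpha: float = 0.1,
              max_gain: float = 10.0, eps: float = 1e-9) -> List[float]:
    """Modify the signal with a final gain built from a slow and a fast gain."""
    long_term = smooth_power(signal, slow_alpha)
    short_term = smooth_power(signal, fast_alpha)
    output = []
    for x, p_long, p_short in zip(signal, long_term, short_term):
        # Slow gain: bring the long-term power toward the target, with a cap.
        slow_gain = min(max_gain, math.sqrt(target_power / (p_long + eps)))
        # Fast gain: attenuate only if the short-term power, after the slow
        # gain has been applied, would still exceed the target.
        fast_gain = min(
            1.0, math.sqrt(target_power / (slow_gain ** 2 * p_short + eps))
        )
        final_gain = slow_gain * fast_gain  # assumed way of combining the gains
        output.append(final_gain * x)
    return output
```

For a long constant-amplitude input, both power estimates converge to the same value, the fast gain settles near 1, and the output power settles near `target_power`; the fast gain matters mainly during sudden level changes that the slow gain has not yet tracked.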