Abstract:
An audio encoder has a first, information-sink-oriented encoding branch such as a spectral domain encoding branch, a second, information-source- or SNR-oriented encoding branch such as an LPC-domain encoding branch, and a switch for switching between the first encoding branch and the second encoding branch. The second encoding branch has a converter into a specific domain different from the spectral domain, such as an LPC analysis stage generating an excitation signal. The second encoding branch furthermore has a specific domain coding branch such as an LPC domain processing branch, a specific spectral domain coding branch such as an LPC spectral domain processing branch, and an additional switch for switching between the specific domain coding branch and the specific spectral domain coding branch. An audio decoder has a first domain decoder such as a spectral domain decoding branch, a second domain decoder such as an LPC domain decoding branch for decoding a signal such as an excitation signal in the second domain, a third domain decoder such as an LPC-spectral decoder branch, and two cascaded switches for switching between the decoders.
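The cascaded switching described in this abstract can be pictured as a two-level routing decision. The following is only a minimal sketch: the function name, the boolean flags, and the branch labels are hypothetical and do not come from the abstract, which does not say how the switches are controlled.

```python
def select_branch(use_lpc_branch, use_lpc_spectral):
    """Return a label for the coding branch chosen by two cascaded switches.

    Both flags are hypothetical stand-ins for whatever controls the switches.
    """
    # First switch: spectral domain branch versus LPC-domain branch.
    if not use_lpc_branch:
        return "spectral"
    # Second switch: only reached inside the LPC-domain branch. It chooses
    # between LPC domain coding and LPC spectral domain coding.
    return "lpc_spectral" if use_lpc_spectral else "lpc"


if __name__ == "__main__":
    for flags in [(False, False), (True, False), (True, True)]:
        print(flags, "->", select_branch(*flags))
```

The second flag is ignored whenever the first switch selects the spectral domain branch, which is what makes the switches cascaded rather than parallel.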
Abstract:
In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel, or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a decoder. A low level decoder decodes only the first and second downmix channels, whereas a high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information. Since the channel side information occupies only a small number of bits, and since the decoder does not use dematrixing, an efficient and high quality multi-channel extension for stereo players and enhanced multi-channel players is obtained.
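The idea of weighting a downmix channel to approximate an original channel can be illustrated with a single gain value. This is a sketch under a stated assumption: the abstract does not say how the channel side information is computed, so the energy-ratio gain and the function names below are hypothetical.

```python
import math


def channel_side_info(original, downmix):
    """Hypothetical side information: one gain per block.

    Assumption: the gain is the square root of the energy ratio, so the
    weighted downmix has the same energy as the original channel.
    """
    e_orig = sum(x * x for x in original)
    e_down = sum(x * x for x in downmix)
    if e_down == 0.0:
        return 0.0
    return math.sqrt(e_orig / e_down)


def reconstruct(downmix, gain):
    """High level decoder side: weight the downmix with the side information."""
    return [gain * x for x in downmix]


if __name__ == "__main__":
    original = [0.5, -0.5, 0.25]
    downmix = [1.0, -1.0, 0.5]
    gain = channel_side_info(original, downmix)
    print(gain, reconstruct(downmix, gain))
```

A decoder that ignores the gain still plays the two downmix channels unchanged, which corresponds to the low level decoder case.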
Abstract:
An apparatus for generating a synthesis audio signal using a patching control signal has a first converter, a spectral domain patch generator, a high frequency reconstruction manipulator and a combiner. The first converter is configured for converting a time portion of an audio signal into a spectral representation. The spectral domain patch generator is configured for performing a plurality of different spectral domain patching algorithms, wherein each patching algorithm generates a modified spectral representation having spectral components in an upper frequency band derived from corresponding spectral components in a core frequency band of the audio signal. The spectral domain patch generator is furthermore configured to select, in accordance with the patching control signal, a first spectral domain patching algorithm from the plurality of patching algorithms for a first time portion and a second spectral domain patching algorithm from the plurality of patching algorithms for a second, different time portion, to obtain the modified spectral representation.
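The per-time-portion selection of a patching algorithm can be sketched as a lookup driven by the control signal. The two patching functions below (copy-up and mirroring) are illustrative assumptions, not algorithms named in the abstract, and the integer coding of the control signal is also hypothetical.

```python
def patch_copy_up(core):
    """Assumed algorithm 1: the upper band is a copy of the core band bins."""
    return core + core[:]


def patch_mirror(core):
    """Assumed algorithm 2: the upper band is the core band mirrored."""
    return core + core[::-1]


# Hypothetical mapping from control signal values to patching algorithms.
PATCHERS = {0: patch_copy_up, 1: patch_mirror}


def generate_patched_spectra(core_frames, control):
    """Apply, per time portion, the algorithm chosen by the control signal."""
    return [PATCHERS[c](frame) for frame, c in zip(core_frames, control)]


if __name__ == "__main__":
    frames = [[1, 2, 3], [4, 5, 6]]
    print(generate_patched_spectra(frames, [0, 1]))
```

Each frame here is a plain list of core band spectral values; a real system would operate on the output of the first converter.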
Abstract:
An audio signal decoder for providing an upmix signal representation in dependence on a downmix signal representation and an object-related parametric information includes an object separator configured to decompose the downmix signal representation, to provide a first audio information describing a first set of one or more audio objects of a first audio object type and a second audio information describing a second set of one or more audio objects of a second audio object type, in dependence on the downmix signal representation and using at least a part of the object-related parametric information.
Abstract:
A method for decoding a multi-audio-object signal having audio signals of first and second types encoded therein, the multi-audio-object signal having a downmix signal and side information having level information of the audio signals of the first and second types in a first predetermined time/frequency resolution, the method including computing a prediction coefficient matrix $C$ based on the level information; and up-mixing the downmix signal based on the prediction coefficients to obtain a first and/or a second up-mix audio signal approximating the audio signals of the first and second types, respectively, wherein up-mixing yields the first and/or second up-mix signals $S_1$ and $S_2$ from the downmix signal $d$ according to a computation representable by $\begin{pmatrix} S_1 \\ S_2 \end{pmatrix} = D^{-1} \left\{ \begin{pmatrix} 1 \\ C \end{pmatrix} d + H \right\}$, with “1” denoting, depending on the number of channels of $d$, a scalar or an identity matrix, $D^{-1}$ being a matrix uniquely determined by a downmix prescription according to which the audio signals of the first and second types are downmixed into the downmix signal, and which is also included in the side information, and $H$ being a term independent of $d$.
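The up-mix computation can be evaluated directly for the simplest case. The sketch below assumes a mono downmix sample, so that "1" and C are scalars, H is a two-element vector, and D^{-1} is a 2x2 matrix. These dimensions and all numeric values are illustrative assumptions; the abstract does not fix them.

```python
def upmix(d, c, h, d_inv):
    """Evaluate (S1, S2) = D^{-1} { (1, C)^T d + H } for a scalar sample d.

    d     : downmix sample (scalar, assumed mono downmix)
    c     : prediction coefficient (scalar in this simplified case)
    h     : pair (h1, h2), the term independent of d
    d_inv : 2x2 matrix D^{-1} as nested sequences
    """
    # Stacked vector (1; C) d + H.
    v1 = d + h[0]
    v2 = c * d + h[1]
    # Multiply by D^{-1}.
    s1 = d_inv[0][0] * v1 + d_inv[0][1] * v2
    s2 = d_inv[1][0] * v1 + d_inv[1][1] * v2
    return s1, s2


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(upmix(2.0, 0.5, (0.0, 0.0), identity))
```

With an identity D^{-1} and H equal to zero, the first output is the downmix itself and the second is the prediction C·d.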
Abstract:
In the transition into the logarithmic range, the entire bit width of the result that depends linearly on the square of the value need not be considered. Rather, the result for a value with x bits can be scaled such that a representation of the result with fewer than x bits is sufficient to obtain the logarithmic representation based thereon. The effect of the scaling factor on the resulting logarithmic representation may be compensated for, without any loss of dynamics, by adding to or subtracting from the scaled logarithmic representation a correction value obtained by applying the logarithm function to the scaling factor. This way, a method and an apparatus for creating a representation of a result linearly dependent upon a square of a value are provided so that the calculation is simple and/or possible with little hardware expenditure.
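The compensation step relies on the identity log(s·r) = log(s) + log(r). As a minimal sketch, assume the scaling is a right shift by a hypothetical number of bits `shift` (scaling factor 2^-shift) and that base-2 logarithms are used; neither choice is stated in the abstract.

```python
import math


def log2_of_square(value, shift):
    """Approximate log2(value**2) from a scaled, narrower square.

    Assumptions: the scaling is a right shift by `shift` bits and the
    logarithm is base 2, so the correction value is simply `shift`.
    """
    square = value * value
    scaled = square >> shift  # scaled result needs fewer bits
    if scaled == 0:
        raise ValueError("shift too large: scaled result is zero")
    # Correction: -log2(2**-shift) = shift, added back after the logarithm.
    return math.log2(scaled) + shift


if __name__ == "__main__":
    print(log2_of_square(1024, 12), math.log2(1024 ** 2))
    print(log2_of_square(1000, 12), math.log2(1000 ** 2))
```

For inputs whose square is not a multiple of 2^shift, the truncation in the shift introduces a small error, so the result is an approximation rather than the exact logarithm.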
Abstract:
A device for generating a binaural signal based on a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration having a virtual sound source position associated with each channel is described. It includes a correlation reducer for differently processing, and thereby reducing a correlation between, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels, in order to obtain an inter-similarity reduced set of channels; a plurality of directional filters; a first mixer for mixing outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener; and a second mixer for mixing outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener. According to another aspect, a center level reduction for forming the downmix for a room processor is performed. According to yet another aspect, an inter-similarity decreasing set of head-related transfer functions is formed.
Abstract:
An audio encoder has a common preprocessing stage, an information sink based encoding branch such as a spectral domain encoding branch, an information source based encoding branch such as an LPC-domain encoding branch, and a switch, controlled by a decision stage, for switching between these branches at the inputs into these branches or at the outputs of these branches. An audio decoder has a spectral domain decoding branch, an LPC-domain decoding branch, one or more switches for switching between the branches, and a common post-processing stage for post-processing a time-domain audio signal to obtain a post-processed audio signal.
Abstract:
For determining an estimate of a need for information units for encoding a signal, a measure for the distribution of the energy in the frequency band is taken into account in addition to the admissible interference for a frequency band and an energy of the frequency band. With this, a better estimate of the need for information units is obtained, so that coding can be done more efficiently and more accurately.
Abstract:
A method for detecting a transient in a discrete-time audio signal is performed completely in the time domain and includes the step of segmenting the discrete-time audio signal so as to generate consecutive segments of the same length with unfiltered discrete-time audio signals xs(T−1). The discrete-time audio signal in a current segment is subsequently filtered. Then either the energy of the filtered discrete-time audio signal in the current segment can be compared with the energy of the filtered discrete-time audio signal in a preceding segment or a current relationship between the energy of the filtered discrete-time audio signal in the current segment and the energy of the unfiltered discrete-time audio signal in the current segment can be formed and this current relationship compared with a preceding corresponding relationship. On the basis of the one and/or the other of these comparisons it is detected whether a transient is present in the discrete-time audio signal.
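The two comparisons described above can be sketched in a few lines of time-domain code. This is a sketch under assumptions: the first-order difference used as the filter, the threshold factors, and all function names are hypothetical, since the abstract specifies neither the filter nor the decision thresholds.

```python
def highpass(segment):
    """Stand-in filter (assumption): first-order difference of the segment."""
    return [segment[i] - segment[i - 1] for i in range(1, len(segment))]


def energy(samples):
    return sum(s * s for s in samples)


def detect_transients(signal, seg_len, energy_factor=4.0, ratio_factor=4.0):
    """Return one flag per full segment; True marks a detected transient.

    A segment is flagged if its filtered energy, or its ratio of filtered to
    unfiltered energy, exceeds the preceding segment's value by the given
    (hypothetical) factor.
    """
    eps = 1e-12  # avoids division by zero and zero thresholds
    flags = []
    prev_energy = None
    prev_ratio = None
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        e_filt = energy(highpass(seg))
        ratio = e_filt / (energy(seg) + eps)
        hit = False
        if prev_energy is not None:
            hit = (e_filt > energy_factor * (prev_energy + eps)
                   or ratio > ratio_factor * (prev_ratio + eps))
        flags.append(hit)
        prev_energy, prev_ratio = e_filt, ratio
    return flags


if __name__ == "__main__":
    quiet = [0.1] * 8
    attack = [0.1] * 4 + [1.0, -1.0, 1.0, -1.0]
    print(detect_transients(quiet + attack + quiet, seg_len=8))
```

The first segment is never flagged because it has no predecessor to compare against. Either comparison alone would also match the abstract's "one and/or the other" wording; the sketch combines them with a logical OR.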