Abstract:
To analyze an information signal, significant short-time spectra are extracted from it, the means for extracting being configured to extract those short-time spectra which come closer to a specific characteristic than other short-time spectra of the information signal. The extracted short-time spectra are then decomposed into component signals using independent component analysis (ICA), a component signal spectrum representing a profile spectrum of a tone source which generates a tone corresponding to the characteristic sought. From a sequence of short-time spectra of the information signal and from the profile spectra determined, an amplitude envelope is finally calculated for each profile spectrum, the amplitude envelope indicating how the profile spectrum of a tone source changes over time. The profile spectra and all the amplitude envelopes associated therewith provide a description of the information signal which may be evaluated further, for example for transcription purposes in the case of a music signal.
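The extraction step can be pictured as ranking short-time spectra by how well they match the characteristic sought. The sketch below is only an illustration: the names `spectral_flatness` and `select_significant` are hypothetical, and spectral flatness is an assumed stand-in for "a specific characteristic" (the abstract does not name one).

```python
import math


def spectral_flatness(spectrum, eps=1e-12):
    """Geometric mean over arithmetic mean of the magnitudes; close to 1 for a flat spectrum."""
    n = len(spectrum)
    geometric = math.exp(sum(math.log(v + eps) for v in spectrum) / n)
    arithmetic = sum(spectrum) / n
    return geometric / (arithmetic + eps)


def select_significant(spectra, score, count):
    """Return the indices (in time order) of the `count` spectra with the highest score."""
    ranked = sorted(range(len(spectra)), key=lambda i: score(spectra[i]), reverse=True)
    return sorted(ranked[:count])


# Four short-time magnitude spectra; frames 0 and 2 are flat, frames 1 and 3 are peaky.
frames = [
    [1.0, 1.0, 1.0, 1.0],
    [4.0, 0.01, 0.01, 0.01],
    [2.0, 2.0, 2.0, 2.0],
    [0.01, 0.01, 3.0, 0.01],
]
print(select_significant(frames, spectral_flatness, 2))  # [0, 2]
```

Any other scoring function can be passed as `score`; only the selected spectra would then be handed to the ICA stage.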
Abstract:
The envelope of a decorrelated signal derived from an original signal can be shaped without introducing additional distortion when a spectral flattener is used to spectrally flatten the spectra of the decorrelated signal and of the original signal before the flattened spectra are used for deriving a gain factor describing the energy distribution between the flattened spectra, and when the gain factor thus derived is used by an envelope shaper to temporally shape the envelope of the decorrelated signal.
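A rough sketch of the flatten, compare, and shape sequence follows. It assumes the signals are given as time slots of subband magnitudes and flattens each band by its long-term RMS; both the data layout and this particular flattening method are assumptions, as are the names `flatten` and `shape_envelope`.

```python
import math


def flatten(slots, eps=1e-12):
    """Divide each band by its RMS over all time slots (an assumed flattening method)."""
    bands = len(slots[0])
    rms = [math.sqrt(sum(s[b] ** 2 for s in slots) / len(slots)) + eps for b in range(bands)]
    return [[s[b] / rms[b] for b in range(bands)] for s in slots]


def shape_envelope(original, decorrelated, eps=1e-12):
    """Scale each time slot of the decorrelated signal by a gain from the flattened energies."""
    flat_orig, flat_decorr = flatten(original), flatten(decorrelated)
    shaped = []
    for slot_o, slot_d, raw in zip(flat_orig, flat_decorr, decorrelated):
        energy_o = sum(v * v for v in slot_o)
        energy_d = sum(v * v for v in slot_d)
        gain = math.sqrt(energy_o / (energy_d + eps))
        shaped.append([gain * v for v in raw])
    return shaped


# The original has all its energy in the first slot; the decorrelated signal is smeared in time.
original = [[4.0, 0.4], [0.0, 0.0]]
decorrelated = [[2.0, 0.2], [2.0, 0.2]]
shaped = shape_envelope(original, decorrelated)
```

In this toy input the second slot of `shaped` is silenced and the first is boosted, so the temporal envelope of the original is restored while the spectral tilt of the decorrelated signal is left unchanged.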
Abstract:
Significant short-time spectra are extracted from an information signal, the means for extracting being configured to extract those short-time spectra which come closer to a specific characteristic than others. The extracted short-time spectra are then decomposed into component signals using independent component analysis (ICA), a component signal spectrum representing a profile spectrum of a tone source which generates a tone corresponding to the characteristic sought. From a sequence of short-time spectra of the information signal and from the profile spectra determined, an amplitude envelope is calculated for each profile spectrum to indicate how a tone source profile spectrum changes over time. The profile spectra and all the amplitude envelopes associated therewith provide a description of the information signal which may be evaluated further, for example for transcription purposes in the case of a music signal.
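Once profile spectra are available, an amplitude envelope can be obtained by measuring how strongly each profile is present in every short-time spectrum. The sketch below assumes the profiles do not overlap in frequency, so a plain projection is enough; overlapping profiles would call for a least-squares fit instead. The function name `amplitude_envelopes` is hypothetical.

```python
def amplitude_envelopes(spectra, profiles):
    """Project every short-time spectrum onto every profile spectrum.

    Returns one envelope (list over time) per profile. Assumes non-overlapping profiles.
    """
    envelopes = []
    for profile in profiles:
        norm = sum(v * v for v in profile)
        envelopes.append(
            [sum(a * b for a, b in zip(spectrum, profile)) / norm for spectrum in spectra]
        )
    return envelopes


profiles = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
spectra = [[2.0, 0.0, 0.0], [0.0, 3.0, 3.0], [1.0, 0.5, 0.5]]
print(amplitude_envelopes(spectra, profiles))  # [[2.0, 0.0, 1.0], [0.0, 3.0, 0.5]]
```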
Abstract:
An apparatus constructs a multi-channel output signal using an input signal and parametric side information, the input signal including a first input channel and a second input channel derived from an original multi-channel signal, and the parametric side information describing interrelations between channels of the original multi-channel signal. For synthesizing first and second output channels on one side of an assumed listener position, the apparatus uses base channels which are different from each other. The base channels differ from each other because of a coherence measure. Coherence between the base channels (for example for the reconstructed left and left surround channels) is reduced by calculating the base channel for one of those channels as a combination of the input channels, the combination being determined by the coherence measure. Thus, a high subjective quality of the reconstruction can be obtained because the original front/back coherence is approximated.
Abstract:
A method for detecting a transient in a discrete-time audio signal is performed completely in the time domain and includes the step of segmenting the discrete-time audio signal so as to generate consecutive segments of the same length with unfiltered discrete-time audio signals xs(T−1). The discrete-time audio signal in a current segment is subsequently filtered. Then either the energy of the filtered discrete-time audio signal in the current segment can be compared with the energy of the filtered discrete-time audio signal in a preceding segment or a current relationship between the energy of the filtered discrete-time audio signal in the current segment and the energy of the unfiltered discrete-time audio signal in the current segment can be formed and this current relationship compared with a preceding corresponding relationship. On the basis of the one and/or the other of these comparisons it is detected whether a transient is present in the discrete-time audio signal.
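The two comparisons described above can be sketched in a few lines. This is a minimal illustration, not the claimed method: the first-order difference used as the filter, the segment length, and the decision factor are all assumptions, and the function names are hypothetical.

```python
def segments(signal, length):
    """Split the signal into consecutive segments of equal length (incomplete tail dropped)."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]


def highpass(seg):
    """First-order difference, used here as a crude high-pass filter (an assumption)."""
    return [seg[i] - seg[i - 1] for i in range(1, len(seg))]


def energy(seg):
    return sum(x * x for x in seg)


def detect_transients(signal, length=8, factor=4.0, eps=1e-12):
    """Return indices of segments flagged by either of the two comparisons."""
    hits = []
    prev_filtered = prev_ratio = None
    for idx, seg in enumerate(segments(signal, length)):
        filtered = energy(highpass(seg))
        # Relationship between filtered and unfiltered energy in the current segment.
        ratio = filtered / (energy(seg) + eps)
        if prev_filtered is not None:
            jump_in_energy = filtered > factor * (prev_filtered + eps)
            jump_in_ratio = ratio > factor * (prev_ratio + eps)
            if jump_in_energy or jump_in_ratio:
                hits.append(idx)
        prev_filtered, prev_ratio = filtered, ratio
    return hits


quiet = [0.0] * 16
click = [1.0, -1.0] * 4
print(detect_transients(quiet + click + [0.0] * 8))  # [2]
```

Everything runs on time-domain samples; no transform is involved, which is the point the abstract stresses.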
Abstract:
An integer transform, which provides integer output values, carries out the TDAC function of an MDCT in the time domain before the forward transform. In overlapping windows, this results in a Givens rotation which may be represented by lifting matrices, wherein time-discrete sampled values of an audio signal are first combined in pairs to form a vector, which is then multiplied sequentially by the lifting matrices. After each multiplication of a vector by a lifting matrix, a rounding step is carried out such that, on the output side, only integers will result. By transforming the windowed integer sampled values with an integer transform, a spectral representation with integer spectral values may be obtained. The inverse mapping with an inverse rotation matrix and corresponding inverse lifting matrices results in an exact reconstruction.
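The rounding-after-each-lifting-step idea can be checked with a small sketch. A rotation by an angle with nonzero sine factors into three lifting (shear) steps with coefficients (cos - 1)/sin, sin, and (cos - 1)/sin; rounding inside each step keeps the values integer, and undoing the steps in reverse order restores the input exactly. The function names are hypothetical, and the sketch covers only the rotation of one sample pair, not a full windowing or transform stage.

```python
import math


def int_rotate(x, y, angle):
    """Integer approximation of a Givens rotation via three rounded lifting steps."""
    c, s = math.cos(angle), math.sin(angle)
    t = (c - 1.0) / s  # requires sin(angle) != 0
    x = x + round(t * y)
    y = y + round(s * x)
    x = x + round(t * y)
    return x, y


def int_rotate_inverse(x, y, angle):
    """Undo int_rotate exactly by subtracting the same rounded terms in reverse order."""
    c, s = math.cos(angle), math.sin(angle)
    t = (c - 1.0) / s
    x = x - round(t * y)
    y = y - round(s * x)
    x = x - round(t * y)
    return x, y


angle = math.pi / 5
pair = (1000, -321)
rotated = int_rotate(*pair, angle)
assert int_rotate_inverse(*rotated, angle) == pair  # exact reconstruction
```

Reconstruction is exact because each inverse step recomputes the identical rounded term from a value the forward step left unchanged.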
Abstract:
For producing a fingerprint of an audio signal, use is made of information defining a plurality of predetermined fingerprint modi. All of the fingerprint modi relate to the same type of fingerprint, but they provide fingerprints that differ from each other with regard to their data volume, on the one hand, and to their characterizing strength for characterizing the audio signal, on the other hand. The fingerprint modi are predetermined such that a fingerprint in accordance with a fingerprint modus having a first characterizing strength is convertible to a fingerprint in accordance with a fingerprint modus having a second characterizing strength, without using the audio signal. A predetermined fingerprint modus of the plurality of predetermined fingerprint modi is set and subsequently used for computing a fingerprint using the audio signal. Because fingerprints produced by the different fingerprint modi are convertible, a flexible compromise between data volume and characterizing strength can be set for certain applications without having to regenerate a fingerprint database with each change of the fingerprint modus. Fingerprint representations scaled with regard to time or frequency may readily be converted to a different fingerprint modus.
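Convertibility without the audio signal can be illustrated with a fingerprint that is scaled in both frequency and time. The sketch assumes a fingerprint stored as a list of frames, each holding one value per frequency band; that layout, the averaging rule, and the name `convert_fingerprint` are assumptions made for illustration.

```python
def convert_fingerprint(fingerprint, bands, merge):
    """Derive a smaller fingerprint from a larger one without touching the audio.

    bands: number of frequency bands to keep per frame (frequency scaling).
    merge: number of consecutive frames averaged into one (time scaling).
    """
    reduced = [frame[:bands] for frame in fingerprint]
    converted = []
    for start in range(0, len(reduced) - merge + 1, merge):
        group = reduced[start:start + merge]
        converted.append([sum(column) / merge for column in zip(*group)])
    return converted


detailed = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0, 8.0],
    [7.0, 8.0, 9.0, 10.0],
]
print(convert_fingerprint(detailed, bands=2, merge=2))  # [[2.0, 3.0], [6.0, 7.0]]
```

A database stored in the detailed form could be served in the compact form on demand, which is the benefit the abstract describes.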
Abstract:
Prior to embedding a watermark in an audio signal, a spectral representation of the audio signal and a spectral representation of the watermark signal are determined. The spectral representation of the watermark signal is then processed on the basis of a psychoacoustic masking threshold of the audio signal. The processed watermark signal is combined with the audio signal to obtain an audio signal bearing a watermark. The spectral representation of the watermark signal is processed iteratively as follows: first a predetermined watermark initial value is selected, then the interference introduced into the spectral representation of the audio signal after a quantization of the spectral representation of the audio signal is determined and then, if the interference introduced by the watermark initial value exceeds the predetermined interference threshold, the watermark initial value is modified progressively until the resulting interference introduced into the spectral representation of the audio signal after quantization is less than or equal to the predetermined interference threshold. The modified watermark initial value at the end of the iteration is used as the processed watermark signal to be combined with the audio signal. As a result it is no longer possible for a watermark to be quantized out. Instead, full control over the energy of the watermark is achieved. A watermark can therefore be embedded in an audio signal to provide either the best possible degree of watermark detectability or the best possible audio quality.
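The iteration can be sketched for a single spectral line. The uniform quantizer, the definition of interference as the change the watermark causes in the quantized value, and the multiplicative shrink rule are all assumptions; the abstract does not specify them. The names `quantize` and `fit_watermark_line` are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer with the given step size (an assumed quantizer)."""
    return step * round(value / step)


def fit_watermark_line(audio, wm_init, threshold, step, shrink=0.8, max_iter=100):
    """Shrink the watermark value until its interference after quantization is acceptable."""
    wm = wm_init
    for _ in range(max_iter):
        # Interference measured on the quantized spectral value (an assumption).
        interference = abs(quantize(audio + wm, step) - quantize(audio, step))
        if interference <= threshold:
            return wm
        wm *= shrink
    return 0.0  # give up: embed nothing on this line


wm = fit_watermark_line(audio=10.0, wm_init=5.0, threshold=2.0, step=1.0)
```

Starting from 5.0, the value is reduced step by step until the quantized spectrum moves by no more than the threshold of 2.0; a value that already satisfies the threshold is returned unchanged.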
Abstract:
Parameters that are a measure for a characteristic of a channel or of a pair of channels with respect to another channel of a multi-channel signal can be quantized more efficiently using a quantization rule that is generated based on a relation between an energy measure of the channel or the pair of channels and an energy measure of the multi-channel signal. With the quantization rule generated taking a psychoacoustic approach into account, the size of an encoded representation of the multi-channel signal can be decreased by coarser quantization without significantly degrading the perceptual quality of the multi-channel signal when reconstructed from the encoded representation.
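One way to picture an energy-dependent quantization rule is a step size that grows when the channel contributes little to the total energy. The two step sizes, the ratio threshold, and the function names below are illustrative assumptions, not values from the source.

```python
def quantizer_step(channel_energy, total_energy, fine=1.0, coarse=4.0, ratio_threshold=0.1):
    """Pick a coarser step when the channel carries only a small share of the total energy."""
    ratio = channel_energy / total_energy if total_energy > 0 else 0.0
    return fine if ratio >= ratio_threshold else coarse


def quantize_param(value, step):
    """Uniform quantization of a parameter value with the chosen step size."""
    return step * round(value / step)


loud_step = quantizer_step(50.0, 100.0)   # channel holds half of the energy
quiet_step = quantizer_step(1.0, 100.0)   # channel holds 1 % of the energy
print(quantize_param(5.3, loud_step), quantize_param(5.3, quiet_step))  # 5.0 4.0
```

The quiet channel's parameter is represented on a coarser grid and so needs fewer distinct code values, on the assumption that errors in a low-energy channel are less audible.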
Abstract:
An apparatus and a method for generating a multi-channel synthesizer control signal, a multi-channel synthesizer, a method of generating an output signal from an input signal and a machine-readable storage medium are provided. On the encoder side, a multi-channel input signal is analyzed to obtain smoothing control information. This information is used by a decoder-side multi-channel synthesis for smoothing quantized transmitted parameters, or values derived from the quantized transmitted parameters, in order to provide an improved subjective audio quality, in particular for slowly moving point sources and for rapidly moving point sources having tonal material such as fast moving sinusoids.
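On the decoder side, controlled smoothing can be sketched as a first-order low-pass applied to the quantized parameters only where the transmitted control information enables it. The per-frame on/off flag, the fixed coefficient `alpha`, and the name `smooth_parameters` are assumptions for illustration.

```python
def smooth_parameters(quantized, smoothing_on, alpha=0.5):
    """Smooth quantized parameters frame by frame where the control flag is set.

    alpha weights the previous output; with the flag off the quantized value passes through.
    """
    smoothed = []
    previous = None
    for value, enabled in zip(quantized, smoothing_on):
        if enabled and previous is not None:
            previous = alpha * previous + (1.0 - alpha) * value
        else:
            previous = value
        smoothed.append(previous)
    return smoothed


params = [0.0, 4.0, 4.0, 0.0]
flags = [False, True, True, False]
print(smooth_parameters(params, flags))  # [0.0, 2.0, 3.0, 0.0]
```

With the flag set, the jump between quantizer levels is spread over several frames; with the flag cleared, the parameter follows the transmitted value immediately.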