Abstract:
The invention makes it possible to combine a scalable audio coder with TNS technology. According to the inventive method for encoding time signals (x1) sampled at a first sampling rate, second time signals (x2) with a sampling rate lower than the first sampling rate are generated (12). The second time signals (x2) are then encoded (14) according to a first coding algorithm and written into a bit stream (xAUS) (16). The encoded second time signals (x2c) are then decoded (14) again and transformed (23, 24) into the frequency domain, as are the first time signals. TNS prediction coefficients are then calculated (25) from a spectral representation (X1) of the first time signals. The transformed output signal (X2cd) of the coder/decoder (14) with the first coding algorithm and the spectral representation (X1) of the first time signals are each subjected to a prediction over frequency (27) in order to obtain spectral residual values for both signals, using the prediction coefficients calculated on the basis of the first time signals alone. These two signals are evaluated against each other (26, 28). The evaluated spectral residual values (Xb) are then encoded by means of a second coding algorithm in order to obtain coded evaluated spectral residual values (Xcb). These coded evaluated spectral residual values are written into the bit stream (xAUS) together with side information containing the prediction coefficients.
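The shared prediction step can be illustrated with a minimal sketch. It assumes real-valued spectra, prediction coefficients obtained with the autocorrelation method and the Levinson-Durbin recursion, and a simple difference of the two residuals as a stand-in for the evaluation stage; the abstract fixes none of these choices, and all function names are hypothetical.

```python
def autocorrelation(values, max_lag):
    """Autocorrelation of a spectrum for lags 0..max_lag (along frequency)."""
    return [sum(values[i] * values[i - lag] for i in range(lag, len(values)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Prediction-error filter a[0..order] with a[0] == 1 (assumed method)."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for m in range(1, order + 1):
        if error <= 0.0:
            break
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / error
        updated = a[:]
        for j in range(1, m):
            updated[j] = a[j] + k * a[m - j]
        updated[m] = k
        a = updated
        error *= 1.0 - k * k
    return a

def predict_over_frequency(spectrum, a):
    """Apply the prediction-error filter along the frequency axis."""
    return [sum(a[j] * spectrum[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(spectrum))]

def tns_residuals(x1_spectrum, x2cd_spectrum, order=1):
    """Coefficients come from X1 alone and are applied to both spectra."""
    a = levinson_durbin(autocorrelation(x1_spectrum, order), order)
    return (a,
            predict_over_frequency(x1_spectrum, a),
            predict_over_frequency(x2cd_spectrum, a))

# Hypothetical toy spectra: X1 decays geometrically, X2cd is a scaled copy.
X1 = [0.9 ** n for n in range(64)]
X2cd = [0.8 * v for v in X1]
a, res1, res2 = tns_residuals(X1, X2cd)
# Assumed stand-in for the evaluation stage: difference of the residuals.
difference = [p - q for p, q in zip(res1, res2)]
```

Because both spectra pass through the same filter, a decoder that receives the coefficients as side information can undo the prediction once for the combined signal.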
Abstract:
The invention relates to a method for embedding a watermark in an audio signal, according to which a spectral representation of the audio signal and a spectral representation of the watermark signal are first determined (14, 16). The spectral representation of the watermark signal is then processed on the basis of a psychoacoustic masking threshold (24) of the audio signal (22). The processed watermark signal is combined with the audio signal (18). The spectral representation of the watermark signal is processed iteratively as follows: first, a predetermined watermark initial value (26) is selected; next, the interference that this initial value introduces into the spectral representation of the audio signal after quantization of that representation is determined (28). The watermark initial value is then modified until the interference introduced into the spectral representation of the audio signal by the modified watermark initial value after quantization is less than or equal to a predetermined interference threshold (32).
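The iterative adjustment can be sketched for a single spectral value as follows. This is only an illustration under stated assumptions: a uniform scalar quantizer, interference measured as the change in the quantized value, and halving as the modification rule. The function names and parameters are hypothetical.

```python
def quantize(value, step):
    """Uniform scalar quantizer (assumed; the abstract names no quantizer)."""
    return step * round(value / step)

def fit_watermark(audio_value, initial_watermark, step, threshold,
                  shrink=0.5, max_iterations=50):
    """Shrink the watermark value until its interference after quantization
    is less than or equal to the threshold."""
    watermark = initial_watermark
    for _ in range(max_iterations):
        # Assumed interference measure: change of the quantized spectral value.
        interference = abs(quantize(audio_value + watermark, step)
                           - quantize(audio_value, step))
        if interference <= threshold:
            return watermark
        watermark *= shrink  # assumed modification rule: halve the value
    return 0.0

# Example: the threshold would come from the psychoacoustic masking model.
result = fit_watermark(audio_value=1.0, initial_watermark=1.0,
                       step=0.25, threshold=0.25)
```

In this example the value is halved twice (1.0, 0.5, 0.25) before the interference drops to the threshold.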
Abstract:
According to the inventive method for determining a coding block raster on which a decoded signal is based, a segment of the decoded signal is picked out first (1), this segment beginning at a certain output sampling value of the decoded signal. Said segment is then converted into a spectral representation (12), whereupon said spectral representation is evaluated in relation to a predetermined criterion (13) in order to obtain an evaluation result for the segment. This procedure is repeated for a plurality of different segments beginning at different output sampling values, in order to obtain a plurality of evaluation results. Finally, said plurality of evaluation results is searched (14) in order to establish the evaluation result that has an extreme value compared to the other evaluation results, so that it can be assumed that the segment to which this evaluation result is allocated matches the coding block raster on which the decoded signal is based. According to the invention, this method can be used to determine the coding block raster for any decoded signal that carries no explicit information about its coding block raster.
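The search over segment offsets can be illustrated with a toy block codec. The sketch below assumes a non-overlapping orthonormal DCT with uniform quantization and, as the evaluation criterion, the distance of the coefficients from a quantization grid whose step size is known. Both are assumptions made for illustration, and the evaluation result for each offset is averaged over several segments for robustness.

```python
import math
import random

def dct(block):
    """Orthonormal DCT-II of one block."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(block[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [sum(coeffs[k] * math.sqrt((1.0 if k == 0 else 2.0) / n)
                * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]

def toy_codec(signal, size, step):
    """Hypothetical block coder: transform, quantize, inverse transform."""
    decoded = []
    for start in range(0, len(signal) - size + 1, size):
        coeffs = dct(signal[start:start + size])
        decoded.extend(idct([step * round(c / step) for c in coeffs]))
    return decoded

def grid_distance(segment, step):
    """Assumed criterion: how far the coefficients lie from the grid."""
    return sum(abs(c / step - round(c / step)) for c in dct(segment))

def find_block_raster(decoded, size, step):
    """Evaluate every candidate offset and return the one with the minimum."""
    results = []
    for offset in range(size):
        starts = range(offset, len(decoded) - size + 1, size)
        total = sum(grid_distance(decoded[s:s + size], step) for s in starts)
        results.append(total / len(starts))
    return min(range(size), key=results.__getitem__)

rng = random.Random(1)
original = [rng.uniform(-1.0, 1.0) for _ in range(96)]
# Dropping 3 samples hides the raster; blocks now start at offset 5.
shifted = toy_codec(original, 8, 0.25)[3:]
offset = find_block_raster(shifted, 8, 0.25)
```

Only segments aligned with the original blocks reproduce coefficients that sit on the grid, which is why the minimum marks the raster.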
Abstract:
The invention relates to a device (10) for producing an encoded data stream which represents an audio and/or video signal. Said device comprises an encoder (16) for encoding an input signal (12) to produce, as its output signal, a data stream with a defined data stream syntax. Said device further comprises an encryption device (18) which is coupled to the encoder (16) and which, on the basis of a code, influences encoding-related data (20a) and/or the output signal (20b) of the encoder in an unequivocally reversible manner, such that the produced encoded data stream contains useful information that differs from the useful information of a data stream that would be produced by the device without the encryption device, while still having the defined data stream syntax. The invention thus provides flexible data stream encryption in which the degree of encryption can be freely selected, so that a decoder user who does not possess the code still gets a rough idea of the audio and/or video signal, which might prompt him/her to buy the code in order to hear or view the signal in its full quality. The encoder-specific encryption and decryption concept can be implemented in existing encoders/decoders with little effort.
Abstract:
The invention relates to a method for detecting a transient in a discrete-time audio signal (x(k)) which is carried out entirely in the time domain. Said method comprises a step in which the discrete-time audio signal is segmented so as to generate consecutive segments of identical length with unfiltered discrete-time audio signals (xs(T), xs(T-1), xs(T-2), ...), after which the discrete-time audio signal (xs(T)) in a current segment is filtered. Thereafter there are two options: either the energy (Ef(T)) of the filtered discrete-time audio signal (Ys(T)) in the current segment is compared with the energy (Ef(T-1)) of the filtered discrete-time audio signal (Ys(T-1)) in a preceding segment, or a current ratio is formed between the energy (Ef(T)) of the filtered discrete-time audio signal (Ys(T)) in the current segment and the energy (Eu(T)) of the unfiltered discrete-time audio signal (xs(T)) in the current segment, and said current ratio is compared with a corresponding preceding ratio. On the basis of the one comparison and/or the other comparison it is determined whether a transient is present in the discrete-time audio signal.
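Both comparisons can be sketched in a few lines. The example assumes a first-order difference as the filter (a crude high-pass) and a fixed factor of 4 as the decision threshold; neither is specified in the abstract, and the function names are hypothetical.

```python
import math

def split_segments(signal, length):
    """Consecutive segments of identical length."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]

def highpass(segment):
    """Assumed filter: first-order difference within the segment."""
    return [segment[i] - segment[i - 1] for i in range(1, len(segment))]

def energy(values):
    return sum(v * v for v in values)

def detect_transients(signal, length, factor=4.0, eps=1e-12):
    """Flag a segment if Ef(T) or the ratio Ef(T)/Eu(T) jumps by `factor`
    relative to the preceding segment."""
    flags = []
    prev_ef = prev_ratio = None
    for seg in split_segments(signal, length):
        ef = energy(highpass(seg))          # energy of the filtered segment
        ratio = ef / max(energy(seg), eps)  # filtered vs. unfiltered energy
        if prev_ef is None:
            flags.append(False)             # no preceding segment to compare
        else:
            energy_jump = ef > factor * max(prev_ef, eps)
            ratio_jump = ratio > factor * max(prev_ratio, eps)
            flags.append(energy_jump or ratio_jump)
        prev_ef, prev_ratio = ef, ratio
    return flags

# Slow sine with a click at sample 70, which lies in the third segment.
signal = [0.5 * math.sin(2 * math.pi * n / 64) for n in range(128)]
signal[70] += 1.0
flags = detect_transients(signal, 32)
```

The click adds mostly high-frequency energy, so both the filtered energy and the filtered-to-unfiltered ratio rise sharply in the third segment only.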
Abstract:
The invention relates to a method for masking defects in a stream of audio data. Defects in a stream of audio data which was previously intact are detected and a spectral energy of a sub-group of the intact audio data is then calculated. After a model has been created for replacement data, based on the spectral energy calculated for the sub-group of intact audio data, replacement data is generated for defective or absent audio data, said replacement data corresponding to the sub-group, based on the model.
Abstract:
The invention relates to a method for characterising a signal representing an audio content. A measure is determined (12) for a tonality of the signal, whereupon a statement is made (16) about the audio content of the signal on the basis of the measure determined for the tonality of the signal. The measure for the tonality is derived from a quotient whose numerator is an average sum value of the spectral components of the signal raised to a first power (x) and whose denominator is an average sum value of the spectral components raised to a second power (y), the first and second powers differing from each other. The measure for the tonality of the signal used for the content analysis is robust to signal distortion, e.g. by MP3 coding, and correlates highly with the content of the analysed signal.
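A quotient of this kind can be sketched as below. The example assumes magnitude spectra from a plain DFT, the powers x = 2 and y = 1, and an x-th and y-th root applied to the two averages so that the result does not depend on the signal level. The roots and the function names are assumptions that go beyond the abstract.

```python
import cmath
import math
import random

def magnitude_spectrum(signal):
    """Magnitudes of DFT bins 0..N/2 (direct DFT, fine for short frames)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def tonality_measure(signal, x=2.0, y=1.0):
    """Quotient of two averaged power sums of the spectral components.
    The roots are an assumed normalisation; a silent frame would divide
    by zero and needs separate handling."""
    mags = magnitude_spectrum(signal)
    mean_x = sum(m ** x for m in mags) / len(mags)
    mean_y = sum(m ** y for m in mags) / len(mags)
    return mean_x ** (1.0 / x) / mean_y ** (1.0 / y)

rng = random.Random(0)
sine = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]
tonal_value = tonality_measure(sine)
noisy_value = tonality_measure(noise)
```

With x greater than y, a spectrum dominated by a few peaks yields a large quotient, while a flat, noise-like spectrum yields a value close to 1.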
Abstract:
A device for analysing an audio signal with regard to the rhythm information in the audio signal using an auto-correlation function comprises a filter bank for splitting the audio signal into at least two partial-band signals. The partial-band signals are analysed for periodicity by means of an auto-correlation function (106a) in order to obtain raw rhythm information for the at least two partial-band signals. The raw rhythm information is processed (121) to give processed raw rhythm information for each partial-band signal, in order to reduce or eliminate the ambiguity of the auto-correlation function for periodic signals. The rhythm information for the audio signal is determined (122) on the basis of the processed raw rhythm information. Because the auto-correlation processing is applied to the partial bands, the ambiguity of the auto-correlation function is eliminated at the point where it arises, or rhythm components at double beat, which auto-correlation processing does not normally produce, are added, so that the rhythm information in the audio signal is determined more robustly.
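The per-band processing can be sketched as follows. The example assumes that the filter bank has already produced one envelope per partial band, and it uses one possible post-processing: subtracting copies of the auto-correlation function spread by the factors 2 and 3, which removes peaks at integer multiples of the period. The abstract does not prescribe this rule, and all names are hypothetical.

```python
def autocorrelation(values, max_lag):
    """Raw rhythm information: auto-correlation for lags 0..max_lag."""
    return [sum(values[i] * values[i + lag] for i in range(len(values) - lag))
            for lag in range(max_lag + 1)]

def spread(acf, factor):
    """Copy of the ACF stretched along the lag axis by an integer factor."""
    return [acf[lag // factor] if lag % factor == 0 else 0.0
            for lag in range(len(acf))]

def reduce_ambiguity(acf, factors=(2, 3)):
    """Assumed post-processing: subtract spread copies, clamp at zero."""
    processed = list(acf)
    for factor in factors:
        stretched = spread(acf, factor)
        processed = [max(p - q, 0.0) for p, q in zip(processed, stretched)]
    processed[0] = 0.0  # lag 0 carries no rhythm information
    return processed

def rhythm_period(band_envelopes, max_lag):
    """Process each band separately, then combine and pick the strongest lag."""
    combined = [0.0] * (max_lag + 1)
    for envelope in band_envelopes:
        processed = reduce_ambiguity(autocorrelation(envelope, max_lag))
        combined = [c + p for c, p in zip(combined, processed)]
    return max(range(1, max_lag + 1), key=combined.__getitem__)

# Two hypothetical band envelopes with a pulse every 8 samples.
band_a = [1.0 if n % 8 == 0 else 0.0 for n in range(64)]
band_b = [0.5 if n % 8 == 0 else 0.0 for n in range(64)]
period = rhythm_period([band_a, band_b], max_lag=32)
```

The raw auto-correlation of `band_a` peaks at lags 8, 16, 24 and 32; after the subtraction only the peak at lag 8 remains in that range.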
Abstract:
The invention relates to a method for characterising a signal representing an audio content. A quantity is determined (12) for a tonality of the signal, whereupon information relating to the audio content of the signal is obtained (16) on the basis of the quantity for the tonality of the signal. Said quantity for the tonality of the signal, used to analyse the content, is stable in relation to a signal distortion, e.g. resulting from MP3-coding, and has a high correlation with the content of the signal examined.
Abstract:
The invention relates to a device for processing a stereo audio signal comprising a first channel (L) and a second channel (R). The stereo signal is analysed (12) to obtain a measure for the bit quantity that a coder requires for coding the stereo audio signal using a coding algorithm. The first and the second channel are subsequently modified (14) when the measure for the bit quantity is greater than a predetermined value. The modification is carried out in such a way that the energy of a sum signal of the first and second modified channels (L', R') bears a predetermined ratio to the energy of a sum signal of the first and second channels, and that a difference signal of the first and second modified channels is attenuated in relation to the difference signal of the first and second channels. The side channel is attenuated in particular for audio coders that require a constant output bit rate, when the coding of stereo audio signals cannot meet the output bit rate of the coder. Stereo channel separation is thus given up in favour of an increased audio bandwidth or a reduction of quantisation interference.
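The channel modification can be sketched with a mid/side decomposition. The example assumes that the bit-demand measure is supplied by the coder as a plain number, that the predetermined energy ratio of the sum signals is 1, and that the side signal is scaled by a fixed gain. All names and parameter values are hypothetical.

```python
def attenuate_side(left, right, bit_demand, bit_budget, side_gain=0.5):
    """Return modified channels (L', R'). The sum signal is left unchanged
    and the difference signal is scaled by side_gain when the estimated
    bit demand exceeds the budget."""
    if bit_demand <= bit_budget:
        return list(left), list(right)
    new_left, new_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)               # sum (mid) component
        side = side_gain * 0.5 * (l - r)  # attenuated difference component
        new_left.append(mid + side)
        new_right.append(mid - side)
    return new_left, new_right

# Example: a fully separated stereo pair under a tight bit budget.
left, right = [1.0, 0.0], [0.0, 1.0]
mod_left, mod_right = attenuate_side(left, right, bit_demand=120, bit_budget=100)
```

Here `mod_left` is `[0.75, 0.25]` and `mod_right` is `[0.25, 0.75]`: the sum L' + R' equals L + R sample by sample, while the difference L' - R' is half of L - R.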