Abstract:
A method for performing a domain transformation of a digital signal from the time domain into the frequency domain and vice versa, the method including performing the transformation by a transforming element, the transforming element comprising a plurality of lifting stages, wherein the transformation corresponds to a transformation matrix and wherein at least one lifting stage of the plurality of lifting stages comprises at least one auxiliary transformation matrix and a rounding unit, the auxiliary transformation matrix comprising the transformation matrix itself or the corresponding transformation matrix of lower dimension. The method further comprises performing a rounding operation of the signal by the rounding unit after the transformation by the auxiliary transformation matrix.
Abstract:
Systems and methods for scalably encoding and decoding coded data are presented. One exemplary method for scalably coding data includes classifying, based upon at least one predetermined criterion, each of the plurality of data received as either (i) perceptually relevant data or (ii) perceptually irrelevant data. The perceptually relevant data is scalably coded, and the perceptually irrelevant data is non-scalably coded. Subsequently, the scalably coded perceptually relevant data and the non-scalably coded perceptually irrelevant data are combined into a coded data stream for transmission.
Abstract:
According to the process for determining a transforming element for a given transformation function, which transformation function comprises a transformation matrix and corresponds to a transformation of a digital signal from the time domain into the frequency domain or vice versa, the transformation matrix is decomposed into a rotation matrix (306) and an auxiliary matrix (307) which, when multiplied with itself, equals a permutation matrix multiplied with an integer diagonal matrix. Further, the rotation matrix (306) and the auxiliary matrix (307) are each decomposed into a plurality of lifting matrices (308). Further, the transforming element is determined to comprise a plurality of lifting stages (309) which correspond to the lifting matrices (308). The invention further provides a method for the transformation of a digital signal from the time domain into the frequency domain according to the transforming element determined by the process described above.
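As an illustration of how a rotation matrix can be decomposed into lifting matrices, the following is a minimal sketch of the well-known three-step lifting factorization of a 2x2 plane rotation, with rounding after each step so that integer inputs map to integer outputs and the mapping stays exactly invertible. It is not taken from the abstract: the function names, the use of Python's built-in `round`, and the restriction to a single 2x2 rotation are assumptions made for the example.

```python
import math

def lift_rotate(x, y, theta):
    """Approximate a rotation of (x, y) by theta with three rounded lifting steps.

    Uses R = [[1, p], [0, 1]] @ [[1, 0], [s, 1]] @ [[1, p], [0, 1]]
    with s = sin(theta) and p = (cos(theta) - 1) / sin(theta).
    Requires sin(theta) != 0.
    """
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s
    x = x + round(p * y)   # lifting step 1, followed by rounding
    y = y + round(s * x)   # lifting step 2, followed by rounding
    x = x + round(p * y)   # lifting step 3, followed by rounding
    return x, y

def lift_rotate_inverse(x, y, theta):
    """Undo lift_rotate exactly by subtracting the same rounded terms in reverse order."""
    s = math.sin(theta)
    p = (math.cos(theta) - 1.0) / s
    x = x - round(p * y)
    y = y - round(s * x)
    x = x - round(p * y)
    return x, y
```

Because each step changes only one component by a rounded function of the other, the inverse recomputes exactly the same rounded value and subtracts it, which is why the round trip is lossless even though the forward result only approximates the true rotation.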
Abstract:
Speech enhancement based on a psycho-acoustic model is disclosed that is capable of preserving the fidelity of speech while sufficiently suppressing noise including the processing artifact known as “musical noise”.
Abstract:
A method for encoding a digital signal into a scalable bitstream comprising quantizing the digital signal, and encoding the quantized signal to form a core-layer bitstream, performing an error mapping based on the digital signal and the core-layer bitstream to remove information that has been encoded into the core-layer bitstream, resulting in an error signal, bit-plane coding the error signal based on perceptual information of the digital signal, resulting in an enhancement-layer bitstream, wherein the perceptual information of the digital signal is determined using a perceptual model, and multiplexing the core-layer bitstream and the enhancement-layer bitstream, thereby generating the scalable bitstream. A method for decoding a scalable bitstream into a digital signal comprising de-multiplexing the scalable bitstream into a core-layer bitstream and an enhancement-layer bitstream, decoding and de-quantizing the core-layer bitstream to generate a core-layer signal, bit-plane decoding the enhancement-layer bitstream based on perceptual information of the digital signal, and performing an error mapping based on the bit-plane decoded enhancement-layer bitstream and the de-quantized core-layer signal, resulting in a reconstructed transformed signal, wherein the reconstructed transformed signal is the digital signal.
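The core-layer / enhancement-layer split can be sketched roughly as follows. This is a simplified illustration under several assumptions that are not in the abstract: the signal is a list of integers, the core layer is a uniform floor quantizer with a hypothetical step size `step`, the error mapping is plain subtraction of the de-quantized core signal, and the bit planes are sent from most to least significant without any perceptual ordering or entropy coding. The perceptual model is deliberately omitted.

```python
def encode(signal, step):
    """Split integer samples into core-layer indices and enhancement bit planes."""
    # Core layer: coarse quantizer indices (floor division keeps the error non-negative).
    core = [s // step for s in signal]
    # Error mapping: remove what the core layer already conveys; each error is in [0, step).
    error = [s - q * step for s, q in zip(signal, core)]
    nbits = max(1, (step - 1).bit_length())
    # Enhancement layer: one list of bits per plane, most significant plane first.
    planes = [[(e >> b) & 1 for e in error] for b in range(nbits - 1, -1, -1)]
    return core, planes

def decode(core, planes, step, nplanes=None):
    """Rebuild samples from the core layer plus the first nplanes bit planes."""
    nbits = len(planes)
    used = planes if nplanes is None else planes[:nplanes]
    error = [0] * len(core)
    for i, plane in enumerate(used):
        b = nbits - 1 - i  # bit position carried by this plane
        error = [e | (bit << b) for e, bit in zip(error, plane)]
    # Inverse error mapping: add the refined error back onto the de-quantized core signal.
    return [q * step + e for q, e in zip(core, error)]
```

Decoding with all planes reproduces the input exactly, while truncating the enhancement layer (a smaller `nplanes`) gives a coarser reconstruction, which is the sense in which the bitstream is scalable.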
Abstract:
According to one embodiment, a method for transmitting a digital signal is provided that includes dividing data representing the digital signal into a plurality of data blocks, processing each data block in accordance with a desired amount of data included in the data block, determining, for each processed data block, the size of the processed data block, generating a message including, in a message body of the message, the processed data blocks and, for each data block, a message field specifying the size of the processed data block and transmitting the message.
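One possible layout for such a message is sketched below. The details are assumptions for illustration only: "processing" a block is modeled as truncating it to a desired number of bytes, each size field is a 4-byte big-endian unsigned integer placed directly before its block, and the names `build_message` and `parse_message` are hypothetical.

```python
import struct

def build_message(data, block_len, target_sizes):
    """Divide data into blocks, truncate each to its target size, and frame them.

    Blocks beyond len(target_sizes) are dropped, since zip stops at the shorter input.
    """
    blocks = [data[i:i + block_len] for i in range(0, len(data), block_len)]
    body = bytearray()
    for block, target in zip(blocks, target_sizes):
        processed = block[:target]                  # assumed processing: truncation
        body += struct.pack(">I", len(processed))   # size field for this block
        body += processed                           # the processed block itself
    return bytes(body)

def parse_message(message):
    """Recover the processed blocks by reading each size field in turn."""
    blocks, pos = [], 0
    while pos < len(message):
        (size,) = struct.unpack_from(">I", message, pos)
        pos += 4
        blocks.append(message[pos:pos + size])
        pos += size
    return blocks
```

Carrying the size of every processed block in the message lets the receiver locate block boundaries even though the blocks end up with different lengths after processing.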
Abstract:
Transfer functions like Head Related Transfer Functions (HRTF) needed for binaural rendering are implemented efficiently by a subband-domain filter structure. In one implementation, amplitude, fractional-sample delay and phase-correction filters are arranged in cascade with one another and applied to subband signals that represent spectral content of an audio signal in frequency subbands. Other filter structures are also disclosed. These filter structures may be used advantageously in a variety of signal processing applications. A few examples of audio applications include signal bandwidth compression, loudness equalization, room acoustics correction and assisted listening for individuals with hearing impairments.
Abstract:
Enhancing speech components of an audio signal composed of speech and noise components includes controlling the gain of the audio signal in ones of its subbands, wherein the gain in a subband is reduced as the level of estimated noise components increases with respect to the level of speech components, wherein the level of estimated noise components is determined at least in part by (1) comparing an estimated noise components level with the level of the audio signal in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the input signal level in the subband exceeds the estimated noise components level in the subband by a limit for more than a defined time, or (2) obtaining and monitoring the signal-to-noise ratio in the subband and increasing the estimated noise components level in the subband by a predetermined amount when the signal-to-noise ratio in the subband exceeds a limit for more than a defined time.
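Option (1) above can be sketched for a single subband as follows. The parameter names and default values (`limit_db`, `hold_frames`, `step_db`), the use of dB levels per frame, and the choice to restart the timer after each increase are assumptions made for the example, not details given in the abstract.

```python
def track_noise_estimate(levels_db, noise_db, limit_db=10.0, hold_frames=50, step_db=1.0):
    """Return the noise estimate after each frame of one subband.

    The estimate is raised by step_db once the input level has exceeded it by more
    than limit_db for more than hold_frames consecutive frames.
    """
    count = 0
    history = []
    for level in levels_db:
        if level - noise_db > limit_db:
            count += 1
            if count > hold_frames:
                noise_db += step_db   # predetermined increase
                count = 0             # assumption: restart the timer after an increase
        else:
            count = 0                 # condition broken, so restart the timer
        history.append(noise_db)
    return history
```

For example, with a steady input of 30 dB, an initial estimate of 10 dB, `limit_db=10`, `hold_frames=3` and `step_db=5`, the estimate steps up to 15 dB on the fourth frame and to 20 dB on the eighth, after which the input no longer exceeds it by more than the limit.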
Abstract:
The invention relates to the coding of audio signals that may include both speech-like and non-speech-like signal components. It describes methods and apparatus for code excited linear prediction (CELP) audio encoding and decoding that employ linear predictive coding (LPC) synthesis filters controlled by LPC parameters, a plurality of codebooks each having codevectors, at least one codebook providing an excitation more appropriate for non-speech-like signals and at least one codebook providing an excitation more appropriate for speech-like signals, and a plurality of gain factors, each associated with a codebook. The encoding methods and apparatus select from the codebooks codevectors and/or associated gain factors by minimizing a measure of the difference between the audio signal and a reconstruction of the audio signal derived from the codebook excitations. The decoding methods and apparatus generate a reconstructed output signal from the LPC parameters, codevectors, and gain factors.
Abstract:
A method for transforming a digital signal from the time domain into the frequency domain and vice versa using a transformation function comprising a transformation matrix, the digital signal comprising data symbols which are grouped into a plurality of blocks, each block comprising a predefined number of the data symbols. The method includes the process of transforming two blocks of the digital signal by one transforming element, wherein the transforming element corresponds to a block-diagonal matrix comprising two sub-matrices, wherein each sub-matrix comprises the transformation matrix and the transforming element comprises a plurality of lifting stages and wherein each lifting stage comprises the processing of blocks of the digital signal by an auxiliary transformation and by a rounding unit.
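The idea of transforming two blocks with lifting stages that each apply a transformation and a rounding unit can be sketched as below. The sketch assumes the transformation matrix is an orthonormal DCT-IV matrix C, which is symmetric and its own inverse, and uses three lifting stages; NumPy, the function names, and the choice of DCT-IV are assumptions for illustration rather than details stated in the abstract. Without rounding, the three stages map the block pair (x1, x2) to (-C x2, C x1), that is, both blocks transformed up to a swap and a sign.

```python
import numpy as np

def dct4_matrix(n):
    """Orthonormal DCT-IV matrix (symmetric, and equal to its own inverse)."""
    k = np.arange(n)
    return np.sqrt(2.0 / n) * np.cos(np.pi / n * np.outer(k + 0.5, k + 0.5))

def rounded(C, x):
    """Auxiliary transformation followed by the rounding unit."""
    return np.rint(C @ x).astype(np.int64)

def int_transform_pair(x1, x2, C):
    """Three lifting stages; the result approximates (-C @ x2, C @ x1) with integers."""
    x2 = x2 + rounded(C, x1)   # stage 1
    x1 = x1 - rounded(C, x2)   # stage 2
    x2 = x2 + rounded(C, x1)   # stage 3
    return x1, x2

def int_inverse_pair(y1, y2, C):
    """Exact inverse: undo the stages in reverse order with the same rounded terms."""
    y2 = y2 - rounded(C, y1)
    y1 = y1 + rounded(C, y2)
    y2 = y2 - rounded(C, y1)
    return y1, y2
```

Each stage modifies only one block using a rounded transform of the other, so the inverse can recompute and subtract the identical rounded term, giving perfect reconstruction of the integer input.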