Abstract:
A multi-channel audio encoder (10) for encoding a multi-channel audio signal (101), e.g. a 5.1 channel audio signal, into a spatial down-mix (102), e.g. a stereo signal, and associated parameters (104, 105). The encoder (10) comprises first and second units (110, 120). The first unit (110) encodes the multi-channel audio signal (101) into the spatial down-mix (102) and parameters (104). These parameters (104) enable a multi-channel decoder (20) to reconstruct the multi-channel audio signal (203) from the spatial down-mix (102). The second unit (120) generates, from the spatial down-mix (102), parameters (105) that enable the decoder to reconstruct the spatial down-mix (202) from an alternative down-mix (103), e.g. a so-called artistic down-mix that has been manually mixed in a sound studio. In this way, the decoder (20) can efficiently handle the situation in which an alternative down-mix (103) is received instead of the regular spatial down-mix (102). In the decoder (20), the spatial down-mix (202) is first reconstructed from the alternative down-mix (103) and the parameters (105). Next, the spatial down-mix (202) is decoded into the multi-channel audio signal (203).
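The abstract does not say what form the parameters (105) take. As a minimal sketch, assume they are a single least-squares gain per channel and frame that maps the alternative down-mix onto the spatial down-mix. The function names `estimate_gain` and `reconstruct_spatial` and the toy sample values are hypothetical and do not come from the patent.

```python
def estimate_gain(spatial, alternative):
    """Encoder side (assumed): gain g minimizing sum((spatial - g * alternative)^2)."""
    energy = sum(a * a for a in alternative)
    if energy == 0.0:
        return 0.0  # silent alternative down-mix: no meaningful gain
    return sum(s * a for s, a in zip(spatial, alternative)) / energy


def reconstruct_spatial(alternative, gain):
    """Decoder side (assumed): approximate the spatial down-mix from the alternative one."""
    return [gain * a for a in alternative]


# Toy frame of one channel: the alternative mix is the spatial mix at half level.
spatial_frame = [0.2, -0.4, 0.6, -0.8]
alternative_frame = [0.1, -0.2, 0.3, -0.4]

gain = estimate_gain(spatial_frame, alternative_frame)         # parameter sent to the decoder
restored = reconstruct_spatial(alternative_frame, gain)        # decoder-side estimate
```

A practical scheme would more plausibly work per frequency band and per time segment, but the encoder-estimates, decoder-applies split is the same.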
Abstract:
Coding of an audio signal represented by a respective set of sampled signal values for each of a plurality of sequential segments is disclosed. The sampled signal values are analyzed (40) to determine one or more sinusoidal components for each of the plurality of sequential segments. The sinusoidal components are linked (42) across a plurality of sequential segments to provide sinusoidal tracks. For each sinusoidal track, a phase comprising a generally monotonically changing value is determined and an encoded audio stream including sinusoidal codes (r) representing said phase is generated (46).
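One way to obtain a generally monotonically changing phase along a track is to unwrap the measured per-frame phases, using the track frequencies to predict how far the phase advances between frames. The sketch below assumes that approach. The function name `unwrap_track`, the hop size, and the test values are illustrative assumptions, not details from the abstract.

```python
import math


def unwrap_track(phases, freqs, hop):
    """Unwrap the phases of one sinusoidal track.

    phases: measured phase per frame, wrapped to (-pi, pi], in radians
    freqs:  frequency per frame, in radians per sample
    hop:    number of samples between consecutive frames
    """
    unwrapped = [phases[0]]
    for k in range(1, len(phases)):
        # Predict the phase advance from the average frequency of the two frames.
        predicted = unwrapped[-1] + 0.5 * (freqs[k - 1] + freqs[k]) * hop
        # Pick the multiple of 2*pi that brings the measurement closest to the prediction.
        m = round((predicted - phases[k]) / (2.0 * math.pi))
        unwrapped.append(phases[k] + 2.0 * math.pi * m)
    return unwrapped


# Toy track: constant frequency of 0.3 rad/sample, frames 100 samples apart.
hop = 100
freqs = [0.3] * 5
true_phase = [0.3 * hop * k for k in range(5)]
wrapped = [math.atan2(math.sin(p), math.cos(p)) for p in true_phase]
unwrapped = unwrap_track(wrapped, freqs, hop)
```

For a positive-frequency track the unwrapped values increase from frame to frame, which is what makes them suitable for differential or predictive coding.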
Abstract:
An encoder includes a segmentation unit for segmenting an audio or speech signal into at least one segment and a calculation unit for calculating sinusoidal code data in the form of frequency and amplitude data of a given extension from the segment such that the extension approximates the segment for a given criterion. The calculation of the sinusoidal code data $\theta_k^i$, $d_j^i$ and $e_j^i$ for the segment $x(n)$ is carried out according to the following extension $\hat{x}$: $\hat{x} = \sum_{i=1}^{L} \sum_{j=0}^{J-1} \left[ d_j^i f_j(n) \cos(\Theta_i(n)) + e_j^i f_j(n) \sin(\Theta_i(n)) \right]$ (Fig. 1).
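The expansion above can be evaluated directly once the basis functions and the phase functions are fixed. The abstract defines neither, so the sketch below makes two assumptions: the basis is polynomial, f_j(n) = n^j, and each phase is a polynomial in n whose coefficients are the θ_k^i values. The function names are hypothetical.

```python
import math


def phase(theta_i, n):
    """Assumed phase polynomial: Theta_i(n) = sum_k theta_i[k] * n**k."""
    return sum(t * n ** k for k, t in enumerate(theta_i))


def synthesize(theta, d, e, n):
    """Evaluate the expansion at sample index n.

    theta[i][k]: phase coefficients of component i (assumed polynomial phase)
    d[i][j], e[i][j]: cosine and sine amplitude coefficients of component i
    """
    total = 0.0
    for theta_i, d_i, e_i in zip(theta, d, e):
        ph = phase(theta_i, n)
        for j, (d_ji, e_ji) in enumerate(zip(d_i, e_i)):
            f_j = n ** j  # assumed basis function f_j(n) = n^j
            total += d_ji * f_j * math.cos(ph) + e_ji * f_j * math.sin(ph)
    return total


# One component (L = 1), constant amplitude (J = 1), frequency pi/2 rad/sample.
theta = [[0.0, math.pi / 2.0]]
d = [[1.0]]
e = [[0.0]]
segment = [synthesize(theta, d, e, n) for n in range(4)]  # samples of cos(pi * n / 2)
```

With J greater than 1 the amplitude of each component varies polynomially within the segment, which is what lets a single component follow an amplitude change.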
Abstract:
In a sinusoidal audio encoder it is known to use different time scales for analyzing different parts of the frequency spectrum. In prior art encoders, sub-band filtering is used to split the input signal into a number of sub-bands. When the input signal is split into sub-bands, a signal component at the boundary of two sub-bands can end up represented in both sub-band signals. This double representation of signal components can lead to several problems when coding these components. According to the present invention, it is proposed to use preventing means (46, 48, 58, 68; 88, 92, 96) to prevent signal components from having multiple representations.
Abstract:
Coding of an audio signal (x) represented by a respective set of sampled signal values (x(t)) for each of a plurality of sequential time segments is disclosed. The sampled signal values are analyzed to determine one or more sinusoidal components for each of the plurality of sequential segments. The sinusoidal components are linked across a plurality of sequential segments to provide sinusoidal tracks, where each track comprises a number of frames. An encoded signal (AS) is generated, including sinusoidal codes (Cs) comprising a representation level (r) for each frame or including sinusoidal codes (Cs) where some of these codes comprise a phase (φ), a frequency (ω) and a quantization table (Q) for a given frame when the given frame is designated as a random-access frame. The invention allows random access in a track while avoiding long adaptation of the quantization accuracy in a quantizer and/or the need for a large bit stream while still maintaining improved audio quality.
Abstract:
A multi-channel encoder (100) comprises a multi-channel linear predictive analyzer (105) for linear predictive coding of a multi-channel signal. A prediction controller (101) comprises a prediction parameter generator (301) which generates linear prediction coding parameter matrices for the multi-channel signal, which are then mapped to reflection matrices. The reflection matrices may specifically be normalized backward or forward reflection matrices. The reflection matrices are encoded by a reflection parameter encoder (305) and combined with other encoded data in a multiplexer (109) to generate encoded data for the multi-channel signal. The reflection parameter encoder (305) may specifically decompose the reflection matrices using an eigenvalue decomposition or a singular value decomposition, and the resulting data may be quantized for transmission. A decoder (200) receives the encoded data and obtains the prediction parameters by performing the inverse operation.
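The relationship between prediction parameters and reflection parameters is easiest to see in the single-channel case, where the matrices reduce to scalars. The sketch below is the standard scalar Levinson-Durbin recursion, shown only as a simplified analogue. It is not the multi-channel matrix recursion the abstract refers to, and the function name and test autocorrelation are illustrative assumptions.

```python
def levinson_durbin(r, order):
    """Scalar Levinson-Durbin recursion.

    r: autocorrelation values r[0..order]
    Returns (a, k, err) where a[1..order] are predictor coefficients for
    x[n] ~ sum_i a[i] * x[n - i], k holds the reflection coefficients,
    and err is the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    k = []
    err = r[0]
    for m in range(1, order + 1):
        if err == 0.0:
            k.append(0.0)  # signal already perfectly predicted
            continue
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k_m = acc / err
        new_a = a[:]
        new_a[m] = k_m
        for i in range(1, m):
            new_a[i] = a[i] - k_m * a[m - i]
        a = new_a
        err *= 1.0 - k_m * k_m
        k.append(k_m)
    return a, k, err


# Autocorrelation of a first-order autoregressive process with coefficient 0.5.
coeffs, reflection, error = levinson_durbin([1.0, 0.5, 0.25], 2)
```

Reflection coefficients of a stable predictor lie strictly between -1 and 1, which is one reason they are commonly preferred over raw prediction coefficients for quantization. The matrix case has forward and backward reflection matrices instead of a single scalar per stage.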
Abstract:
The present invention relates to a method of, and an arrangement for, encoding and decoding an audio signal. It further relates to a computer-readable medium comprising a data record indicative of an audio signal, and to a device for communicating an audio signal encoded according to the invention. The encoding method yields a double description of the signal through two encoding steps: a first, standard encoding and an additional second encoding. The second encoding gives a coarse description of the signal, such that a stochastic realization can be made and appropriate parts can be added to the decoded signal from the first decoding. The description produced by the second encoder that makes this stochastic realization possible requires a relatively low bit rate, whereas other double or multiple descriptions require a much higher bit rate.
Abstract:
An audio encoding device (100) comprises first encoding means (101, 111) for encoding transient signal components and/or sinusoidal signal components of an audio signal (x(n)) and producing a residual signal (z(n)), and second encoding means for encoding the residual signal. The second encoding means comprise filter means (122) for selecting at least two frequency bands of the residual signal. The selected frequency bands (LF, HF) of the residual signal (z(n)) are encoded by a first encoding unit (123) and a second encoding unit (124) respectively. The first encoding unit (123) may comprise a waveform encoder, such as a time-domain encoder, while the second encoding unit (124) may comprise a noise encoder.
Abstract:
A method of encoding a digital audio signal, wherein for each time segment the signal is spectrally flattened to obtain a spectrally flattened signal (r) and possibly spectral flattening parameters (LPP). The spectrally flattened signal is modelled by an excitation signal comprising a first partial excitation signal (px) conforming to an excitation signal generated by an RPE or CELP technique, and a second partial excitation signal (PEp) being a set of extra pulses with arbitrary positions and amplitudes. An audio bit stream comprising the first and second partial excitation signals is generated. The extra pulses can be added to the excitation signal at positions in time that correspond to the time of occurrence of the spike, or preferably at positions in time of an RPE time grid.