摘要:
A method, apparatus and computer program product are therefore provided according to an example embodiment of the present invention in order to perform categorical analysis and synthesis of a multichannel signal to synthesize binaural signals and extract, separate, and manipulate components within the audio scene of the multichannel signal that were captured through multichannel audio means. In the context of a method, a multichannel signal is received. The method may include computing the spectrum for the multichannel signal, determining tonality of bands within the spectrum, and generating a band structure for the spectrum. The method may also include performing spatial analysis of the bands, performing source filtering using the bands, performing synthesis on the filtered band components, and generating an output signal. A corresponding apparatus and a computer program product are also provided.
摘要:
A coder and decoder, and methods therein, are provided for coding and decoding of spectral peak positions in audio coding. According to a first aspect, an audio signal segment coding method is provided for coding of spectral peak positions. The method comprises determining which one out of two lossless spectral peak position coding schemes that requires the least number of bits to code the spectral peak positions of an audio signal segment; and selecting the spectral peak position coding scheme that requires the least number of bits to code the spectral peak positions of the audio signal segment. A first one of the two lossless spectral peak position coding schemes is suitable for periodic or semi-periodic spectral peak position distributions; and a second one of two lossless spectral peak position coding schemes is suitable for sparse spectral peak position distributions.
摘要:
The present invention relates to a method for processing an audio signal, comprising: a step of performing a frequency conversion process on an audio signal to obtain a plurality of frequency transform coefficients; a step of selecting either a general mode or a non-general mode, on the basis of a pulse ratio, for the frequency transform coefficients having a high frequency band from among the plurality of frequency transform coefficients; and a step of performing, if the non-general mode is selected, the following steps: extracting a predetermined number of pulses from the frequency transform coefficients having the high frequency band, and generating pulse information; generating an original noise signal from the frequency transform coefficients having the high frequency band, excluding the pulses; generating a reference noise signal using the frequency transform coefficient having a low frequency band from among the plurality of frequency transform coefficients; and generating noise position information and noise energy information using the original noise signal and the reference noise signal.
摘要:
Embodiments of the present invention relate to adaptive audio content generation. Specifically, a method for generating adaptive audio content is provided. The method comprises extracting at least one audio object from channel-based source audio content, and generating the adaptive audio content at least partially based on the at least one audio object. Corresponding system and computer program product are also disclosed.
摘要:
In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
摘要:
In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
摘要:
There is provided a method of encoding an input speech signal. The method comprises identifying a fixed codebook vector from a fixed codebook; identifying an adaptive codebook vector from a adaptive codebook; calculating an adaptive codebook gain; reducing the adaptive codebook gain by an amount; optimally selecting a fixed codebook gain based on the adaptive codebook gain while both the fixed codebook vector and the adaptive codebook vector remain fixed; and converting the input speech signal into an encoded speech using the fixed codebook gain, the adaptive codebook gain, the fixed codebook vector and the adaptive codebook vector. The amount of reducing the adaptive codebook gain may be varied.
摘要:
A method includes obtaining a signal type of an audio signal and a low frequency band signal of the audio signal, where the audio signal includes the low frequency band signal and a high frequency band signal; obtaining a frequency envelope of the high frequency band signal according to the signal type; predicting an excitation signal of the high frequency band signal according to the low frequency band signal; and restoring the high frequency band signal according to the frequency envelope of the high frequency band signal and the excitation signal of the high frequency band signal. By using the technical solutions of the embodiments of the present invention, an error existing between a high frequency band signal obtained by prediction and an actual high frequency band signal can be effectively reduced, and an accuracy rate of the predicted high frequency band signal can be increased.
摘要:
In general, techniques are described for closed loop quantization of HOA coefficients that provide a three-dimensional representation of the sound field. An audio encoding device may perform closed loop quantization of an audio object based at least in part on a result of performing quantization of directional information associated with the audio object. An audio decoding device may obtain an audio object that has been closed loop quantized based at least in part on a result of performing quantization of directional information associated with the audio object, and may dequantize the audio object.
摘要:
In accordance with one aspect of the invention, a selector supports the selection of a first encoding scheme or the second encoding scheme based upon the detection or absence of the triggering characteristic in the interval of the input speech signal. The first encoding scheme has a pitch pre-processing procedure for processing the input speech signal to form a revised speech signal biased toward an ideal voiced and stationary characteristic. The pre-processing procedure allows the encoder to fully capture the benefits of a bandwidth-efficient, long-term predictive procedure for a greater amount of speech components of an input speech signal than would otherwise be possible. In accordance with another aspect of the invention, the second encoding scheme entails a long-term prediction mode for encoding the pitch on a sub-frame by sub-frame basis. The long-term prediction mode is tailored to where the generally periodic component of the speech is generally not stationary or less than completely periodic and requires greater frequency of updates from the adaptive codebook to achieve a desired perceptual quality of the reproduced speech under a long-term predictive procedure.