Abstract:
An encoder (100) for encoding a current frame of an audio signal depending on one or more previous frames of the audio signal according to an embodiment is provided. The one or more previous frames precede the current frame, wherein each of the current frame and the one or more previous frames comprises one or more harmonic components of the audio signal, wherein each of the current frame and the one or more previous frames comprises a plurality of spectral coefficients in a frequency domain or in a transform domain. To generate an encoding of the current frame, the encoder (100) is to determine an estimation of two harmonic parameters for each of the one or more harmonic components of a most previous frame of the one or more previous frames. Moreover, the encoder (100) is to determine the estimation of the two harmonic parameters for each of the one or more harmonic components of the most previous frame using a first group of three or more of the plurality of spectral coefficients of each of the one or more previous frames of the audio signal.
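The abstract's idea of estimating two harmonic parameters from a group of three or more spectral coefficients can be illustrated with a standard technique: parabolic interpolation over three neighboring DFT log-magnitudes to recover a fractional-bin frequency and an amplitude. This is only a stand-in sketch for the estimator the abstract describes; the function name, windowing, and interpolation method are assumptions, not the patent's scheme.

```python
import numpy as np

def estimate_harmonic(spectrum, k):
    """Estimate two harmonic parameters (fractional-bin frequency and
    amplitude) of a spectral peak near bin k from three neighboring
    spectral coefficients, via parabolic interpolation of log-magnitudes.
    Illustrative sketch only; not the patent's estimator."""
    a = np.log(np.abs(spectrum[k - 1]) + 1e-12)
    b = np.log(np.abs(spectrum[k]) + 1e-12)
    c = np.log(np.abs(spectrum[k + 1]) + 1e-12)
    # Vertex of the parabola through the three log-magnitude points.
    delta = 0.5 * (a - c) / (a - 2.0 * b + c)   # peak offset in bins, in (-0.5, 0.5)
    amplitude = np.exp(b - 0.25 * (a - c) * delta)
    return k + delta, amplitude
```

With a Hann-windowed sinusoid the interpolated frequency lands within a small fraction of a bin of the true value, which is why three coefficients per harmonic suffice for this kind of estimate.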
Abstract:
A method, system and computer program for encoding speech according to a source-filter model. The method comprises deriving a spectral envelope signal representative of a modelled filter and a first remaining signal representative of a modelled source signal, and deriving a second remaining signal from the first remaining signal by, at intervals during the encoding: exploiting a correlation between approximately periodic portions in the first remaining signal to generate a predicted version of a later portion from a stored version of an earlier portion, and using the predicted version of the later portion to remove an effect of said periodicity from the first remaining signal. The method further comprises, once every number of intervals, transforming the stored version of the earlier portion of the first remaining signal prior to generating the predicted version of the respective later portion.
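The central long-term-prediction step, generating a predicted version of a later portion from a stored earlier portion one pitch lag back and subtracting it to remove the periodicity, can be sketched as a one-tap predictor. The lag and gain are assumed to come from a separate pitch search; the function name is illustrative and the abstract's periodic transform of the stored portion is not modeled here.

```python
import numpy as np

def ltp_remove_periodicity(residual, lag, gain):
    """One-tap long-term prediction: each sample is predicted from the
    sample one pitch lag earlier, and the prediction is subtracted so the
    quasi-periodic structure is removed from the residual signal."""
    out = residual.astype(float).copy()
    for i in range(lag, len(residual)):
        out[i] = residual[i] - gain * residual[i - lag]
    return out
```

For a perfectly periodic residual and a unit gain, everything after the first pitch cycle cancels exactly, which is the correlation the method exploits.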
Abstract:
An audio transmission system comprises: a decoder for converting a frame-organized bitstream into an audio output representation; and a bad frame processing means arranged for detecting bad or disturbed frames in the bitstream. The audio transmission system further comprises a pitch period estimator coupled to said decoder audio output for estimating the pitch period of the audio representation, and the pitch period estimator is further coupled to the bad frame processing means for replacing the audio output during a detected bad frame by a pitch-period-determined representation of said audio output. As a consequence, no smoothing is necessary at the edges of neighboring frames.
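The two operations in this abstract, estimating a pitch period from the decoder output and substituting a pitch-period-determined signal during a bad frame, can be sketched with a normalized-autocorrelation search followed by pitch-cycle repetition. This is a common approach to pitch-based concealment, used here purely as an illustration; function names and lag ranges are assumptions.

```python
import numpy as np

def estimate_pitch_period(signal, min_lag=32, max_lag=400):
    """Estimate the pitch period (in samples) of recent decoder output by
    maximizing the normalized autocorrelation over a range of lags."""
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, min(max_lag, len(signal) // 2) + 1):
        a, b = signal[lag:], signal[:-lag]
        score = np.dot(a, b) / (np.sqrt(np.dot(a, a) * np.dot(b, b)) + 1e-12)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def conceal_bad_frame(history, frame_len, period):
    """Replace a bad frame by repeating the last pitch cycle of good output."""
    cycle = history[-period:]
    reps = int(np.ceil(frame_len / period))
    return np.tile(cycle, reps)[:frame_len]
```

Because the substituted frame is built from whole pitch cycles of the preceding output, it joins the previous frame phase-continuously, which is why no smoothing is needed at the frame edges.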
Abstract:
An apparatus for decoding an audio signal is provided. The apparatus comprises a receiving interface (110), wherein the receiving interface (110) is configured to receive a first frame comprising a first audio signal portion of the audio signal, and wherein the receiving interface (110) is configured to receive a second frame comprising a second audio signal portion of the audio signal. Moreover, the apparatus comprises a noise level tracing unit (130), wherein the noise level tracing unit (130) is configured to determine noise level information depending on at least one of the first audio signal portion and the second audio signal portion, wherein the noise level information is represented in a tracing domain. Furthermore, the apparatus comprises a first reconstruction unit (140) for reconstructing, in a first reconstruction domain, a third audio signal portion of the audio signal depending on the noise level information, if a third frame of the plurality of frames is not received by the receiving interface (110) or if said third frame is received by the receiving interface (110) but is corrupted, wherein the first reconstruction domain is different from or equal to the tracing domain. Moreover, the apparatus comprises a transform unit (121) for transforming the noise level information from the tracing domain to a second reconstruction domain, if a fourth frame of the plurality of frames is not received by the receiving interface (110) or if said fourth frame is received by the receiving interface (110) but is corrupted, wherein the second reconstruction domain is different from the tracing domain, and wherein the second reconstruction domain is different from the first reconstruction domain. 
Furthermore, the apparatus comprises a second reconstruction unit (141) for reconstructing, in the second reconstruction domain, a fourth audio signal portion of the audio signal depending on the noise level information being represented in the second reconstruction domain, if said fourth frame of the plurality of frames is not received by the receiving interface (110) or if said fourth frame is received by the receiving interface (110) but is corrupted.
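The noise level tracing that drives both reconstruction units can be illustrated with a minimum-statistics-style tracker: smooth the per-frame energy and keep its running minimum as the noise level estimate. This is a generic sketch of noise level tracking, not the patent's tracing-domain method; class and parameter names are assumptions.

```python
import numpy as np

class NoiseLevelTracker:
    """Minimum-statistics-style sketch: track a smoothed frame energy and
    keep its running minimum as the background noise level estimate."""

    def __init__(self, alpha=0.9):
        self.alpha = alpha        # smoothing factor for frame energy
        self.smoothed = None
        self.noise_level = None

    def update(self, frame):
        energy = float(np.mean(frame ** 2))
        if self.smoothed is None:
            self.smoothed = energy
            self.noise_level = energy
        else:
            self.smoothed = self.alpha * self.smoothed + (1 - self.alpha) * energy
            self.noise_level = min(self.noise_level, self.smoothed)
        return self.noise_level
```

Tracking the minimum keeps the estimate anchored to background noise even when loud speech frames arrive, so a concealed frame can be faded toward a plausible noise floor.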
Abstract:
A speech communication unit (100) comprises a speech encoder (134) capable of representing an input speech signal. The speech encoder (134) comprises long-term prediction (LTP) logic having memory (215) operably coupled to quantization logic, wherein the quantization logic is arranged to quantize a memory state of the LTP logic. On the decoder side, a first speech decoder (260) receives the speech encoded bitstream and has a conventional long term predictor (LTP) memory element (265) that is driven by one or more previous codebook decisions made by the speech encoder. A second decoder (275) also receives the speech encoded bitstream and has an LTP memory element (280) that is updated by quantized values of a memory state of an LTP logic of the speech encoder.
Abstract:
The invention relates to a method for calculating the amplification factor, which co-determines the volume, for a speech signal transmitted in encoded form. Said speech signal is divided into short temporal signal segments. The individual signal segments are encoded and transmitted separately from each other, and the amplification factor for each signal segment is calculated, transmitted, and used by the decoder to reconstruct the signal. The amplification factor g_opt is determined by minimizing an error value E(g_opt) whose terms are weighted by the factors (1-a) and a.
Abstract:
Techniques for acoustic echo cancellation are described herein. In an example embodiment, a system comprises a speaker, a microphone array with multiple microphones, a beamformer (BF) logic and an acoustic echo canceller (AEC) logic. The speaker is configured to receive a reference signal. The BF logic is configured to receive audio signals from the multiple microphones and to generate a beamformed signal. The AEC logic is configured to receive the beamformed signal and the reference signal. The AEC logic is also configured to compute a vector of bias coefficients multiple times per time frame, to compute a background filter coefficient based on the vector of bias coefficients, to apply a background filter to the reference signal and the beamformed signal based on the background filter coefficient, to generate a background cancellation signal, and to generate an output signal based at least on the background cancellation signal.
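The cancellation path described here, filtering the reference signal and subtracting the result from the beamformed signal, can be sketched with a standard NLMS adaptive filter. The patent's bias-coefficient background/foreground filter structure is not reproduced; NLMS is used as a well-known stand-in, and all names and parameters below are assumptions.

```python
import numpy as np

def nlms_echo_canceller(reference, mic, num_taps=64, mu=0.5):
    """Cancel echo by adapting an FIR estimate of the echo path with NLMS:
    filter the reference (loudspeaker) signal, subtract the estimated echo
    from the microphone/beamformed signal, and update the taps from the
    cancellation error."""
    w = np.zeros(num_taps)
    out = np.zeros(len(mic))
    for n in range(num_taps, len(mic)):
        x = reference[n - num_taps + 1:n + 1][::-1]  # newest sample first
        echo_hat = w @ x                              # estimated echo
        e = mic[n] - echo_hat                         # cancellation output
        w += mu * e * x / (x @ x + 1e-8)              # normalized LMS update
        out[n] = e
    return out
```

With a short, time-invariant echo path inside the filter span, the residual echo in the output decays by orders of magnitude after convergence.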
Abstract:
An apparatus for selecting one of a first encoding algorithm having a first characteristic and a second encoding algorithm having a second characteristic for encoding a portion of an audio signal to obtain an encoded version of the portion of the audio signal, comprises a filter configured to receive the audio signal, to reduce the amplitude of harmonics in the audio signal and to output a filtered version of the audio signal. A first estimator is provided for using the filtered version of the audio signal in estimating an SNR or a segmented SNR of the portion of the audio signal as a first quality measure for the portion of the audio signal, which is associated with the first encoding algorithm, without actually encoding and decoding the portion of the audio signal using the first encoding algorithm. A second estimator is provided for estimating an SNR or a segmented SNR as a second quality measure for the portion of the audio signal, which is associated with the second encoding algorithm, without actually encoding and decoding the portion of the audio signal using the second encoding algorithm. The apparatus comprises a controller for selecting the first encoding algorithm or the second encoding algorithm based on a comparison between the first quality measure and the second quality measure.
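The segmented SNR quality measure that both estimators approximate has a standard definition: the per-segment SNR in dB, averaged over segments. The sketch below states that definition; the segment length and the small regularization constant are assumed parameters, not values from the patent.

```python
import numpy as np

def segmented_snr(clean, processed, seg_len=160, eps=1e-12):
    """Segmented SNR: the mean over segments of the per-segment SNR (dB)
    between a reference signal and its processed/coded version."""
    snrs = []
    for start in range(0, len(clean) - seg_len + 1, seg_len):
        s = clean[start:start + seg_len]
        e = s - processed[start:start + seg_len]
        snrs.append(10 * np.log10((np.dot(s, s) + eps) / (np.dot(e, e) + eps)))
    return float(np.mean(snrs))
```

Averaging per-segment SNRs rather than computing one global SNR weights quiet passages as heavily as loud ones, which is why segmented SNR is a common quality measure for codec comparison.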
Abstract:
A vector codebook (1094) is created in which representative samples of the vectors to be quantized are stored. Each vector is made up of three elements: an AC gain, a value corresponding to the logarithm of an SC gain, and an adjustment coefficient for the SC prediction coefficient. Coefficients for predictive coding are stored in a prediction coefficient storage section (1095). These are MA prediction coefficients, and two kinds of coefficients, for AC and for SC, are stored for each order of prediction. A parameter calculating section (1091) calculates the parameters necessary for distance calculation from the perceptually weighted input speech, the adaptive excitation subjected to perceptually weighted LPC synthesis, the stochastic excitation subjected to perceptually weighted LPC synthesis, the decoded vectors (AC, SC, adjustment coefficient) stored in a decoded vector storage section (1096), and the prediction coefficients (AC, SC) stored in the prediction coefficient storage section (1095).
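The predictive gain quantization described here, an MA prediction from past decoded values combined with a codebook search, can be sketched in simplified scalar form. The structure below (prediction from past quantized values, then nearest-codeword search on the residual) is only an illustration of the principle; the patent's three-element vectors, adjustment coefficients, and distance measure are not reproduced, and all names are assumptions.

```python
import numpy as np

def quantize_gain(target_log_gain, codebook, past_values, ma_coeffs):
    """Predictive gain VQ sketch: predict the (log-domain) gain as an MA
    combination of past quantized values, then pick the codebook entry
    whose sum with the prediction best matches the target."""
    prediction = float(np.dot(ma_coeffs, past_values))
    idx = int(np.argmin((codebook + prediction - target_log_gain) ** 2))
    decoded = float(codebook[idx] + prediction)
    return idx, decoded
```

Coding the prediction residual instead of the raw gain shrinks the range the codebook must cover, which is the usual motivation for MA-predictive gain quantization in CELP-style coders.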