摘要:
A system and method is described that improves the intelligibility of a far-end telephone speech signal to a user of a telephony device in the presence of near-end background noise. As described herein, the system and method improves the intelligibility of the far-end telephone speech signal in a manner that does not require user input and that minimizes the distortion of the far-end telephone speech signal. The system is integrated with an acoustic echo canceller and shares information therewith.
摘要:
A technique is described for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system. In accordance with the technique, a first synthesized sub-band audio signal is synthesized, wherein synthesizing the first synthesized sub-band audio signal comprises performing waveform extrapolation based on a stored first sub-band decoded audio signal. A second synthesized sub-band audio signal is also synthesized, wherein synthesizing the second synthesized sub-band audio signal comprises performing waveform extrapolation based on the stored second sub-band decoded audio signal. The first synthesized sub-band audio signal and the second synthesized sub-band audio signal are combined to generate a synthesized full-band output audio signal corresponding to a lost frame.
摘要:
A method for adaptive long-term filtering of an audio signal, such as a decoded speech signal. The method includes measuring a smoothed periodicity of an audio signal segment, such as an audio frame, wherein the smoothed periodicity is measured by low-pass filtering an instantaneous periodicity of the audio signal segment. The periodicity of the audio signal segment is then increased in a manner that depends upon whether the smoothed periodicity is less than a predetermined threshold. By utilizing a smoothed periodicity measurement in this fashion, more accurate control of the post-filter is provided as compared to conventional solutions. Additionally, the method includes deriving filters by interpolating between filter responses of adjacent audio signal segments to minimize distortion at segment boundaries.
摘要:
A technique is described herein for updating a state of a decoder configured to decode a series of frames representing an encoded audio signal. In accordance with the technique, an output audio signal associated with a lost frame in the series of frames is synthesized. The decoder state is set to align with the synthesized output audio signal at a frame boundary. An extrapolated signal is generated based on the synthesized output audio signal. A time lag is calculated between the extrapolated signal and a decoded audio signal associated with a first received frame after the lost frame in the series of frames, wherein the time lag represents a phase difference between the extrapolated signal and the decoded audio signal. The decoder state is then reset based on the time lag.
摘要:
A technique is described herein for reducing audible artifacts in an audio output signal generated by decoding a received frame in a series of frames representing an encoded audio signal in a predictive coding system. In accordance with the technique, it is determined if the received frame is one of a predefined number of received frames that follow a lost frame in the series of the frames. Responsive to determining that the received frame is one of the predefined number of received frames, at least one parameter or signal associated with the decoding of the received frame is altered from a state associated with normal decoding. The received frame is then decoded in accordance with the at least one parameter or signal to generate a decoded audio signal. The audio output signal is then generated based on the decoded audio signal.
摘要:
A technique for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system is provided. In accordance with the technique, one or more received frames in the series of frames are decoded to generate a full-band output audio signal, wherein the full-band output audio signal comprises a combination of at least a first sub-band decoded audio signal and a second sub-band decoded audio signal. The full-band output audio signal corresponding to the one or more received frames is stored. Then, a full-band output audio signal corresponding to the lost frame is synthesized, wherein synthesizing the full-band output audio signal corresponding to the lost frame comprises performing waveform extrapolation based on the stored full-band output audio signal corresponding to the one or more received frames.
摘要:
Systems and methods are described for performing packet loss concealment using an extrapolation of an excitation waveform in a sub-band predictive speech coder, such as an ITU-T Recommendation G.722 wideband speech coder. The systems and methods are useful for concealing the quality-degrading effects of packet loss in a sub-band predictive coder and address some sub-band architectural issues when applying excitation extrapolation techniques to such sub-band predictive coders.
摘要:
In a Noise Feedback Coding (NFC) system operable in a ZERO-STATE condition and a ZERO-INPUT condition, the NFC system including at least one filter having a filter memory, a method of updating the filter memory. The method comprises: (a) producing a ZERO-STATE contribution to the filter memory when the NFC system is in the ZERO-STATE condition; (b) producing a ZERO-INPUT contribution to the filter memory when the NFC system is in the ZERO-INPUT condition; and (c) updating the filter memory as a function of both the ZERO-STATE contribution and the ZERO-INPUT contribution.
摘要:
In a Noise Feedback Coding (NFC) system having a corresponding ZERO-STATE filter structure, the first ZERO-STATE filter structure including multiple filters, a method of producing a ZERO-STATE response error vector. The method includes: (a) transforming the first ZERO-STATE filter structure to a second ZERO-STATE filter structure including only an all-zero filter, the all-zero filter having a filter response substantially equivalent to a filter response of the ZERO-STATE filter structure including multiple filters; and (b) filtering a VQ codevector with the all-zero filter to produce the ZERO-STATE response error vector corresponding to the VQ codevector.
摘要:
Techniques are described herein that suppress noise using multiple sensors (e.g., microphones) of a communication device. Noise modeling (e.g., estimation of noise basis vectors and noise weighting vectors) is performed with respect to a noise signal during operation of a communication device to provide a noise model. The noise model includes noise basis vectors and noise coefficients that represent noise provided by audio sources other than a user of the communication device. Speech modeling (e.g., estimation of speech basis vectors and speech weighting) is performed to provide a speech model. The speech model includes speech basis vectors and speech coefficients that represent speech of the user. A noisy speech signal is processed using the noise basis vectors, the noise coefficients, the speech basis vectors, and the speech coefficients to provide a clean speech signal.