Abstract:
A loudness enhancement system and method is described that increases the loudness of an audio signal being played back by an audio device that places limits on the dynamic range of the audio signal. In an embodiment, the loudness enhancement system and method compresses the audio signal to an adaptively-determined compression limit that is greater than or equal to a maximum desired output level and then applies an adaptively-determined degree of soft clipping to the compressed audio signal. The compression limit and degree of soft clipping may be determined based on an overload measure that is calculated for successive portions of the audio signal. The loudness enhancement system and method advantageously operates in a manner that generates less distortion than the method of simply over-driving the audio signal such that hard-clipping occurs.
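The compress-then-soft-clip idea can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the overload measure is taken to be the fraction of boosted samples above the maximum output level, the compression is a crude per-frame peak scaling, and the names `overload_measure`, `soft_clip`, and `enhance_frame` are hypothetical.

```python
import math

def overload_measure(frame, max_level):
    """Assumed measure: fraction of samples whose magnitude exceeds max_level."""
    if not frame:
        return 0.0
    return sum(1 for x in frame if abs(x) > max_level) / len(frame)

def soft_clip(x, max_level, knee):
    """Pass samples below the knee unchanged; bend the rest smoothly toward max_level."""
    mag = abs(x)
    if mag <= knee:
        return x
    headroom = max_level - knee
    if headroom <= 0.0:
        # Knee sits at the output limit, so this degenerates to a hard clip.
        return math.copysign(max_level, x)
    return math.copysign(knee + headroom * math.tanh((mag - knee) / headroom), x)

def enhance_frame(frame, gain, max_level):
    """Boost one frame, compress it to an adaptive limit, then soft-clip it."""
    boosted = [gain * x for x in frame]
    degree = overload_measure(boosted, max_level)  # value in [0, 1]
    # Assumed mapping: more overload raises the compression limit above
    # max_level and lowers the knee (a stronger degree of soft clipping).
    limit = max_level * (1.0 + degree)
    peak = max((abs(x) for x in boosted), default=0.0)
    scale = limit / peak if peak > limit else 1.0
    compressed = [scale * x for x in boosted]
    knee = max_level * (1.0 - 0.5 * degree)
    return [soft_clip(x, max_level, knee) for x in compressed]
```

With no overload the frame passes through unchanged; as more samples exceed the output level, the limit rises and the knee drops, so peaks are rounded off rather than hard-clipped.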
Abstract:
A speech intelligibility enhancement (SIE) system and method is described that improves the intelligibility of a speech signal to be played back by an audio device when the audio device is located in an environment with loud acoustic background noise. In an embodiment, the audio device comprises a near-end telephony terminal and the speech signal comprises a speech signal received over a communication network from a far-end telephony terminal for playback at the near-end telephony terminal.
Abstract:
A method is described for performing packet loss concealment (PLC) and/or frame erasure concealment (FEC) in a speech decoder of a voice communication system. In accordance with the method, if a segment of an encoded speech signal is determined to be bad, an excitation signal is derived by scaling a random sequence of samples, and long-term and short-term predictive parameters are derived based on parameters associated with a previously-decoded segment. The excitation signal is then filtered by a long-term synthesis filter and a short-term synthesis filter under the control of the respective long-term and short-term predictive parameters. If the number of consecutively-received bad segments exceeds a predetermined threshold, the decoded speech signal is gradually reduced.
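The processing chain for a bad segment can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: a single-tap long-term filter, a direct-form short-term filter, a linear 25%-per-frame attenuation ramp, and the name `conceal_bad_frame` with all its parameters are hypothetical. The filter memories and parameters are assumed to come from the last good segment, with `len(lt_mem) >= pitch_lag` and `len(st_mem) >= len(lpc)`.

```python
import math
import random

def conceal_bad_frame(lt_mem, st_mem, frame_len, pitch_lag, pitch_tap, lpc,
                      target_rms, bad_count, mute_after=3, seed=0):
    """Synthesize one replacement frame from scaled noise and reused parameters."""
    # 1. Excitation: a random sequence scaled to the RMS of the last good frame.
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(frame_len)]
    rms = math.sqrt(sum(v * v for v in noise) / frame_len)
    excitation = [target_rms * v / rms for v in noise]

    # 2. Long-term synthesis: u[n] = e[n] + pitch_tap * u[n - pitch_lag].
    lt = list(lt_mem)
    for e in excitation:
        lt.append(e + pitch_tap * lt[-pitch_lag])
    lt_out = lt[len(lt_mem):]

    # 3. Short-term synthesis: s[n] = u[n] + sum_k lpc[k] * s[n - 1 - k].
    st = list(st_mem)
    for u in lt_out:
        st.append(u + sum(a * st[-1 - k] for k, a in enumerate(lpc)))
    speech = st[len(st_mem):]

    # 4. After mute_after consecutive bad frames, ramp the gain down linearly
    #    by 25% per additional bad frame (assumed schedule).
    extra = bad_count - mute_after
    if extra > 0:
        start = max(0.0, 1.0 - 0.25 * (extra - 1))
        end = max(0.0, 1.0 - 0.25 * extra)
        speech = [s * (start + (end - start) * n / frame_len)
                  for n, s in enumerate(speech)]
    return speech
```

A real decoder would also carry the updated filter memories forward to the next frame; that bookkeeping is omitted here to keep the sketch short.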
Abstract:
Multi-channel noise suppression systems and methods are described that omit the traditional delay-and-sum fixed beamformer in devices that include a primary speech microphone and at least one noise reference microphone with the desired speech being in the near-field of the device. The multi-channel noise suppression systems and methods use a blocking matrix (BM) to remove desired speech in the input speech signal received by the noise reference microphone to get a “cleaner” background noise component. Then, an adaptive noise canceler (ANC) is used to remove the background noise in the input speech signal received by the primary speech microphone based on the “cleaner” background noise component to achieve noise suppression. The filters implemented by the BM and ANC are derived using closed-form solutions that require calculation of time-varying statistics of complex frequency domain signals in the noise suppression system.
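A rough sketch of the two-stage structure for a single frequency bin is given below. It assumes a one-tap complex filter per bin and replaces the time-varying statistics with plain sums over a handful of frames; the blocking-matrix filter is assumed to be estimated on speech-dominated frames and the canceler filter on noise-dominated frames. The names `closed_form_gain` and `suppress_bin` are hypothetical.

```python
def closed_form_gain(inputs, targets):
    """Least-squares one-tap complex filter h minimizing sum |target - h * input|^2.

    Closed form: h = sum(target * conj(input)) / sum(|input|^2).
    """
    cross = sum(t * x.conjugate() for x, t in zip(inputs, targets))
    power = sum(abs(x) ** 2 for x in inputs)
    return cross / power if power > 0.0 else 0j

def suppress_bin(primary, reference, h_bm, w_anc):
    """Apply the blocking matrix, then the adaptive noise canceler, to one bin."""
    # Blocking matrix: subtract the speech predicted from the primary channel,
    # leaving a cleaner noise estimate in the reference channel.
    noise_ref = reference - h_bm * primary
    # Noise canceler: subtract the noise predicted from that cleaner reference.
    return primary - w_anc * noise_ref
```

In this sketch `h_bm` would be obtained as `closed_form_gain(primary_frames, reference_frames)` over speech-dominated frames, and `w_anc` as `closed_form_gain(noise_ref_frames, primary_frames)` over noise-dominated frames.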
Abstract:
A technique is described for concealing the effect of a lost frame in a series of frames representing an encoded audio signal in a sub-band predictive coding system. In accordance with the technique, a first synthesized sub-band audio signal is synthesized, wherein synthesizing the first synthesized sub-band audio signal comprises performing waveform extrapolation based on a stored first sub-band decoded audio signal. A second synthesized sub-band audio signal is also synthesized, wherein synthesizing the second synthesized sub-band audio signal comprises performing waveform extrapolation based on a stored second sub-band decoded audio signal. The first synthesized sub-band audio signal and the second synthesized sub-band audio signal are combined to generate a synthesized full-band output audio signal corresponding to a lost frame.
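The per-band extrapolation and recombination can be sketched as below. The assumptions are substantial: periodic repetition stands in for the waveform extrapolation, a correlation search picks the repetition period, and a two-tap sum/difference step stands in for a real synthesis filter bank. The function names and the toy default lags are hypothetical, and each history is assumed to hold at least `window + max_lag` samples.

```python
import math

def estimate_period(history, min_lag, max_lag, window):
    """Pick the lag whose past window best matches the most recent window."""
    recent = history[-window:]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        past = history[-window - lag:-lag]
        energy = sum(p * p for p in past)
        if energy == 0:
            continue
        corr = sum(r * p for r, p in zip(recent, past))
        score = corr / math.sqrt(energy)
        if score > best_score:  # strict: ties keep the shortest lag
            best_lag, best_score = lag, score
    return best_lag

def extrapolate(history, period, count):
    """Repeat the last `period` samples of history to produce `count` new samples."""
    return [history[-period + (i % period)] for i in range(count)]

def conceal_lost_frame(low_history, high_history, sub_len,
                       min_lag=2, max_lag=10, window=10):
    """Extrapolate each sub-band separately, then merge into a full-band frame."""
    bands = []
    for hist in (low_history, high_history):
        period = estimate_period(hist, min_lag, max_lag, window)
        bands.append(extrapolate(hist, period, sub_len))
    low, high = bands
    full = []
    for lo, hi in zip(low, high):
        # Stand-in for the synthesis filter bank: sum/difference interleaving.
        full.extend((lo + hi, lo - hi))
    return full
```

Each sub-band contributes `sub_len` samples, so the full-band frame has `2 * sub_len` samples, matching the two-to-one decimation of a two-band split.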
Abstract:
Typical communication systems operate with a single channel decoder and must therefore settle for that decoder's performance regardless of the conditions of the communications channel. The present invention uses a hybrid channel decoder comprising multiple channel decoders, each configured to optimize the quality of the reconstructed signal for different channel conditions. The desired decoder can therefore be selected as conditions of the communications channel, or the data signal, change over time, so as to optimize the reconstructed data signal. In embodiments, the data signal is a speech signal.
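The selection logic can be pictured with a toy example. The abstract does not name a channel code or a channel-quality estimate, so this sketch assumes a 3x repetition code, two hypothetical decoders for it, and the fraction of disagreeing bit groups as a proxy for channel condition. Input length is assumed to be a multiple of three.

```python
def decode_first_copy(bits):
    """Assumed clean-channel decoder: trust the first copy of each bit."""
    return bits[0::3]

def decode_majority(bits):
    """Assumed noisy-channel decoder: majority vote over each group of three."""
    return [1 if sum(bits[i:i + 3]) >= 2 else 0
            for i in range(0, len(bits) - 2, 3)]

def estimate_disagreement(bits):
    """Fraction of 3-bit groups whose copies disagree (proxy for channel errors)."""
    groups = [bits[i:i + 3] for i in range(0, len(bits) - 2, 3)]
    if not groups:
        return 0.0
    return sum(1 for g in groups if len(set(g)) > 1) / len(groups)

def hybrid_decode(bits, threshold=0.0):
    """Select a decoder per block according to the estimated channel condition."""
    if estimate_disagreement(bits) > threshold:
        return decode_majority(bits)
    return decode_first_copy(bits)
```

Because the estimate is recomputed for every block, the choice of decoder follows the channel as its condition changes over time.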
Abstract:
A speech intelligibility enhancement (SIE) system and method is described that improves the intelligibility of a speech signal to be played back by an audio device when the audio device is located in an environment with loud acoustic background noise. In an embodiment, the audio device comprises a near-end telephony terminal and the speech signal comprises a speech signal received over a communication network from a far-end telephony terminal for playback at the near-end telephony terminal.
Abstract:
Systems and methods are described for performing packet loss concealment using an extrapolation of an excitation waveform in a sub-band predictive speech coder, such as an ITU-T Recommendation G.722 wideband speech coder. The systems and methods are useful for concealing the quality-degrading effects of packet loss in a sub-band predictive coder and address some sub-band architectural issues when applying excitation extrapolation techniques to such sub-band predictive coders.
Abstract:
A filter controller processes a decoded speech (DS) signal. The DS signal has a spectral envelope including a first plurality of formant peaks having different respective amplitudes. The controller produces, from the DS signal, a spectrally-flattened DS signal that is a time-domain signal. The spectrally-flattened time-domain DS signal has a spectral envelope including a second plurality of formant peaks. Each of the second plurality of formant peaks approximately coincides in frequency with a respective one of the first plurality of formant peaks. Also, the second plurality of formant peaks have approximately equal respective amplitudes. Next, the controller derives, from the spectrally-flattened time-domain DS signal, a set of filter coefficients representative of a filter response that is to be used to filter the DS signal.
Abstract:
An audio decoding system performs frame loss concealment (FLC) when portions of a bit stream representing an audio signal are lost within the context of a digital communication system. The audio decoding system employs two different FLC methods: one designed to perform well for music, and the other designed to perform well for speech. When a frame is deemed lost, the audio decoding system analyzes a previously-decoded audio signal corresponding to previously-decoded frames of an audio bit-stream. Based on the results of the analysis, the lost frame is classified as either speech or music. Using this classification, other signal analysis, and knowledge of the employed FLC methods, the audio decoding system selects the appropriate FLC method which then performs FLC on the lost frame.
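The classify-then-dispatch flow can be sketched as follows. The abstract does not say which signal features drive the classification or how the two FLC methods work, so this sketch assumes a single feature (relative variation of short-term energy, taken to be higher for speech) and two deliberately trivial placeholder concealment routines. All names, the block size, and the threshold are hypothetical.

```python
import math

def energy_variation(signal, block):
    """Standard deviation of block energies divided by their mean."""
    energies = [sum(x * x for x in signal[i:i + block]) / block
                for i in range(0, len(signal) - block + 1, block)]
    if not energies:
        return 0.0
    mean = sum(energies) / len(energies)
    if mean == 0:
        return 0.0
    var = sum((e - mean) ** 2 for e in energies) / len(energies)
    return math.sqrt(var) / mean

def classify(history, block=4, threshold=0.5):
    """Assumed rule: strongly fluctuating energy suggests speech, else music."""
    return "speech" if energy_variation(history, block) > threshold else "music"

def flc_speech(history, count, period):
    """Placeholder speech FLC: repeat the last assumed pitch period."""
    return [history[-period + (i % period)] for i in range(count)]

def flc_music(history, count):
    """Placeholder music FLC: repeat the last `count` decoded samples."""
    return list(history[-count:])

def conceal(history, count, period=2):
    """Classify the previously decoded audio, then run the matching FLC method."""
    label = classify(history)
    if label == "speech":
        return label, flc_speech(history, count, period)
    return label, flc_music(history, count)
```

The dispatcher only needs the previously decoded signal, which matches the constraint that nothing from the lost frame itself is available.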