Abstract:
An improved hands-free user-interactive control and dialing system is disclosed for use with a speech communications device. The control system (400) includes a dynamic noise suppressor (410), a speech recognizer (420) for implementing voice control, a device controller (430) responsive to the speech recognizer for controlling operating parameters of the speech communications device (450) and for producing status information representing the operating status of the device, and a speech synthesizer (440) for providing reply information to the user as to the operating status of the speech communications device. In a mobile radiotelephone application, the spectral subtraction noise suppressor (414) is configured to improve the performance of the speech recognizer (424), the voice quality of the transmitted audio (417), and the audio switching operation of the vehicular speakerphone (460). The combination of noise processing, speech recognition, and speech synthesis provides a substantial improvement over prior art control systems.
Abstract:
A channel bank speech synthesizer for reconstructing speech from externally-generated acoustic feature information without using externally-generated voicing or pitch information is disclosed. An N-channel pitch-excited channel bank synthesizer (340) is provided having a first low-frequency group of channel gain values (1 to M) and a second high-frequency group of channel gain values (M+1 to N). The first group controls a first group of amplitude modulators (950) excited by a periodic pitch pulse source (920), and the second group controls amplitude modulators excited by a noise source (930). Both groups of modulated excitation signals are applied to the bandpass filters (960) to reconstruct the speech channels, and then combined at the summation network (970) to form a reconstructed synthesized speech signal. Additionally, the pitch pulse source (920) varies the pitch pulse period such that the pitch pulse rate decreases over the length of the word.
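The excitation stage described in this abstract can be illustrated with a minimal Python sketch. It covers only the pulse source with a falling pitch rate and the assignment of pulse or noise excitation to each channel; the bandpass filters and summation network are omitted. The function names, the linear growth of the pitch period, and the uniform noise source are assumptions made for illustration and are not taken from the patent.

```python
import random


def pitch_pulse_train(num_samples, start_period, end_period):
    """Unit pulses whose spacing grows from start_period toward end_period.

    A growing period means a falling pulse rate over the word. The linear
    interpolation used here is an assumption.
    """
    pulses = [0.0] * num_samples
    t = 0.0
    while t < num_samples:
        pulses[int(t)] = 1.0
        frac = t / num_samples
        t += start_period + (end_period - start_period) * frac
    return pulses


def excite_channels(gains, m, pulses, seed=0):
    """Amplitude-modulate one excitation signal per channel.

    Channels 0..m-1 (the low-frequency group) use the pitch pulses and the
    remaining channels use a noise source (hypothetical uniform noise).
    """
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in pulses]
    excited = []
    for ch, gain in enumerate(gains):
        source = pulses if ch < m else noise
        excited.append([gain * s for s in source])
    return excited


pulses = pitch_pulse_train(400, 40, 80)
channels = excite_channels([1.0, 0.5, 0.25], m=2, pulses=pulses)
```

In a full synthesizer each entry of `channels` would then pass through its own bandpass filter before the outputs are summed.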
Abstract:
An automatic gain selector is disclosed for use with a noise suppression system which performs speech quality enhancement upon a noisy speech signal available at the input to generate a noise-suppressed speech signal at the output by spectral gain modification. The channel gain controller (240) of the present invention produces a modification signal (245), comprised of individual channel gain values, for application to a channel gain modifier (250). A particular gain table set is automatically selected from one of a plurality of gain tables (450) by a selector switch (470) and a noise level quantizer (440) in response to a multi-channel noise parameter, such as the overall average background noise level of the input signal. Then the individual channel gain values (455) are obtained from the particular gain table set in response to the individual channel signal-to-noise ratio estimate (235). Hence, each individual channel gain value is selected as a function of (a) the channel number, (b) the current channel SNR estimate, and (c) the overall average background noise level. The automatic gain selector further includes a gain smoothing filter (460) for smoothing these noise suppression gain factors on a per-sample basis, thereby reducing the noise flutter caused by step discontinuities in frame-to-frame gain changes.
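The three-way gain lookup and the per-sample smoothing described above might be sketched as follows. The quantizer boundaries, the table contents, the table dimensions, and the one-pole smoothing filter are all hypothetical values chosen for illustration; the abstract does not specify them.

```python
from bisect import bisect_right

# Assumed quantizer boundaries (dB) separating three noise-level classes.
NOISE_LEVEL_EDGES_DB = [40.0, 60.0]


def quantize_noise_level(avg_noise_db):
    """Noise level quantizer: map the overall average noise level to a table index."""
    return bisect_right(NOISE_LEVEL_EDGES_DB, avg_noise_db)


def make_gain_table_set(max_atten_db, num_channels, num_snr_steps):
    """Illustrative table set: gain rises linearly from -max_atten_db to 0 dB."""
    row = [max_atten_db * (i / (num_snr_steps - 1) - 1.0) for i in range(num_snr_steps)]
    return [list(row) for _ in range(num_channels)]


# One gain table set per noise class; deeper attenuation for louder noise.
GAIN_TABLES = [make_gain_table_set(a, 4, 5) for a in (6.0, 12.0, 18.0)]


def select_channel_gains(avg_noise_db, channel_snr_idx):
    """Pick a gain per channel from (channel number, channel SNR index, noise class)."""
    table_set = GAIN_TABLES[quantize_noise_level(avg_noise_db)]
    return [
        table_set[ch][min(idx, len(table_set[ch]) - 1)]
        for ch, idx in enumerate(channel_snr_idx)
    ]


def smooth_gain(prev_gain_db, target_gain_db, samples_per_frame, alpha=0.1):
    """Per-sample one-pole smoothing toward the new frame gain (assumed filter)."""
    out = []
    gain = prev_gain_db
    for _ in range(samples_per_frame):
        gain += alpha * (target_gain_db - gain)
        out.append(gain)
    return out


frame_gains = select_channel_gains(70.0, [0, 4, 2, 9])
ramp = smooth_gain(0.0, -10.0, 8)
```

Smoothing per sample rather than per frame replaces each gain step with a gradual ramp, which is the mechanism the abstract credits for reduced flutter.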
Abstract:
An improved background noise estimator (320) is disclosed for use with a noise suppression system (300) for generating an estimate of the background noise power spectral density provided to noise suppressor (310), which performs speech quality enhancement upon the pre-processed speech-plus-noise signal available at the input to generate a clean post-processed speech signal at the output. Background noise estimator (320) utilizes an energy valley detector based upon post-processed speech to perform the speech/noise classification, and a noise spectral estimator based upon pre-processed speech to generate an estimate of the background noise power spectral density. As a result, the background noise estimate supplied to the noise suppressor is a more accurate measurement of the background noise energy, since it is performed during a more accurate determination of the occurrences of pauses in the speech.
Abstract:
An improved noise suppression system (400) is disclosed which performs speech quality enhancement upon the speech-plus-noise signal available at the input (205) to generate a clean speech signal at the output (265) by spectral gain modification. The noise suppression system of the present invention includes a background noise estimator (420) which generates and stores an estimate of the background noise power spectral density based upon pre-processed speech (215), as determined by the detected minima of the post-processed speech energy level. This post-processed speech (255) may be obtained directly from the output of the noise suppression system, or may be simulated by multiplying the pre-processed speech energy (225) by the channel gain values of the modification signal (245). This technique of using the post-processed signal to generate the background noise estimate (325) provides a more accurate measurement of the background noise energy, since it is based upon a much cleaner speech signal. As a result, the present invention performs acoustic noise suppression in high ambient noise backgrounds with significantly less voice quality degradation.
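A minimal sketch of this idea, assuming a simple minimum-tracking valley detector and exponential averaging of the noise spectrum, is given below. The class name, the parameters `valley_rise`, `margin`, and `alpha`, and the decision rule are hypothetical; the abstract states only that pauses are found from minima of the post-processed energy and that the noise estimate is built from pre-processed energy.

```python
def simulated_post_energy(pre_energy, gains):
    """Simulate post-processed energy: per-channel energy times channel gain, summed."""
    return sum(e * g for e, g in zip(pre_energy, gains))


class BackgroundNoiseEstimator:
    def __init__(self, valley_rise=1.05, margin=2.0, alpha=0.9):
        self.valley_rise = valley_rise  # slow upward drift of the tracked valley
        self.margin = margin            # frames within margin * valley count as pauses
        self.alpha = alpha              # smoothing factor for the noise estimate
        self.valley = None
        self.noise = None               # per-channel background noise estimate

    def update(self, pre_energy, gains):
        """Process one frame; return True if it was classified as a speech pause."""
        post = simulated_post_energy(pre_energy, gains)
        if self.valley is None:
            self.valley = post
        else:
            # Follow drops immediately, rise only slowly (assumed valley tracker).
            self.valley = min(post, max(self.valley, 1e-12) * self.valley_rise)
        is_pause = post <= self.margin * self.valley
        if is_pause:
            if self.noise is None:
                # Assumption: the first pause frame seeds the estimate directly.
                self.noise = list(pre_energy)
            else:
                # The estimate is updated from PRE-processed channel energies.
                self.noise = [
                    self.alpha * n + (1.0 - self.alpha) * e
                    for n, e in zip(self.noise, pre_energy)
                ]
        return is_pause


est = BackgroundNoiseEstimator()
decisions = [
    est.update([1.0, 1.0], [1.0, 1.0]),    # quiet frame
    est.update([50.0, 40.0], [1.0, 1.0]),  # loud frame, treated as speech
    est.update([1.2, 0.8], [0.5, 0.5]),    # quiet frame again
]
```

The classification uses the gain-weighted energy while the noise spectrum itself is averaged from the unmodified input energies, mirroring the split between post-processed and pre-processed signals in the abstract.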
Abstract:
An improved noise suppression system (800) is disclosed which performs speech quality enhancement upon the speech-plus-noise signal available at the input (205) to generate a clean speech signal at the output (265) by spectral gain modification. The improvements of the present invention include the addition of a signal-to-noise ratio (SNR) threshold mechanism (830) to reduce background noise flutter by offsetting the gain rise of the gain tables until a certain SNR threshold is reached, the use of a voice metric calculator (810) to produce more accurate background noise estimates by basing the update decision on the overall voice-like characteristics in the channels and the time interval since the last update, and the use of a channel SNR modifier (820) to provide immunity to narrowband noise bursts through modification of the SNR estimates based on the voice metric calculation and the channel energies.
Abstract:
A digital speech coder utilizes harmonic noise weighting to overcome some limitations of low-rate CELP-type speech coders in reproducing voiced speech. In addition to a short term correction factor, which constitutes spectral noise weighting as known in the art, a long term pitch correction factor is utilized to provide harmonic noise weighting. The inclusion of harmonic noise weighting in a speech coder more efficiently utilizes noise-masking properties of a speech signal, allowing synthesis of a higher quality speech at a given bit rate.
Abstract:
The present invention describes a method and arrangement for reducing a sequence of initial frames into a reduced set of representative frames by combining the initial frames into a plurality of representative frames, the combining process including generating a distortion measure associated with each representative frame and comparing each distortion measure to a distortion threshold. From these representative frames, a set of mutually exclusive frames is determined to minimize the number of representative frames, whereby each representative frame in the set represents a unique set of contiguous initial frames and has an associated distortion measure which does not exceed the distortion threshold.
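One way to picture this reduction is a dynamic-programming sketch that searches for the fewest contiguous segments whose representative frame stays within the distortion threshold. The choice of the segment mean as the representative frame, the maximum squared Euclidean distance as the distortion measure, and all function names are assumptions for illustration; the abstract does not define them.

```python
def representative(frames):
    """Assumed representative frame: the element-wise mean of the segment."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def distortion(frames, rep):
    """Assumed distortion measure: largest squared distance from any frame to rep."""
    return max(sum((x - r) ** 2 for x, r in zip(f, rep)) for f in frames)


def reduce_frames(frames, threshold):
    """Return the fewest (start, end, representative) segments covering all frames.

    Segments are contiguous, mutually exclusive, and each has a distortion
    that does not exceed the threshold.
    """
    n = len(frames)
    best = [0] + [float("inf")] * n  # best[i]: fewest segments covering frames[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last segment
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] + 1 < best[i]:
                seg = frames[j:i]
                if distortion(seg, representative(seg)) <= threshold:
                    best[i] = best[j] + 1
                    back[i] = j
    segments = []
    i = n
    while i > 0:
        j = back[i]
        segments.append((j, i, representative(frames[j:i])))
        i = j
    segments.reverse()
    return segments


frames = [[0.0], [0.1], [0.2], [5.0], [5.1]]
reduced = reduce_frames(frames, threshold=0.05)
```

A single frame always has zero distortion, so a valid covering exists for any non-negative threshold.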
Abstract:
A multifrequency tone receiver is disclosed for detecting simultaneous tone signals in a sampled digital signal. The tone receiver includes a microprogrammed sequence controller, a time-multiplexed digital filter and a signal processing microcomputer. For each sample of the digital signal, the sequence controller is programmed to time multiplex the digital filter for performing three cascaded second-order filtering operations (two bandpass filter operations and one low pass filter operation) for each of six tone signals to provide corresponding energy estimates and one additional filtering operation to provide a total energy estimate. The signal processing microcomputer processes a number of sets of the seven energy estimates and provides an indication when a multifrequency tone pair has been detected. The digital filter, when enabled by a filter start signal from the sequence controller, asynchronously performs a single multiplication-like filtering operation to implement each second-order filter, and provides a filter done signal upon completion of the filtering operation. Full-wave rectifying capability is provided during low pass filtering operations by logically complementing the digital filter input signal. Limit cycles may be suppressed in the digital filter output signal by rounding the output signal and clamping positive and negative overflows to the largest allowable positive and negative signals, respectively. The tone receiver may be advantageously utilized in a PCM communication system for detecting multifrequency tone signalling used for dialing and supervisory control. Moreover, the inventive tone receiver may be adapted to receive many different types of tone signalling simply by changing firmware therewithin.
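The rounding, overflow clamping, and complement-based rectification mentioned above can be illustrated with integer arithmetic. This is a sketch under assumed conditions: a 16-bit two's-complement output word, 8 fractional bits in the accumulator, round-half-up rounding, and complementing only negative inputs. None of these details are given in the abstract.

```python
def round_and_clamp(acc, frac_bits=8, word_bits=16):
    """Round a fixed-point accumulator and saturate it to the output word size.

    Overflows are clamped to the largest allowable positive or negative value
    instead of wrapping around, which is one way to avoid overflow oscillations.
    """
    rounded = (acc + (1 << (frac_bits - 1))) >> frac_bits  # round half up
    hi = (1 << (word_bits - 1)) - 1
    lo = -(1 << (word_bits - 1))
    return max(lo, min(hi, rounded))


def rectify_by_complement(x):
    """Approximate full-wave rectification by bitwise complement of negatives.

    In two's complement, ~x equals -x - 1, so the result is off by one LSB
    from the true magnitude (an assumed, hardware-cheap approximation).
    """
    return ~x if x < 0 else x


samples = [round_and_clamp(v) for v in (384, -384, 1 << 30, -(1 << 30))]
rectified = [rectify_by_complement(v) for v in (-5, 7, -1, 0)]
```

Here 384 represents 1.5 with 8 fractional bits and rounds to 2, while the two large accumulator values saturate at the word limits.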
Abstract:
An input speech signal is encoded as one or more reflection coefficients. To reduce storage requirements, the reflection coefficients are scalar quantized by storing an N-bit code rather than the entire reflection coefficient. An exemplary value for N is 8. A table is provided having 2^N reflection coefficient values. The N-bit code is used to look up reflection coefficient values from the table. To reduce spectral distortion due to scalar quantization, the reflection coefficient values in the table are non-linearly scaled.
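The table lookup can be sketched as below. The abstract does not say which non-linear scaling is used; this example assumes entries spaced uniformly in the arcsine domain, which places more table values near +1 and -1 where spectral sensitivity to reflection coefficient error is typically highest. The function names are hypothetical.

```python
import math
from bisect import bisect_left

N_BITS = 8  # exemplary code width from the abstract


def build_table(n_bits=N_BITS):
    """Build 2**n_bits reflection coefficient values with non-linear spacing.

    Assumption: uniform spacing in arcsine, giving values strictly in (-1, 1).
    """
    size = 1 << n_bits
    return [math.sin(math.pi * ((i + 0.5) / size - 0.5)) for i in range(size)]


def encode(k, table):
    """Return the N-bit code (table index) of the entry nearest to k."""
    pos = bisect_left(table, k)
    candidates = [c for c in (pos - 1, pos) if 0 <= c < len(table)]
    return min(candidates, key=lambda c: abs(table[c] - k))


def decode(code, table):
    """Look up the reflection coefficient value for an N-bit code."""
    return table[code]


table = build_table()
code = encode(0.95, table)
restored = decode(code, table)
```

Only `code` needs to be stored; the decoder recovers the coefficient with a single table read.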