Abstract:
The present invention is a method for determining linear predictive coding filter parameters for encoding a voice signal. The method includes sampling a voice signal, grouping the samples into a plurality of frames, generating a plurality of reflection coefficients for each frame of samples, quantizing the reflection coefficients, generating spectral coefficients from the quantized reflection coefficients, selecting a quantized reflection coefficient having the smallest log-spectral distance between a quantized spectrum and an unquantized spectrum, and converting the selected quantized reflection coefficient to a linear predictive coding (LPC) filter coefficient.
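The selection and conversion steps named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the function names, the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, the 64-point frequency grid, and the candidate sets in the usage note are all hypothetical.

```python
import cmath
import math


def reflection_to_lpc(ks):
    """Step-up recursion: reflection coefficients -> [1, a1, ..., ap].

    Assumed convention: A(z) = 1 + sum(a_i * z**-i); other texts flip the sign.
    """
    a = [1.0]
    for k in ks:
        prev = a + [0.0]
        last = len(prev) - 1
        a = [prev[i] + k * prev[last - i] for i in range(len(prev))]
    return a


def log_spectrum_db(a, n_points=64):
    """Log magnitude (dB) of the all-pole spectrum 1/|A(e^jw)|^2 on [0, pi)."""
    spec = []
    for n in range(n_points):
        w = math.pi * n / n_points
        resp = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        spec.append(-20.0 * math.log10(abs(resp)))
    return spec


def log_spectral_distance(a_ref, a_test, n_points=64):
    """RMS difference in dB between two all-pole log spectra."""
    s_ref = log_spectrum_db(a_ref, n_points)
    s_test = log_spectrum_db(a_test, n_points)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(s_ref, s_test)) / n_points)


def select_quantized(ks, candidates):
    """Pick the candidate set whose spectrum is closest to the unquantized one."""
    a_ref = reflection_to_lpc(ks)
    return min(
        candidates,
        key=lambda c: log_spectral_distance(a_ref, reflection_to_lpc(c)),
    )
```

For example, `select_quantized([0.5, -0.2], [[0.9, 0.5], [0.5, -0.25]])` compares two hypothetical quantized sets against the unquantized coefficients, and `reflection_to_lpc` then converts the winner to filter coefficients. All reflection coefficients are assumed to have magnitude below 1 so the spectrum is finite.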
Abstract:
Linear predictive coding (LPC) filter parameters are determined for use in encoding a voice signal. Samples of a speech signal are pre-emphasized using a z-transform function. The pre-emphasized samples are analyzed to produce LPC reflection coefficients. The LPC reflection coefficients are quantized by a voiced quantizer and by an unvoiced quantizer, producing two sets of quantized reflection coefficients. Each set is converted into respective spectral coefficients. The set that produces the smaller log-spectral distance is determined. The determined set is selected to encode the voice signal.
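The pre-emphasis and analysis steps can be sketched as below. The abstract does not give the pre-emphasis transfer function or the analysis method, so this example assumes a first-order filter H(z) = 1 - mu z^-1 with a hypothetical mu of 0.9375 and uses the textbook autocorrelation method with the Levinson-Durbin recursion. All function names are illustrative.

```python
def preemphasize(x, mu=0.9375):
    """First-order pre-emphasis y[n] = x[n] - mu * x[n-1] (mu is an assumption)."""
    return [x[n] - mu * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]


def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    return [
        sum(x[n] * x[n - lag] for n in range(lag, len(x)))
        for lag in range(max_lag + 1)
    ]


def levinson_reflection(r, order):
    """Levinson-Durbin recursion.

    Returns (reflection coefficients, predictor [1, a1, ..., ap]) under the
    assumed convention A(z) = 1 + sum(a_i * z**-i).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return ks, a
```

A frame would pass through `preemphasize`, then `autocorr`, then `levinson_reflection`. The resulting reflection coefficients are what the voiced and unvoiced quantizers would each receive.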
Abstract:
To perform pitch analysis for encoding a speech signal, a speech signal is sampled. The sampled speech signal is spectrally whitened to produce a spectral residual signal. Samples of the spectral residual signal are collected and the collected samples are autocorrelated. Maximum values of the correlated result are determined. Gain values are determined based at least in part on the maximum values of the correlated result. The gain values are quantized using a codebook to produce a codebook index and an associated frame delay. The codebook index and the frame delay represent a pitch of the speech signal to facilitate encoding the speech signal.
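A minimal sketch of the autocorrelation search and gain quantization follows. It assumes a single-tap pitch model, a gain defined as the peak correlation divided by the frame energy, and a small scalar gain codebook. None of these details are stated in the abstract, and the function names are hypothetical.

```python
def pitch_lag_and_gain(residual, min_lag, max_lag):
    """Return (lag, gain) for the lag with the largest autocorrelation value."""
    energy = sum(s * s for s in residual)
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        corr = sum(
            residual[n] * residual[n - lag] for n in range(lag, len(residual))
        )
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    gain = best_corr / energy if energy > 0.0 else 0.0
    return best_lag, gain


def quantize_gain(gain, codebook):
    """Index of the codebook entry nearest to the gain (codebook is assumed)."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - gain))
```

For an impulse train with a period of 40 samples, the search over lags 20 to 100 returns a lag of 40. The pair (codebook index, lag) would then stand in for the pitch in the encoded frame.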
Abstract:
Multipulse excitation codes are generated by digitizing original speech, partitioning the digitized signal into a number of samples, pre-emphasizing the samples, producing linear predictive reflection coefficients from the samples, quantizing these reflection coefficients, converting the quantized reflection coefficients to spectral coefficients, and subjecting the spectral coefficients to pitch analysis to obtain a spectral residual signal.
Abstract:
The present invention is a synthetic speech encoding device that produces a synthetic speech signal which closely matches an actual speech signal. The actual speech signal is digitized, and excitation pulses are selected by minimizing the error between the actual and synthetic speech signals. The preferred pattern of excitation pulses needed to produce the synthetic speech signal is obtained by using an excitation pattern containing a multiplicity of weighted pulses at timed positions. The selection of the location and amplitude of each excitation pulse is obtained by minimizing an error criterion between the synthetic speech signal and the actual speech signal. The error criterion function incorporates a perceptual weighting filter which shapes the error spectrum.
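The perceptually weighted error criterion can be sketched as below. The abstract does not specify the weighting filter, so this example assumes the commonly used form W(z) = A(z) / A(z/gamma) with a hypothetical gamma of 0.8, where A(z) = 1 + a1 z^-1 + ... is the LPC inverse filter. Function names are illustrative.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): scale the i-th coefficient by gamma**i."""
    return [c * gamma ** i for i, c in enumerate(a)]


def iir_filter(b, a, x):
    """Direct-form filter B(z)/A(z) with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = sum(b[i] * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[i] * y[n - i] for i in range(1, len(a)) if n - i >= 0)
        y.append(acc / a[0])
    return y


def weighted_error_energy(speech, synthetic, a, gamma=0.8):
    """Energy of the error after shaping by the assumed W(z) = A(z)/A(z/gamma)."""
    err = [s - t for s, t in zip(speech, synthetic)]
    shaped = iir_filter(a, bandwidth_expand(a, gamma), err)
    return sum(v * v for v in shaped)
```

With gamma equal to 1 the filter reduces to unity and the criterion becomes the plain squared error. Smaller values of gamma de-emphasize error near the formant peaks, where it is less audible.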
Abstract:
A version of a speech signal and an output of a pitch synthesis filter and a linear predictive coding (LPC) all-pole filter are received. A system impulse response is produced based in part on the received pitch synthesis filter and LPC filter output. An excitation pulse location is determined so that the determined location minimizes an error between the speech signal version and the system impulse response. The speech signal is encoded with a representation of the determined location.
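The pulse location search can be illustrated with the standard sequential multipulse procedure. This is a sketch of that textbook technique, not necessarily the search claimed here. The target signal, the impulse response `h`, and the function names are assumptions: for each candidate position, the amplitude that minimizes the squared error is computed, and the position giving the largest error reduction is kept.

```python
def best_pulse(target, h):
    """Return (position, amplitude) of the single pulse that best fits target."""
    best_pos, best_amp, best_score = 0, 0.0, -1.0
    n = len(target)
    for pos in range(n):
        length = min(len(h), n - pos)
        cross = sum(target[pos + i] * h[i] for i in range(length))
        energy = sum(h[i] * h[i] for i in range(length))
        if energy == 0.0:
            continue
        score = cross * cross / energy  # squared-error reduction at this position
        if score > best_score:
            best_pos, best_amp, best_score = pos, cross / energy, score
    return best_pos, best_amp


def find_pulses(target, h, count):
    """Greedy search: place one pulse, subtract its contribution, repeat."""
    residual = list(target)
    pulses = []
    for _ in range(count):
        pos, amp = best_pulse(residual, h)
        pulses.append((pos, amp))
        for i in range(min(len(h), len(residual) - pos)):
            residual[pos + i] -= amp * h[i]
    return pulses
```

If the target is built from two scaled, shifted copies of `h`, `find_pulses(target, h, 2)` recovers both positions and amplitudes, strongest pulse first.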
Abstract:
A digital discontinuous cellular communication system has a transmitter that transmits two frames of data following detection of voice inactivity. A receiver includes a comfort noise generator that uses the two frames of data to output noise to the speaker during periods of voice inactivity. The comfort noise generator includes a synthesis codebook with samples scaled by the actual background noise and an excitation codebook with samples filtered and scaled by the background noise. The two are combined to produce comfort noise having the attributes and loudness level of the background noise received prior to interruption of transmission. The scaled signals are weighted to vary the loudness level and spectral attributes.
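The combination of the two scaled codebook signals might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the codebook vectors are passed in as plain lists, the background noise level is represented by a single RMS value, the excitation is shaped by an all-pole filter 1/A(z), and one `weight` parameter blends the two signals.

```python
import math


def rms(x):
    """Root-mean-square level of a sample vector."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0


def scale_to_rms(x, target):
    """Scale x so that its RMS level equals target."""
    cur = rms(x)
    return [v * target / cur for v in x] if cur > 0.0 else list(x)


def all_pole_filter(a, x):
    """Filter x through 1/A(z), with A(z) = 1 + sum(a_i * z**-i)."""
    y = []
    for n in range(len(x)):
        acc = x[n] - sum(a[i] * y[n - i] for i in range(1, len(a)) if n - i >= 0)
        y.append(acc)
    return y


def comfort_noise(synth_vec, excit_vec, lpc, noise_rms, weight=0.5):
    """Blend a scaled synthesis vector with a filtered, scaled excitation vector."""
    s = scale_to_rms(synth_vec, noise_rms)
    e = scale_to_rms(all_pole_filter(lpc, excit_vec), noise_rms)
    return [weight * u + (1.0 - weight) * v for u, v in zip(s, e)]
```

Moving `weight` toward 1 favors the synthesis codebook signal and moving it toward 0 favors the spectrally shaped excitation, which is one simple way to vary the spectral attributes of the output.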
Abstract:
A method of operating a digital signal processor to detect DTMF tones in a digital voice telephone system in which the digitally encoded signals appearing on the telephone channel are decimated to compress the spectrum to be monitored for the appearance of call signalling tones. The signals received in a decimated block are "correlated" or convolved with one another on a forward and backward time-shifted basis and each forward and backward correlation product is summed to form the elements of a 5×5 modified covariance matrix. The modified covariance matrix exhibits the desirable property that its eigenvectors will be symmetric. Since all eigenvectors of the modified covariance matrix are orthogonal and the eigenvectors associated with the signal span the signal subspace, the signal subspace is orthogonal to the eigenvector associated with the noise. The dot product of the noise eigenvector with the signal subspace is set to zero. The roots of the resultant polynomial identify the frequencies of the DTMF tones, if in fact the same were present in the received signal. The noise and signal eigenvectors of the modified covariance matrix are more quickly and efficiently determined and advantageously, on a "real time" basis, by partitioning the modified covariance matrix into conjugate and anti-conjugate submatrices. The conjugate matrix is inverted and its eigenvalues determined, advantageously by applying the well-known power method. The largest eigenvalue of the inverted conjugate submatrix is related to the smallest eigenvalue of the original modified covariance matrix. When appropriate tones are being received this last-mentioned eigenvector should be the eigenvector associated with the noise.
After determining the noise eigenvector the product of the signal space vector and the noise eigenvector is set to zero and the roots of the resultant polynomial are identified as the frequencies of the DTMF tones, advantageously through the use of a fast search technique.
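The final root-finding step can be sketched for the case where the noise eigenvector is already known. The example assumes a symmetric 5-element eigenvector [v0, v1, v2, v1, v0], so the polynomial is a symmetric quartic whose roots lie on the unit circle at the two tone frequencies. Substituting u = z + 1/z = 2 cos(w) reduces it to a quadratic in u. The function name and the `sample_rate` parameter (the rate after decimation) are hypothetical, and the closed-form solution stands in for the fast search technique the abstract mentions.

```python
import math


def tones_from_noise_eigenvector(v, sample_rate):
    """Recover up to two tone frequencies (Hz) from a symmetric noise eigenvector.

    The polynomial v0*z^4 + v1*z^3 + v2*z^2 + v1*z + v0, divided by v0*z^2,
    becomes u^2 + c1*u + (c2 - 2) = 0 with u = z + 1/z = 2*cos(w).
    """
    if len(v) != 5 or v[0] == 0.0:
        raise ValueError("expected a 5-element eigenvector with nonzero v[0]")
    c1 = v[1] / v[0]
    c2 = v[2] / v[0]
    disc = c1 * c1 - 4.0 * (c2 - 2.0)
    if disc < 0.0:
        return []  # no real u: roots are not on the unit circle
    root = math.sqrt(disc)
    freqs = []
    for u in ((-c1 - root) / 2.0, (-c1 + root) / 2.0):
        if abs(u) <= 2.0:
            freqs.append(sample_rate * math.acos(u / 2.0) / (2.0 * math.pi))
    return sorted(freqs)
```

For an eigenvector built from the DTMF pair 697 Hz and 1209 Hz at an assumed 4000 Hz decimated rate, the function returns those two frequencies to within rounding error.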