Abstract:
A digital discontinuous cellular communication system has a transmitter that transmits two frames of data following detection of voice inactivity. A receiver includes a comfort noise generator that uses the two frames of data to output noise to the speaker during periods of voice inactivity. The comfort noise generator includes a synthesis codebook with samples scaled by the actual background noise and an excitation codebook with samples filtered and scaled by the background noise; the two are combined to produce comfort noise having the attributes and loudness level of the background noise received prior to interruption of transmission. The scaled signals are weighted to vary the loudness level and spectral attributes.
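The weighted combination described above can be sketched as follows. This is only an illustration under stated assumptions: the codebook contents, the one-pole shaping filter, the frame length of 160 samples, and the single mixing weight `w` are all hypothetical and are not taken from the abstract.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def scale_to_level(samples, target_rms):
    """Scale samples so their RMS equals the measured background noise level."""
    level = rms(samples)
    if level == 0.0:
        return [0.0] * len(samples)
    g = target_rms / level
    return [g * v for v in samples]

def one_pole_filter(samples, a):
    """y[n] = x[n] + a*y[n-1]; a hypothetical stand-in for the shaping filter."""
    out, prev = [], 0.0
    for v in samples:
        prev = v + a * prev
        out.append(prev)
    return out

def comfort_noise(synth_cb, excit_cb, noise_rms, a=0.5, w=0.5):
    """Mix the scaled synthesis path and the filtered, scaled excitation path."""
    synth = scale_to_level(synth_cb, noise_rms)
    excit = scale_to_level(one_pole_filter(excit_cb, a), noise_rms)
    # w shifts the balance between the two paths (assumed weighting scheme)
    return [w * s + (1.0 - w) * e for s, e in zip(synth, excit)]

rng = random.Random(0)
synth_cb = [rng.gauss(0.0, 1.0) for _ in range(160)]  # hypothetical codebook entry
excit_cb = [rng.gauss(0.0, 1.0) for _ in range(160)]  # hypothetical codebook entry
frame = comfort_noise(synth_cb, excit_cb, noise_rms=0.01)
```

With `w=1.0` the output is the synthesis path alone at exactly the target level; intermediate weights change both the loudness and the spectral tilt of the mix.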
Abstract:
The present invention relates to a device and a method for communicating in a mobile communication system. The method provides a carrier signal having a plurality of frames. Each frame has a plurality of time slots, and each time slot comprises a plurality of transmission bits. A group of time slots are assigned to a communication channel. A traffic burst signal having a plurality of traffic symbols is transmitted over the communication channel by transmitting a first preamble over one of the assigned time slots, and transmitting a second preamble and at least one of the traffic symbols over at least one of the other assigned time slots. The second preamble occupies fewer transmission bits than the first preamble. The apparatus for transmitting a telephony signal over an RF channel includes a modem receiving a digitized PCM telephony signal and producing a traffic burst signal, and a transmitting unit in communication with the modem for transmitting a FDMA/TDMA signal carrying a plurality of traffic burst signals. At least one of the traffic burst signals carries a limited preamble message including a header field and a unique word field and at least one digitized voice message associated with a telephone call. Another traffic burst signal carries at least one signal acquisition message including a unique word field.
Abstract:
The present invention provides a multi-mode CELP encoding and decoding method and device for digitized speech signals that improves on prior art codecs and coding methods by selectively utilizing backward prediction for the short-term predictor parameters and fixed codebook gain of a speech signal. In order to achieve these improvements, the present invention provides a coding method comprising the steps of classifying a segment of the digitized speech signal as one of a plurality of predetermined modes, determining a set of unquantized line spectral frequencies to represent the short-term predictor parameters for that segment, and quantizing the determined set of unquantized line spectral frequencies using a mode-specific combination of scalar quantization and vector quantization, which utilizes backward prediction for modes with voiced speech signals. Furthermore, backward prediction is selectively applied to the fixed codebook gain in the modes that are free of transients so that it may be used in the fixed codebook search and fixed codebook gain quantization in those modes.
Abstract:
A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech, comprising a linear prediction (LP) front end adapted to process an input signal and provide LP parameters, which are quantized and encoded over predetermined intervals and used to compute an LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator, which provide a pitch contour within the predetermined intervals, are also provided. Also provided is a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following: provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals; extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and directly quantize the PW in a magnitude domain without further decomposition of the PW into complex components, where the direct quantization is performed by a hierarchical quantization method based on a voicing classification using fixed dimension vector quantizers (VQs).
Abstract:
An improved noise reduction algorithm is provided, as well as a voice activity detector, for use in a voice communication system. The voice activity detector allows for a reliable estimate of noise and enhancement of noise reduction. The noise reduction algorithm and voice activity detector can be implemented integrally in an encoder or applied independently to speech coding applications. The voice activity detector employs line spectral frequencies and enhanced input speech which has undergone noise reduction to generate a voice activity flag. The noise reduction algorithm employs a smooth gain function determined from a smoothed noise spectral estimate and smoothed input noisy speech spectra. The gain function is smoothed both across frequency and time in an adaptive manner based on the estimate of the signal-to-noise ratio. The gain function is used for spectral amplitude enhancement to obtain a reduced noise speech signal. Smoothing employs critical frequency bands corresponding to the human auditory system. Swirl reduction is performed to improve overall human perception of decoded speech.
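A gain function smoothed across frequency and time, with the amount of smoothing tied to the signal-to-noise ratio, might look roughly like the sketch below. The abstract does not give the gain rule, so the Wiener-style formula, the gain floor, the 3-tap frequency average, and the two smoothing factors are all assumptions chosen for illustration.

```python
def spectral_gain(noisy_power, noise_power, floor=0.1):
    """Wiener-style gain per band (assumed rule), limited below by `floor`."""
    gains = []
    for p, n in zip(noisy_power, noise_power):
        g = (p - n) / p if p > n else 0.0
        gains.append(max(g, floor))
    return gains

def smooth_over_frequency(gains):
    """3-tap moving average; edge bands average over the neighbours that exist."""
    out = []
    for k in range(len(gains)):
        window = gains[max(0, k - 1):k + 2]
        out.append(sum(window) / len(window))
    return out

def smooth_over_time(prev, cur, snr_db):
    """Recursive averaging: heavier smoothing when the SNR estimate is low."""
    alpha = 0.8 if snr_db < 10.0 else 0.2  # hypothetical thresholds and factors
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(prev, cur)]

noisy = [4.0, 10.0, 2.0, 1.0]   # smoothed noisy-speech power per band (made up)
noise = [1.0, 1.0, 1.0, 1.0]    # smoothed noise power per band (made up)
g = spectral_gain(noisy, noise)             # [0.75, 0.9, 0.5, 0.1]
g_freq = smooth_over_frequency(g)
g_time = smooth_over_time([1.0] * 4, g_freq, snr_db=5.0)
```

The enhanced spectral amplitude in each band would then be the noisy amplitude multiplied by the smoothed gain.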
Abstract:
An improved error control coding scheme is implemented in low bit rate coders in order to improve their performance in the presence of transmission errors typical of the digital cellular channel. The error control coding scheme exploits the nonlinear block codes (NBCs) for purposes of tailoring those codes to a fading channel in order to provide superior error protection to the compressed half rate speech data. For a half rate speech codec assumed to have a frame size of 40 ms, the speech encoder puts out a fixed number of bits per 40 ms. These bits are divided into three distinct classes, referred to as Class 1, Class 2 and Class 3 bits. A subset of the Class 1 bits are further protected by a CRC for error detection purposes. The Class 1 bits and the CRC bits are encoded by a rate 1/2 Nordstrom Robinson code with codeword length of 16. The Class 2 bits are encoded by a punctured version of the Nordstrom Robinson code. It has an effective rate of 8/14 with a codeword length of 14. The Class 3 bits are left unprotected. The coded Class 1 plus CRC bits, coded Class 2 bits, and the Class 3 bits are mixed in an interleaving array of size 16 × 17 and interleaved over two slots in a manner that optimally divides each codeword between the two slots. At the receiver the coded Class 1 plus CRC bits, coded Class 2 bits, and Class 3 bits are extracted after de-interleaving.
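The interleaving step can be illustrated with a generic block interleaver over a 16 × 17 array (272 bits). The abstract does not state the write and read order or how codewords are placed, so the row-wise write, column-wise read, and the simple half-and-half slot split below are assumptions; this sketch does not reproduce the optimal codeword division that the abstract describes.

```python
ROWS, COLS = 16, 17  # array size stated in the abstract

def interleave(bits):
    """Write row by row, read column by column (assumed ordering)."""
    if len(bits) != ROWS * COLS:
        raise ValueError("expected %d bits" % (ROWS * COLS))
    return [bits[r * COLS + c] for c in range(COLS) for r in range(ROWS)]

def deinterleave(bits):
    """Inverse of interleave(): put each received bit back at its row/column."""
    if len(bits) != ROWS * COLS:
        raise ValueError("expected %d bits" % (ROWS * COLS))
    out = [0] * (ROWS * COLS)
    i = 0
    for c in range(COLS):
        for r in range(ROWS):
            out[r * COLS + c] = bits[i]
            i += 1
    return out

def split_slots(bits):
    """Divide the interleaved frame evenly between two transmission slots."""
    half = len(bits) // 2
    return bits[:half], bits[half:]

frame = [(i * 7) % 2 for i in range(ROWS * COLS)]  # made-up coded frame
slot_a, slot_b = split_slots(interleave(frame))
restored = deinterleave(slot_a + slot_b)
```

Because adjacent bits of the input end up 16 positions apart after interleaving, a burst of errors within one slot is spread across many codewords once the receiver de-interleaves.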
Abstract:
A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech, comprising a linear prediction (LP) front end adapted to process an input signal and provide LP parameters, which are quantized and encoded over predetermined intervals and used to compute an LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator also provide a pitch contour within the predetermined intervals. A voice activity detector adapted to process the LP parameters and the open loop pitch contour over the predetermined intervals is also provided, as well as a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following functions: extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals. The voicing measure is provided for the purpose of regenerating a PW phase at a decoder and providing improved quantization of the PW magnitude at an encoder. The voicing measure is encoded jointly with a PW nonstationarity measure vector using a spectrally weighted vector quantizer having a codebook partitioned based on a voiced and unvoiced mode.
Abstract:
A drift-free hybrid method of performing video stitching is provided. The method includes decoding a plurality of video bitstreams and storing prediction information. The decoded bitstreams form video images, spatially composed into a combined image. The image comprises frames of an ideal stitched video sequence. The method uses prediction information in conjunction with previously generated frames to predict pixel blocks in the next frame. A stitched predicted block in the next frame is subtracted from a corresponding block in a corresponding frame to create a stitched raw residual block. The raw residual block is forward transformed, quantized, entropy encoded and added to the stitched video bitstream along with the prediction information. Also, the stitched raw residual block is inverse transformed and dequantized to create a stitched decoded residual block. The residual block is added to the predicted block to generate the stitched reconstructed block in the next frame of the sequence.
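The residual loop above is what keeps the stitcher and a downstream decoder in step. A minimal sketch, assuming a uniform scalar quantizer and omitting the forward and inverse transforms and the entropy coder for brevity (the function names and sample values are hypothetical):

```python
def quantize(block, step):
    """Uniform scalar quantizer: map each residual sample to an integer level."""
    return [round(v / step) for v in block]

def dequantize(levels, step):
    """Map integer levels back to residual values."""
    return [q * step for q in levels]

def stitch_block(target, predicted, step):
    """Encoder side: form the raw residual, code it, and rebuild the block."""
    raw_residual = [t - p for t, p in zip(target, predicted)]
    levels = quantize(raw_residual, step)        # these would be entropy coded
    decoded_residual = dequantize(levels, step)  # what a decoder will recover
    reconstructed = [p + d for p, d in zip(predicted, decoded_residual)]
    return levels, reconstructed

def decode_block(levels, predicted, step):
    """Decoder side: same prediction plus the dequantized residual."""
    return [p + d for p, d in zip(predicted, dequantize(levels, step))]

target = [100, 104, 98, 90]      # block from the ideal stitched frame (made up)
predicted = [96, 100, 100, 95]   # stitched predicted block (made up)
levels, recon_enc = stitch_block(target, predicted, step=4)
recon_dec = decode_block(levels, predicted, step=4)
```

Since the stitcher predicts later frames from `recon_enc` rather than from `target`, and `recon_enc` equals `recon_dec`, quantization error does not accumulate from frame to frame.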
Abstract:
Encoding of prototype waveform components, applicable to GeoMobile and Telephony Earth Station (TES), provides improved voice quality, enabling a dual-channel mode of operation which permits more users to communicate over the same physical channel. A prototype waveform (PW) gain is vector quantized using a vector quantizer (VQ) that explicitly populates the codebook with representative steady state and transient vectors of PW gain for tracking the abrupt variations in speech levels during onsets and other non-stationary events, while maintaining the accuracy of the speech level during stationary conditions. The rapidly evolving waveform (REW) and slowly evolving waveform (SEW) component vectors are converted to magnitude-phase. The variable dimension SEW magnitude vector is quantized using a hierarchical approach, i.e., a fixed dimension SEW mean vector computed by a sub-band averaging of SEW magnitude spectrum, and only the REW magnitude is explicitly encoded. The REW magnitude vector sequence is normalized to unity RMS value, resulting in a REW magnitude shape vector and a REW gain vector. The normalized REW magnitude vectors are modeled by a multi-band sub-band model which converts the variable dimension REW magnitude shape vectors to fixed dimension, e.g., six dimensional, REW sub-band vectors. The sub-band vectors are averaged over time, resulting in a single average REW sub-band vector for each frame. At the decoder, the full-dimension REW magnitude shape vector is obtained from the REW sub-band vector by a piecewise-constant construction. The REW phase vector is regenerated at the decoder based on the received REW gain vector and the voicing measure, which determines a weighted mixture of SEW component and a random noise that is passed through a high pass filter to generate the REW component. The high pass filter poles are adjusted based on the voicing measure to control the REW component characteristics. At the output of the filter, the magnitude of the REW component is scaled to match the received REW magnitude vector.
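The REW magnitude path (unity-RMS normalization, sub-band reduction, piecewise-constant reconstruction) can be sketched as below. The six-band count comes from the abstract; the equal-width band edges, the 12-point example vector, and the function names are assumptions, and the averaging over time and the quantizers are left out.

```python
import math

def normalize_to_unit_rms(magnitudes):
    """Split a magnitude vector into a unit-RMS shape vector and a scalar gain."""
    gain = math.sqrt(sum(m * m for m in magnitudes) / len(magnitudes))
    if gain == 0.0:
        return list(magnitudes), 0.0
    return [m / gain for m in magnitudes], gain

def band_edges(n, n_bands):
    """Equal-width band boundaries (assumed; real band edges are not given)."""
    if n < n_bands:
        raise ValueError("vector shorter than the number of bands")
    return [b * n // n_bands for b in range(n_bands + 1)]

def subband_average(shape, n_bands=6):
    """Reduce a variable dimension shape vector to n_bands sub-band means."""
    edges = band_edges(len(shape), n_bands)
    return [sum(shape[lo:hi]) / (hi - lo) for lo, hi in zip(edges, edges[1:])]

def piecewise_constant(bands, n):
    """Decoder side: hold each sub-band value constant across its band."""
    edges = band_edges(n, len(bands))
    out = []
    for value, lo, hi in zip(bands, edges, edges[1:]):
        out.extend([value] * (hi - lo))
    return out

rew_mag = [2.0, 2.0, 4.0, 4.0, 1.0, 1.0, 3.0, 3.0, 2.0, 2.0, 1.0, 1.0]  # made up
shape, gain = normalize_to_unit_rms(rew_mag)
bands = subband_average(shape)                  # six sub-band values
rebuilt = piecewise_constant(bands, len(shape))
```

In this example the magnitudes are already constant within each band, so `rebuilt` matches `shape`; for a general spectrum the reconstruction is only a staircase approximation of the shape vector.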