Abstract:
A random code vector reading section and a random codebook of a conventional CELP type speech coder/decoder are respectively replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds. This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby considerably reducing the memory capacity.
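The seed-based replacement for a stored codebook can be sketched as follows. This is only an illustration: the abstract does not say how the oscillator works, so the linear congruential generator, the seed values, and the names `SEED_STORE`, `oscillate`, and `codevector` are all assumptions.

```python
from typing import List

# Hypothetical seed storage section: a few integers stand in for the
# full fixed vectors that a ROM codebook would otherwise hold.
SEED_STORE = [11, 23, 42, 97]


def oscillate(seed: int, length: int = 8) -> List[float]:
    """Return a deterministic vector for a given seed.

    Assumption: a linear congruential generator plays the role of the
    oscillator; the source does not specify the actual generator.
    """
    state = seed & 0xFFFFFFFF
    out = []
    for _ in range(length):
        state = (1664525 * state + 1013904223) % 2**32
        out.append(state / 2**31 - 1.0)  # map the state to [-1, 1)
    return out


def codevector(index: int, length: int = 8) -> List[float]:
    """Regenerate the code vector for a codebook index from its seed."""
    return oscillate(SEED_STORE[index], length)
```

Because the same seed always yields the same vector, encoder and decoder only need to share the seed table, which is where the memory saving would come from.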
Abstract:
A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.
Abstract:
A lattice-structured multiple description vector quantization (LSMDVQ) encoder generates M descriptions of a signal to be encoded, each of the descriptions being transmittable over a corresponding one of M channels. The encoder is configured based at least in part on a distortion measure which is a function of a central distortion and at least one side distortion. For example, if M=2, the distortion measure may be an average mean-squared error (AMSE) function of the form ƒ(D0, D1, D2), where D0 is a central distortion resulting from reconstruction based on receipt of both a first and a second description, and D1 and D2 are side distortions resulting from reconstruction using only a first description and a second description, respectively. Further performance improvements may be obtained through perturbation of the lattice points. The LSMDVQ techniques of the invention can also be extended to cases of M greater than two, for which the encoder may utilize an ordered set of M codebooks Λ1, Λ2, . . . , ΛM of increasing size, with the coarsest codebook corresponding to a lattice. In such cases, for each number k of descriptions received, there may be a single decoding function that maps the received vector to a corresponding one of the codebooks Λk, such that reconstruction of the signal requires no more than M such decoding functions.
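The abstract leaves the form of ƒ(D0, D1, D2) open. As one possible instance for M=2, the sketch below assumes each channel independently loses its description with probability `p` and that losing both descriptions costs the source variance `var`; these assumptions and the function name `expected_distortion` are illustrative and not taken from the source.

```python
def expected_distortion(d0: float, d1: float, d2: float,
                        p: float, var: float) -> float:
    """Average distortion over the four receive/lose outcomes for M=2.

    Assumptions (not from the source): independent channel losses with
    probability p, and distortion `var` when neither description arrives.
    """
    both = (1.0 - p) ** 2      # both descriptions received -> central D0
    one = p * (1.0 - p)        # exactly one specific description lost
    none = p ** 2              # both descriptions lost
    return both * d0 + one * (d1 + d2) + none * var
```

With `p = 0` the measure reduces to the central distortion alone, and as `p` grows the side distortions D1 and D2 carry more weight, which is the trade-off the encoder design has to balance.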
Abstract:
A pitch lag coding device and method using interframe correlation inherent in pitch lag values to reduce coding bit requirements. A pitch lag value is extracted for a given speech frame, and then refined for each subframe. For every speech frame having N samples of speech, LPC analysis and vector quantization are performed for the whole coding frame. The LPC residual obtained for each frame is then processed such that pitch lag values for all subframes within the coding frame are analyzed concurrently. The remaining coding parameters, i.e., the codebook search, gain parameters, and excitation signal, are then analyzed sequentially according to their respective subframes.
Abstract:
A speech coder and a method for speech coding wherein the speech signal is represented by an excitation signal applied to a synthesis filter. The speech is partitioned into frames and subframes. A classifier identifies which of several categories the speech frame belongs to, and a different coding method is applied to represent the excitation for each category. For some categories, one or more windows are identified for the frame where all or most of the excitation signal samples are assigned by a coding scheme. Performance is enhanced by coding the important segments of the excitation more accurately. The window locations are determined from a linear prediction residual by identifying peaks of the smoothed residual energy contour. The method adjusts the frame and subframe boundaries so that each window is located entirely within a modified subframe or frame. This eliminates the artificial restriction incurred when coding a frame or subframe in isolation, without regard for the local behavior of the speech signal across frame or subframe boundaries.
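Locating windows at peaks of the smoothed residual energy contour might look like the sketch below. The moving-average smoother, its width, and the simple local-maximum rule are assumptions chosen for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
from typing import List


def smoothed_energy(residual: List[float], half_width: int = 1) -> List[float]:
    """Moving average of the squared residual (assumed smoother)."""
    n = len(residual)
    energy = [r * r for r in residual]
    contour = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        contour.append(sum(energy[lo:hi]) / (hi - lo))
    return contour


def peak_locations(contour: List[float]) -> List[int]:
    """Indices of interior local maxima of the contour.

    On a flat plateau only the left edge is reported.
    """
    return [
        i for i in range(1, len(contour) - 1)
        if contour[i] > contour[i - 1] and contour[i] >= contour[i + 1]
    ]


# Toy linear prediction residual with two energy bursts.
residual = [0, 1, 3, 1, 0, 0, 0, 1, 2, 1, 0, 0]
window_centres = peak_locations(smoothed_energy(residual))
```

A coder following the abstract would then shift frame or subframe boundaries so that a window around each detected peak falls entirely inside one modified subframe; that step is not shown here.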
Abstract:
A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech is generated through CELP (code excited linear prediction) and other associated modeling parameters for higher quality decoding and reproduction. To achieve high quality in lower bit rate encoding modes, the speech encoder departs from the strict waveform matching criteria of regular CELP coders and strives to identify significant perceptual features of the input signal. The encoder generates pluralities of codevectors from a single, normalized codevector by shifting or other rearrangement. As a result, searching speeds are enhanced, and the physical size of a codebook built from such codevectors is greatly reduced.
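Deriving many codevectors from one stored codevector by shifting can be illustrated with a minimal sketch. It assumes a circular shift, which is only one of the rearrangements the abstract allows, and the name `shifted_codebook` is hypothetical.

```python
from typing import List


def shifted_codebook(base: List[float], n_shifts: int) -> List[List[float]]:
    """Build n_shifts codevectors as circular right shifts of `base`.

    Assumption: a circular shift; the source only says "shifting or
    other rearrangement". Only `base` itself needs to be stored.
    """
    n = len(base)
    return [base[n - k:] + base[:n - k] for k in range(n_shifts)]


# One stored vector yields four distinct codevectors.
codebook = shifted_codebook([1.0, 0.0, 0.0, 0.0], 4)
```

Since every entry is a rearrangement of the same stored vector, the storage cost stays at one vector regardless of how many entries the search visits.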
Abstract:
A CELP type speech coder performs quantization of the pitch differential value on pitch information between subframes. The coder limits the number of preliminarily selected candidates using threshold processing. The coder includes specialized pitches for a subframe on which quantization of the pitch differential value is not applied. When pitch preliminary selection is performed on such a subframe, the coder limits the number of preliminarily selected candidates using threshold processing to avoid outputting, as a preliminarily selected candidate, the above-mentioned specialized pitches. The coder improves the accuracy of the pitch search (adaptive codebook search) while avoiding adverse effects on the quantization of the pitch differential value.
Abstract:
In an alternative approach, periodicity enhancement of an excitation signal is achieved by filtering an innovative codevector through an innovation filter, which reduces the low-frequency content of the innovative codevector and enhances the periodicity at low frequencies more than at high frequencies.
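A filter that reduces the low-frequency content of an innovative codevector could be sketched as below. The three-tap form with taps (-alpha, 1, -alpha), the value of `alpha`, and the zero padding at the edges are assumptions for illustration; the abstract does not give the filter.

```python
from typing import List


def innovation_filter(code: List[float], alpha: float = 0.25) -> List[float]:
    """Apply a symmetric 3-tap filter with taps (-alpha, 1, -alpha).

    Assumption: this filter shape is illustrative only. Samples outside
    the vector are treated as zero.
    """
    n = len(code)
    out = []
    for i in range(n):
        prev_sample = code[i - 1] if i > 0 else 0.0
        next_sample = code[i + 1] if i < n - 1 else 0.0
        out.append(code[i] - alpha * (prev_sample + next_sample))
    return out


flat = innovation_filter([1.0, 1.0, 1.0, 1.0])           # slowly varying input
alternating = innovation_filter([1.0, -1.0, 1.0, -1.0])  # fast-varying input
```

With this choice the interior samples of the flat input are scaled by 1 - 2·alpha while those of the alternating input are scaled by 1 + 2·alpha, so the innovation contributes less at low frequencies and the periodic (adaptive codebook) part dominates there.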
Abstract:
The present invention provides a method and system to improve the codebook search algorithm used in a coding/decoding device or routine. The codebook search algorithm is performed by a processing system that allows for parallel execution of instructions, for example a DSP. An embodiment of the present invention provides a method for coding of a first waveform. First, a plurality of vectors determined from a plurality of waveforms is stored in a memory. Next, a minimum weighted error using a plurality of filter coefficients and the plurality of vectors is determined. The minimum weighted error gives a closest match between the first waveform and a second waveform synthesized from a selected vector of the plurality of vectors. Then an indication of said selected vector is provided as part of a code of the first waveform. The plurality of filter coefficients have added to them at least one duplicate filter coefficient such that the performance of determining the minimum weighted error is improved, for example by at least one clock cycle.
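The search for the vector with the minimum error can be sketched in plain Python. This shows only the generic search loop, under the assumptions that synthesis is a causal FIR filter and that gain is ignored; it does not reproduce the duplicate-coefficient optimisation, which targets parallel DSP hardware. The names `fir` and `search_codebook` are hypothetical.

```python
from typing import List, Tuple


def fir(h: List[float], x: List[float]) -> List[float]:
    """Causal FIR filtering of x with coefficients h (zero initial state)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def search_codebook(target: List[float],
                    codebook: List[List[float]],
                    h: List[float]) -> Tuple[int, float]:
    """Return (index, error) of the vector whose filtered version is
    closest to the target in squared error. Gain terms are omitted."""
    best_index, best_error = -1, float("inf")
    for index, vector in enumerate(codebook):
        synthesized = fir(h, vector)
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error


vectors = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
coeffs = [1.0, 0.5]
target = fir(coeffs, vectors[1])
result = search_codebook(target, vectors, coeffs)
```

The returned index is what would be transmitted as part of the code of the first waveform; the inner filtering loop is the part a DSP would execute in parallel.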
Abstract:
To transmit data over a mobile telephone speech channel, a source encoder is replaced by a transcoder, a conversion table and/or a concatenation circuit. The replacement chooses, from the words produced by the source encoder, those that are the most robust and that can readily withstand speech synthesis followed by an inverse analysis, so that the bit streams of the data to be transmitted can be reconstituted.