Abstract:
The object of the invention is to provide a coding/decoding method in which degradation of sound quality perceptible by the listener does not occur at a low bit rate. A shift number calculation section of a coding device divides a frequency domain into at least two sub-bands, and approximates each of the normalized transform coefficients in a sub-band whose allocated bit value is less than a predetermined threshold using a quantized value of the transform coefficient in a predetermined sub-band other than that sub-band so as to obtain information concerning the approximation, and a multiplexer multiplexes the information with the other signals and transmits them. A de-multiplexer of a decoding device separates the code of the information concerning the approximation, and a shift number restore section restores the information based thereon. An approximation coefficient calculation section assigns, based on the information concerning the approximation, the transform coefficient values in the predetermined sub-band to the normalized transform coefficients whose allocated bit value is less than the predetermined threshold.
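The approximation step in this abstract can be illustrated with a small sketch. It is not the claimed method: the names `best_shift`, `encode_shifts` and `approximate_band`, the squared-error matching criterion, and the sample values are all assumptions made for illustration. Only the general idea (describe a low-bit sub-band by a shift into the quantized coefficients of another sub-band) comes from the abstract.

```python
def best_shift(target, reference):
    """Return the shift s for which reference[s:s+len(target)] is closest
    to target in the squared-error sense (assumed criterion)."""
    n = len(target)
    best_s, best_err = 0, float("inf")
    for s in range(len(reference) - n + 1):
        window = reference[s:s + n]
        err = sum((t - r) ** 2 for t, r in zip(target, window))
        if err < best_err:
            best_s, best_err = s, err
    return best_s


def encode_shifts(bands, allocated_bits, reference, threshold):
    """Coder side: compute a shift only for sub-bands whose allocated
    bit value is below the threshold."""
    shifts = {}
    for index, (band, bits) in enumerate(zip(bands, allocated_bits)):
        if bits < threshold:
            shifts[index] = best_shift(band, reference)
    return shifts


def approximate_band(reference, shift, length):
    """Decoder side: copy quantized reference coefficients at the shift."""
    return reference[shift:shift + length]


# Hypothetical quantized coefficients of the reference sub-band.
reference = [0.9, 0.1, -0.4, 0.7, 0.2, -0.8]
# Two normalized sub-bands; only the first has too few bits to be coded.
bands = [[-0.5, 0.6, 0.3], [0.1, 0.2, 0.3]]
allocated_bits = [0, 4]

shifts = encode_shifts(bands, allocated_bits, reference, threshold=1)
approx = approximate_band(reference, shifts[0], len(bands[0]))
```

In this sketch only the shift number is transmitted for the first sub-band, and the decoder fills that sub-band from coefficients it has already decoded.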
Abstract:
An adaptive transform coding and decoding arrangement is provided to effectively exploit different redundancies between the bands of a spectrum envelope to effect coding at a low bit rate for an audio signal. In the adaptive transform coding method, the spectrum envelope is divided into bands so that different coding methods may be applied to the spectrum envelopes of the individual bands. By applying the present invention to the adaptive transform coding of an audio signal, the spectrum envelope can be matched to the coding and transmission method which is suitable for the time fluctuation in each frequency band, so that the different redundancies of the individual bands can be effectively exploited to realize a highly efficient audio signal coding and decoding method in which the number of bits required for coding the spectrum envelope is reduced.
Abstract:
A voice coding system for separating and coding voice information into spectrum envelope information and voice source information, with the intention of compressing the amount of information for efficient coding of vocal audio signals through the control of the voice source information based on the fact that the spectrum envelope information and voice source information highly correlate with each other.
Abstract:
A character voice communication system is provided in which a high-efficiency voice coding system, for encoding and transmitting speech information at a high efficiency, and a voice character input/output system, for converting speech information into character information or receiving character information and transmitting speech or character information, are organically integrated. A speech analyzer and a speech synthesizer are shared by both the voice coding and the voice character input/output systems. Communication apparatus is also provided which allows mutual conversion between speech signals and character codes.
Abstract:
In speech decoding, a transmission code, which includes an error correcting code added to a speech code, is received, and whether or not there is a code error is detected on the basis of the error correcting code. When there is no code error, or when the detected code error has been corrected, normal speech decoding processing is executed. On the other hand, when there is a code error that cannot be corrected, an artificial background sound corresponding to the decoded speech is generated from characteristic parameters indicating unvoiced sound in the decoded speech. The parameters are continuously extracted from the decoded speech, stored in a memory, and used to replace an erroneous portion of the speech code.
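The substitution logic in this abstract can be sketched as a simple loop over received frames. The frame layout (a dictionary with `params`, `voiced` and `uncorrectable` keys), the function name `conceal`, and the string placeholders for parameter sets are assumptions for illustration; error detection and actual speech decoding are outside the sketch.

```python
def conceal(frames, initial_background):
    """Pass good frames through and replace uncorrectable frames with the
    most recently stored unvoiced (background) parameters."""
    memory = initial_background  # last stored unvoiced parameters
    output = []
    for frame in frames:
        if frame["uncorrectable"]:
            # The code error could not be corrected: substitute from memory.
            output.append(memory)
        else:
            output.append(frame["params"])
            if not frame["voiced"]:
                # Keep the memory refreshed with unvoiced parameters.
                memory = frame["params"]
    return output


# Hypothetical frame sequence; strings stand in for parameter sets.
frames = [
    {"params": "speech-1", "voiced": True, "uncorrectable": False},
    {"params": "noise-1", "voiced": False, "uncorrectable": False},
    {"params": "garbled", "voiced": True, "uncorrectable": True},
    {"params": "speech-2", "voiced": True, "uncorrectable": False},
]
decoded = conceal(frames, initial_background="silence")
```

The third frame is replaced by the stored unvoiced parameters from the second frame, so the listener hears background sound rather than decoded garbage.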
Abstract:
This speech signal recognition system compares the two-dimensional pattern (time sequence of feature vectors) of an unknown signal to prestored standard reference patterns for recognition, thus forming a corresponding two-dimensional comparison pattern of points of elemental Hamming distance differences. The sum of the pattern point distances is the similarity measure. To improve accuracy, partial patterns are selected (or "masked") and tested sequentially, and the point values are weighted relative to their location within the mask. The mask may be rectangular or oblique.
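A minimal sketch of the comparison pattern and a weighted rectangular mask follows. It assumes binary feature elements, so that the elemental Hamming distance is an exclusive OR; the function names, the weights, and the sample patterns are hypothetical, and the oblique mask is not covered.

```python
def comparison_pattern(unknown, reference):
    """Element-wise Hamming distance (XOR) between two binary patterns.
    Rows are time frames, columns are feature elements."""
    return [[u ^ r for u, r in zip(u_row, r_row)]
            for u_row, r_row in zip(unknown, reference)]


def total_distance(diff):
    """Sum of all point distances: the overall similarity measure."""
    return sum(sum(row) for row in diff)


def masked_distance(diff, weights, top, left):
    """Weighted sum of the points under a rectangular mask whose upper-left
    corner sits at (top, left); weights depend on position in the mask."""
    return sum(w * diff[top + r][left + c]
               for r, row in enumerate(weights)
               for c, w in enumerate(row))


unknown = [[1, 0, 1],
           [0, 1, 1]]
reference = [[1, 1, 1],
             [0, 0, 1]]
weights = [[1, 2],
           [3, 4]]  # hypothetical position-dependent weights

diff = comparison_pattern(unknown, reference)
scores = [masked_distance(diff, weights, 0, left) for left in (0, 1)]
```

Sliding the mask across the comparison pattern gives one score per position, which corresponds to testing the partial patterns sequentially.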
Abstract:
A speech signal is analyzed for each frame so that it is separated into spectral envelope information and excitation information, and the excitation information is expressed by a plurality of pulses. Judgement is conducted as to whether the current frame is a voiced frame immediately after the transition from an unvoiced frame, a voiced frame continuing from a voiced frame, or an unvoiced frame, and excitation pulses are generated in accordance with the judgement result. In the case of a continuing voiced frame, the excitation pulse position of the current voiced frame is determined based on the pitch period with respect to the excitation pulse position of the immediately preceding voiced frame, so that the excitation pulse train is generated at a position close to the determined position.
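For the continuing voiced frame, the position prediction can be sketched as stepping forward from the previous frame's pulse by whole pitch periods. The function name, the sample-index convention (positions counted from the start of each frame), and the numbers are assumptions; the abstract does not specify how the final pulse is placed near the predicted position.

```python
def predicted_pulse_positions(prev_position, pitch_period, frame_length):
    """Predict pulse positions in the current frame by advancing from a
    pulse position in the preceding frame in steps of the pitch period.
    Positions are sample indices relative to the start of their frame."""
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    positions = []
    # First candidate, expressed in the current frame's coordinates.
    pos = prev_position + pitch_period - frame_length
    while pos < frame_length:
        if pos >= 0:
            positions.append(pos)
        pos += pitch_period
    return positions


# Hypothetical values: 160-sample frames, 40-sample pitch period,
# last pulse of the preceding voiced frame at sample 150.
positions = predicted_pulse_positions(150, 40, 160)
```

A real coder would then search a small neighborhood around each predicted position for the pulse that best matches the input.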
Abstract:
A system for voice coding based on vector quantization has an apparatus in which a distribution area of parameters representative of a voice is divided into a plurality of domains so that one vector (code vector) may correspond to one domain, an apparatus for representing individual code vectors by codes specific thereto, an apparatus for converting an input voice into a vector and determining membership functions by numerically expressing the distance between the nearest code vector and each of the predetermined number of neighboring vectors, and an apparatus for transmitting, as fuzzy vector quantization information, a code of the nearest code vector and the membership functions.
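The encoding side of fuzzy vector quantization can be sketched as below. The abstract does not give the membership formula, so this sketch assumes the common inverse-squared-distance normalization (as in fuzzy c-means with fuzziness 2); the function names, the codebook, and the input vector are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_vq_encode(vector, codebook, num_neighbors):
    """Return the nearest code index, the indices of the closest code
    vectors, and one membership value per neighbor (summing to 1)."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: squared_distance(vector, codebook[i]))
    neighbors = ranked[:num_neighbors]
    dists = [squared_distance(vector, codebook[i]) for i in neighbors]
    if dists[0] == 0.0:
        # Input coincides with a code vector: full membership there.
        memberships = [1.0] + [0.0] * (len(neighbors) - 1)
    else:
        inverse = [1.0 / d for d in dists]  # assumed membership rule
        total = sum(inverse)
        memberships = [v / total for v in inverse]
    return neighbors[0], neighbors, memberships


def fuzzy_vq_decode(codebook, neighbors, memberships):
    """Reconstruct the vector as a membership-weighted sum of code vectors."""
    dim = len(codebook[0])
    return [sum(u * codebook[i][k] for i, u in zip(neighbors, memberships))
            for k in range(dim)]


codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
nearest, neighbors, memberships = fuzzy_vq_encode((0.2, 0.1), codebook, 3)
reconstructed = fuzzy_vq_decode(codebook, neighbors, memberships)
```

Compared with plain vector quantization, which would send only the nearest code, the membership values let the receiver interpolate between neighboring code vectors.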
Abstract:
Disclosed herein is a speech analysis-synthesis apparatus which uses a multi-pulse excitation method, with a plurality of modeled pulses as a synthetic sound source, when input speech is analyzed so that speech may be synthesized on the basis of the analysis result. A factor for effecting perceptual weighting is made variable in correspondence with the number of sound source pulses, and the error between the input speech and the synthesized speech is perceptually weighted so that the amplitude and location of the train of sound source pulses are determined so as to minimize said error.
Abstract:
A speech recognition method makes it possible to improve the accuracy of recognition of input speech and is capable of operating on a real time basis. This is accomplished by generating from the input speech signal a difference signal which indicates whether the speech power of the input speech is increasing or decreasing for each frame. The similarity between the input speech and a standard pattern is then calculated for each frame, and this is then followed by correcting the similarity calculation on the basis of the generated difference signal and a difference signal relating to the standard pattern obtained from storage. The matching of the input speech and the standard pattern is then effected by using the corrected similarity, and the input speech is then recognized from the result of this matching. Thus, a spectrum matching distance weighted by power information of speech can be obtained in real time.
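The correction step can be sketched as follows. The abstract does not state the correction rule, so the multiplicative penalty applied when the power trends disagree is purely an assumption, as are the function names and sample values; only the idea of a per-frame rising/falling difference signal comes from the abstract.

```python
def power_trend(powers):
    """Per-frame difference signal: +1 if power rose from the previous
    frame, -1 if it fell, 0 if unchanged (and 0 for the first frame)."""
    trend = [0]
    for prev, cur in zip(powers, powers[1:]):
        trend.append((cur > prev) - (cur < prev))
    return trend


def corrected_distances(distances, input_trend, standard_trend, penalty=0.5):
    """Increase the spectral distance of frames whose power trend differs
    from that of the standard pattern (assumed correction rule)."""
    return [d * (1.0 + penalty) if a != b else d
            for d, a, b in zip(distances, input_trend, standard_trend)]


# Hypothetical frame powers and per-frame spectral distances.
input_trend = power_trend([1.0, 2.0, 2.0, 1.5])
corrected = corrected_distances([1.0, 2.0, 4.0], [0, 1, -1], [0, 1, 1])
```

Because the trend needs only the current and previous frame powers, the correction can be applied frame by frame, which is consistent with the real-time operation the abstract describes.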