Abstract:
An input audio signal is divided into blocks, and each block is converted to the frequency domain. If the voiced (V)/unvoiced (UV) decision data across all bands contain one or more shift points, the voiced bands of a block's frequency-domain data are searched for the voiced band B_VH with the highest center frequency. The number N_V of voiced bands with a center frequency lower than that of B_VH is then found, and the proportion of voiced bands is compared with a predetermined threshold N_th to decide a single V/UV boundary point. The per-band V/UV decision data can thus be replaced by information on one demarcation point across all bands, which reduces data volume and bit rate.
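The boundary decision described above might be sketched as follows. This is only an illustration under stated assumptions: the function name `vuv_boundary`, the default threshold of 0.7, the choice to count the highest voiced band itself in the proportion, and the fallback of treating every band as unvoiced when the proportion is too low are all hypothetical, since the abstract does not specify them.

```python
def vuv_boundary(flags, ratio_threshold=0.7):
    """Collapse per-band V/UV flags (True = voiced, ordered from low to high
    frequency) into one boundary index b: bands [0, b) are treated as voiced
    and bands [b, n) as unvoiced."""
    n = len(flags)
    # A shift point is any place where adjacent bands disagree.
    shifts = sum(1 for a, b in zip(flags, flags[1:]) if a != b)
    if shifts == 0:
        # Uniform decision: everything voiced or everything unvoiced.
        return n if n and flags[0] else 0
    # Highest-frequency voiced band (exists because there is a shift point).
    highest = max(i for i, v in enumerate(flags) if v)
    # Voiced bands strictly below the highest voiced band.
    n_v = sum(1 for v in flags[:highest] if v)
    # Assumption: include the highest band itself so the denominator
    # is never zero.
    ratio = (n_v + 1) / (highest + 1)
    if ratio >= ratio_threshold:
        return highest + 1
    # Assumption: the abstract does not say what happens below the
    # threshold; this placeholder marks all bands as unvoiced.
    return 0
```

With this representation a single integer replaces one flag per band, which is the source of the data reduction the abstract claims.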
Abstract:
A compressed digital speech signal is encoded to provide a transmission error-resistant transmission signal. The compressed speech signal is derived from a digital speech signal by performing a pitch search on a block obtained by dividing the speech signal in time to provide pitch information for the block. The block of the speech signal is orthogonally transformed to provide spectral data, which is divided by frequency into plural bands in response to the pitch information. Voiced/unvoiced sound discrimination generates voiced/unvoiced (V/UV) information indicating whether the spectral data in each of the plural bands represents a voiced or an unvoiced sound. The spectral data in the plural bands are interpolated to provide spectral amplitudes for a predetermined number of bands, independent of the pitch. Hierarchical vector quantizing is applied to the spectral amplitudes to generate upper-layer indices, representing an overview of the spectral amplitudes, and lower-layer indices, representing details of the spectral amplitudes. CRC error detection coding is applied to the upper-layer indices, the pitch information, and the V/UV information to generate CRC codes. Convolution coding for error correction is applied to the upper-layer indices, the higher-order bits of the lower-layer indices, the pitch information, the V/UV information, and the CRC codes. The convolution-coded quantities from two blocks of the speech signal are then interleaved in a frame of the transmission signal, together with the lower-order bits of the respective lower-layer indices.
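Two of the steps above, splitting a lower-layer index into protected higher-order bits and unprotected lower-order bits, and interleaving the coded bits of two blocks, can be illustrated with a short sketch. The function names, the bit widths, and the simple alternating interleave pattern are assumptions made for illustration; the abstract does not give the actual frame layout, and the CRC and convolution coding stages are omitted here.

```python
def split_index(index, total_bits, protected_bits):
    """Split an index into (higher-order bits, lower-order bits).
    Only the higher-order part would go on to error-correction coding."""
    low_bits = total_bits - protected_bits
    return index >> low_bits, index & ((1 << low_bits) - 1)


def interleave(block_a, block_b):
    """Alternate the coded bits of two blocks: a0, b0, a1, b1, ..."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    frame = []
    for a, b in zip(block_a, block_b):
        frame.extend((a, b))
    return frame


def deinterleave(frame):
    """Recover the two blocks from an alternating frame."""
    return frame[0::2], frame[1::2]
```

Spreading the bits of each block across the frame means that a short burst of channel errors damages a few bits of both blocks rather than a long run in one, which is the usual motivation for interleaving before convolutional decoding.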
Abstract:
A high-efficiency encoding method encodes frequency-axis data obtained by dividing an input audio signal into blocks and converting each block onto the frequency axis. If the voiced (V)/unvoiced (UV) decision data of all bands on the frequency axis contain one or more shift points, the V bands are searched for the band B_VH with the highest center frequency, and the number N_V of V bands up to B_VH is found. The proportion of V bands is then compared with a predetermined threshold N_th to decide a single V/UV boundary point. The per-band V/UV decision data can thus be replaced by information on one demarcation point across all bands, reducing data volume and bit rate. In addition, quantizing the frequency-axis data with two-stage hierarchical vector quantization reduces both the computation required for the codebook search and the memory capacity of the codebook.
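The two-stage hierarchical vector quantization mentioned above could look roughly like the following sketch. It assumes one common arrangement, in which the upper layer quantizes a coarse overview formed from group means and the lower layer quantizes the residual of each group; the abstract does not state the exact structure, and all names and codebooks here are hypothetical.

```python
def nearest(codebook, vector):
    """Index of the codeword with the smallest squared error."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], vector)),
    )


def hierarchical_vq(amplitudes, group, upper_cb, lower_cb):
    """Return (upper index, list of lower indices).
    Assumes len(amplitudes) is a multiple of `group` and that each
    upper codeword holds one value per group."""
    groups = [amplitudes[i:i + group] for i in range(0, len(amplitudes), group)]
    # Upper layer: quantize the coarse overview (one mean per group).
    overview = [sum(g) / len(g) for g in groups]
    upper_idx = nearest(upper_cb, overview)
    coarse = upper_cb[upper_idx]
    # Lower layer: quantize what the coarse overview leaves unexplained.
    lower_idx = [
        nearest(lower_cb, [x - m for x in g]) for g, m in zip(groups, coarse)
    ]
    return upper_idx, lower_idx


def reconstruct(upper_idx, lower_idx, upper_cb, lower_cb):
    """Rebuild the amplitudes from both layers of indices."""
    out = []
    for mean, li in zip(upper_cb[upper_idx], lower_idx):
        out.extend(mean + d for d in lower_cb[li])
    return out
```

Each search runs over a small, low-dimensional codebook instead of one large full-dimension codebook, which is where the savings in search effort and memory would come from.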
Abstract:
A speech encoding method and apparatus divide an input speech signal into blocks or frames as encoding units and encode it unit by unit. Plosive and fricative consonants are reproduced accurately, and fewer extraneous sounds are generated at transitions between voiced (V) and unvoiced (UV) portions, so that clear speech without a “stuffed” quality is produced. The encoding apparatus includes a first encoding unit, which finds the linear predictive coding (LPC) residuals of the input speech signal and applies harmonic coding to them, and a second encoding unit, which encodes the input speech signal by waveform coding. The first and second encoding units encode the voiced (V) and unvoiced (UV) portions of the input signal, respectively. The second encoding unit uses code excited linear prediction (CELP) encoding, in which vector quantization is performed by a closed-loop search for the optimum vector using an analysis-by-synthesis method. A corresponding decoding method and apparatus are also provided.
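The closed-loop, analysis-by-synthesis search used in CELP can be sketched in a few lines. This is a simplified illustration, not the encoder described in the abstract: it omits perceptual weighting and filter memory, the function names are hypothetical, and the sign convention (synthesis filter 1/A(z) with A(z) = 1 + sum of lpc[k] z^-(k+1)) is an assumption.

```python
def synthesize(excitation, lpc):
    """All-pole synthesis filter: y[n] = x[n] - sum_k lpc[k] * y[n-1-k]."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def closed_loop_search(target, codebook, lpc):
    """Try every codebook vector, synthesize it, fit the best gain, and
    keep the entry with the smallest squared error against the target.
    Returns (index, gain, error), or None if no entry has energy."""
    best = None
    for idx, code in enumerate(codebook):
        synth = synthesize(code, lpc)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        # Least-squares optimal gain for this candidate.
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        err = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or err < best[2]:
            best = (idx, gain, err)
    return best
```

The search is "closed loop" because each candidate is judged by the error of the synthesized output rather than by how closely the raw codebook vector matches the excitation.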
Abstract:
An audio signal processing method repairs an anomalous state such as noise, a discontinuity, or a break in the sound. The method comprises detecting the anomalous state of an audio signal, deleting the audio signal in the anomalous segment, deducing the correct audio signal from the waveform of the audio signal before and after the deleted segment, generating a repair signal for the deleted segment based on the deduced result, inserting the repair signal into the deleted segment, and connecting it to the audio signal before and after the deleted segment.
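The detect, delete, deduce, and insert sequence above can be illustrated with a deliberately simple sketch. Both the detector (a magnitude limit) and the repair signal (straight-line interpolation between the neighboring samples) are stand-ins chosen for brevity; the abstract does not say how the anomaly is detected or how the correct waveform is deduced, and the function names are hypothetical.

```python
def find_anomaly(samples, limit):
    """Return (start, end) of the first run of samples whose magnitude
    exceeds `limit` (end is exclusive), or None if there is no such run."""
    start = None
    for i, s in enumerate(samples):
        if abs(s) > limit:
            if start is None:
                start = i
        elif start is not None:
            return start, i
    return (start, len(samples)) if start is not None else None


def repair(samples, start, end):
    """Delete samples[start:end] and fill the gap with a patch that joins
    the sample just before the gap to the sample just after it."""
    if start < 1 or end >= len(samples):
        raise ValueError("need at least one good sample on each side")
    left, right = samples[start - 1], samples[end]
    span = end - start + 1
    # Linear interpolation keeps the waveform continuous at both joins.
    patch = [left + (right - left) * (k / span) for k in range(1, span)]
    return samples[:start] + patch + samples[end:]
```

A practical repair would more likely extrapolate the periodic structure of the surrounding waveform and cross-fade at the joins, but the control flow would stay the same.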
Abstract:
An apparatus and a method for encoding a time-base input signal through orthogonal transform remove the correlation of the signal waveform prior to the orthogonal transform, using parameters obtained by linear predictive coding (LPC) analysis and pitch analysis of the input signal. A normalization circuit section removes the correlation of the signal waveform, extracts the residue with an LPC inverse filter and a pitch inverse filter, and sends the residue to an orthogonal transform circuit section. The LPC parameters and the pitch parameters are sent to a bit allocation calculation circuit. A coefficient quantization section quantizes the coefficients from the orthogonal transform circuit section according to the number of bits allocated by the bit allocation calculation circuit.
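The normalization step, which passes the signal through an LPC inverse filter and then a pitch inverse filter to leave a decorrelated residue, might be sketched as below. The function names, the single-tap pitch predictor, and the sign convention A(z) = 1 + sum of lpc[k] z^-(k+1) are assumptions for illustration; the abstract does not give the filter forms.

```python
def lpc_inverse(signal, lpc):
    """Short-term inverse filter: e[n] = s[n] + sum_k lpc[k] * s[n-1-k].
    Removes the correlation between neighboring samples."""
    return [
        s + sum(a * signal[n - 1 - k] for k, a in enumerate(lpc) if n - 1 - k >= 0)
        for n, s in enumerate(signal)
    ]


def pitch_inverse(signal, lag, gain):
    """Long-term inverse filter: r[n] = e[n] - gain * e[n-lag].
    Removes the correlation between successive pitch periods."""
    return [
        s - (gain * signal[n - lag] if n >= lag else 0.0)
        for n, s in enumerate(signal)
    ]
```

Applied in sequence, the two filters flatten both the spectral envelope and the pitch harmonics, so the residue handed to the orthogonal transform has less structure left to encode.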
Abstract:
An apparatus and a method for encoding a time-base input signal through orthogonal transform remove the correlation of the signal waveform prior to the orthogonal transform, using parameters obtained by linear predictive coding (LPC) analysis and pitch analysis of the input signal. The time-base input signal from an input terminal is sent to a normalization circuit section and an LPC analysis circuit. The normalization circuit section removes the correlation of the signal waveform, extracts the residue with an LPC inverse filter and a pitch inverse filter, and sends the residue to an orthogonal transform circuit section. The LPC parameters from the LPC analysis circuit and the pitch parameters from the pitch analysis circuit are sent to a bit allocation calculation circuit. A coefficient quantization section quantizes the coefficients from the orthogonal transform circuit section according to the number of bits allocated by the bit allocation calculation circuit.
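The final step, quantizing each transform coefficient with the number of bits allocated to it, can be illustrated as follows. The sketch takes the per-coefficient bit counts as given, because the abstract does not say how they are derived from the LPC and pitch parameters. The uniform quantizer, the `peak` range parameter, and the function names are likewise assumptions.

```python
def quantize(coeffs, bits, peak):
    """Uniformly quantize each coefficient over [-peak, peak] using its
    own bit count. A coefficient with 0 bits is not transmitted."""
    codes = []
    for c, b in zip(coeffs, bits):
        if b == 0:
            codes.append(0)
            continue
        levels = (1 << b) - 1
        clipped = min(max(c, -peak), peak)
        codes.append(round((clipped + peak) / (2 * peak) * levels))
    return codes


def dequantize(codes, bits, peak):
    """Map codes back to coefficient values; 0-bit entries decode to 0."""
    out = []
    for q, b in zip(codes, bits):
        if b == 0:
            out.append(0.0)
            continue
        levels = (1 << b) - 1
        out.append(q / levels * 2 * peak - peak)
    return out
```

Because the decoder can recompute the same allocation from the transmitted LPC and pitch parameters, a scheme of this kind would not need to send the bit counts themselves.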