摘要:
A speech encoding method and apparatus including analyzing, using a codebook expressing speech parameters within a predetermined search range, an input speech signal in an audibility weighting filter corresponding to a pitch period longer than the search range of the codebook, and searching, from the codebook, on the basis of the analysis result, a combination of speech parameters by which the distortion of the input speech signal is minimized, and encoding the combination. The apparatus uses an adaptive codebook of pitch and a noise codebook. The codebooks search a group formed by extracting vectors of predetermined length from one original code vector, while sequentially shifting position so that the vectors overlap each other. The search group is further restricted and another preselection is made before the final search. Search is based on inversely convoluted, orthogonally transformed vectors.
摘要:
A speech encoding method, apparatus and program wherein an input speech signal is divided into a plurality of frames each having a predetermined length, each of the frames is subdivided into a plurality of subframes, a predictive pitch period of a subframe in a to-be-encoded current frame is obtained by using pitch periods of at least two frames of the current frame and past and future frames with respect to the current frame; a pitch period of a subframe in the current frame is obtained by using the predictive pitch period, a relative pitch pattern codebook storing a plurality of relative pitch patterns representing fluctuations in pitch periods of a plurality of subframes is prepared, and a change in pitch period of plural subframes is expressed with one relative pitch pattern selected from the relative pitch pattern codebook.
摘要:
A method for encoding speech wherein an input speech signal is separated by a component separator into a first component mainly constituted by speech and a second component mainly constituted by a background noise at each predetermined unit of time, a bit allocation selector selects bit allocation for each component based on the first and second components from among a plurality of predetermined candidates for bit allocation, a speech encoder and a noise encoder encode the first and second components from the component separator based on the bit allocation according to predetermined different methods for encoding, and a multiplexer multiplexes encoded data of the first and second components and information on the bit allocation and outputs them as transmitted encoded data.
摘要:
In a background noise/speech classification method, whether a digital input signal input through an input terminal is background noise or speech is decided by a background noise/speech decision section on the basis of calculated frame power and a calculated LSP coefficient which are obtained by supplying the input signal to a feature amount calculation section and estimated frame power and an estimated LSP coefficient obtained by an estimated feature amount update section. Thereafter, the estimated feature amount update section updates the estimated frame power and the estimated LSP coefficient by using the frame power and the LSP coefficient obtained by the feature amount calculation section to prepare for the next frame.
摘要:
Adjusting the shape of a spectrum of a speech signal includes the steps of using a first filter with pole-zero transfer function A(z)/B(z) for subjecting a speech signal to a spectrum envelop emphasis and a second filter cascade-connected with the first filter, for compensating for a spectral tilt due to the first filter, independently deriving two filter coefficients used in the second filter for compensating for the spectral tilt from the pole-zero transfer function, and compensating for the spectral tilt corresponding to the pole-zero transfer function according to the derived filter coefficients.
摘要:
A vector quantizing apparatus includes a first search section for obtaining an approximate vector X1 which is approximated to a desired vector R, a residual vector calculator for calculating a residual vector Rv from the desired vector R and the approximate vector X1, a weighting section for obtaining weighted vectors X2 to XN of code vectors x2 to xN, and a second search section for calculating an estimation value which is the magnitude of a projection vector of the residual vector Rv with respect to the vector space formed by the approximate vector X1 and the weighted vectors X2 to XN, and searching a code vector which maximizes this estimation value.
摘要:
A speech encoding method, apparatus and program wherein an input speech signal is divided into a plurality of frames each having a predetermined length, each of the frames is subdivided into a plurality of subframes, a predictive pitch period of a subframe in a to-be-encoded current frame is obtained by using pitch periods of at least two frames of the current frame and past and future frames with respect to the current frame; a pitch period of a subframe in the current frame is obtained by using the predictive pitch period, a relative pitch pattern codebook storing a plurality of relative pitch patterns representing fluctuations in pitch periods of a plurality of subframes is prepared, and a change in pitch period of plural subframes is expressed with one relative pitch pattern selected from the relative pitch pattern codebook.
摘要:
A method for encoding speech wherein an input speech signal is separated by a component separator into a first component mainly constituted by speech and a second component mainly constituted by a background noise at each predetermined unit of time, a bit allocation selector selects bit allocation for each component based on the first and second components from among a plurality of predetermined candidates for bit allocation, a speech encoder and a noise encoder encode the first and second components from the component separator based on the bit allocation according to predetermined different methods for encoding, and a multiplexer multiplexes encoded data of the first and second components and information on the bit allocation and outputs them as transmitted encoded data.
摘要:
In a formant emphasis method of emphasizing the formant as the spectral peak of an input speech signal and attenuating the spectral valley of the input speech signal, a spectrum emphasis filter performs processing for emphasizing the formant of the input speech signal and attenuating the valley of the input speech signal. A first-order variable characteristic filter whose characteristic adaptively changes in accordance with the characteristic of the input speech signal and a first-order fixed characteristic filter compensate a spectral tilt included in an output signal from the spectrum emphasis filter.
摘要:
The coding apparatus comprises an adaptive codebook storing excitation signals as vectors, a synthesis filter for forming a synthesis signal, referring to the vectors stored in the adaptive codebook, a similarity computation circuit for computing a similarity between the synthesis signal obtained by the synthesis filter and a target signal, and a coding scheme determining circuit for deciding one coding scheme from a plurality of coding schemes respectively having coding bit rates different from each other, on the basis of the similarity obtained by the similarity computation circuit.