Abstract:
Speech coding is optimized in response to the input speech. An adaptive codebook (Ba) stores signal vector sequences of past speech signals. Vector extracting means (11) extracts a signal vector and the neighborhood vectors stored near that signal vector. A high-order long-term prediction synthesis filter (12) generates a long-term prediction speech signal (Sna-1) from the signal vector and the neighborhood vectors. Filter coefficient calculating means (13) calculates the filter coefficients of the long-term prediction synthesis filter (12). A perceptual weighting synthesis filter (14) generates a reproduced coded speech signal (Sna) from the long-term prediction speech signal (Sna-1). Error calculating means (15) calculates the error (En) between the speech signal (Sn) and the reproduced coded speech signal (Sna). Minimum error detecting means (16) detects the minimum error among the calculated errors. Optimal value transmitting means (17) transmits, as the optimal values, the filter coefficient (βa) and the delay (La) at which the minimum error is detected.
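The search loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the filter is assumed to have three taps (the signal vector plus one neighbor on each side), the weighting synthesis filter (14) is treated as an identity, candidate delays are assumed to be longer than the frame, and the names `solve_linear` and `search_adaptive_codebook` are hypothetical.

```python
def solve_linear(a, b):
    """Solve the small system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # singular system: this candidate is skipped
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def search_adaptive_codebook(past, target, min_lag, max_lag):
    """Return (error, lag, coefficients) for the delay with the smallest error.

    past   : past signal samples (the adaptive codebook contents)
    target : current frame to be matched
    Assumption: min_lag - 1 >= len(target), so no periodic extension is needed.
    """
    n = len(target)
    if min_lag - 1 < n or max_lag + 1 > len(past):
        raise ValueError("lag range does not fit the codebook")
    best = None
    for lag in range(min_lag, max_lag + 1):
        # Signal vector at this delay plus its two neighborhood vectors.
        vectors = []
        for d in (lag - 1, lag, lag + 1):
            start = len(past) - d
            vectors.append(past[start:start + n])
        # Least-squares filter coefficients from the normal equations.
        gram = [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]
        rhs = [sum(x * y for x, y in zip(u, target)) for u in vectors]
        coeffs = solve_linear(gram, rhs)
        if coeffs is None:
            continue
        predicted = [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]
        error = sum((t - p) ** 2 for t, p in zip(target, predicted))
        if best is None or error < best[0]:
            best = (error, lag, coeffs)
    return best
```

The returned delay and coefficients play the role of the optimal values (La) and (βa); a real coder would pass the prediction through the weighting synthesis filter before computing the error.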
Abstract:
Provided are an audio encoding apparatus and an audio encoding method that reduce the processing amount and select a block length appropriately. A power calculation part (402) calculates a power variation ratio from an input signal. A predicted gain variation ratio calculation part (406) calculates a predicted gain variation ratio from the input signal. A block length determination part (407) determines, from the power variation ratio and the predicted gain variation ratio, whether to encode using a long block or a short block. Based on this determination, a long block MDCT conversion part (409) or a short block MDCT conversion part (410) performs a modified discrete cosine transform (MDCT) on the input signal.
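The block length decision can be illustrated with a short sketch. The abstract does not give the decision rule, so everything here is an assumption: the predicted gain is interpreted as a first-order linear prediction gain, both ratios are measured between the previous and the current frame, and a short block is chosen when either ratio exceeds a hypothetical limit of 4. All function names are illustrative.

```python
import math


def sine_frame(cycles, n=256, amplitude=1.0):
    """Demo helper: a frame holding a whole number of sine cycles."""
    return [amplitude * math.sin(2.0 * math.pi * cycles * i / n) for i in range(n)]


def frame_power(frame):
    """Mean power of one frame."""
    return sum(x * x for x in frame) / len(frame)


def prediction_gain(frame):
    """First-order linear prediction gain (signal energy / residual energy)."""
    r0 = sum(x * x for x in frame)
    r1 = sum(frame[i] * frame[i - 1] for i in range(1, len(frame)))
    if r0 == 0.0:
        return 1.0
    k = r1 / r0
    return 1.0 / max(1.0 - k * k, 1e-9)


def variation(current, previous, eps=1e-12):
    """Symmetric variation ratio (always >= 1) between two positive values."""
    ratio = max(current, eps) / max(previous, eps)
    return max(ratio, 1.0 / ratio)


def choose_block(prev_frame, cur_frame, power_limit=4.0, gain_limit=4.0):
    """Return "short" when either ratio signals a transient, else "long".

    The limits are hypothetical tuning values, not taken from the source.
    """
    power_change = variation(frame_power(cur_frame), frame_power(prev_frame))
    gain_change = variation(prediction_gain(cur_frame), prediction_gain(prev_frame))
    if power_change > power_limit or gain_change > gain_limit:
        return "short"
    return "long"
```

With these assumptions, two identical frames give `"long"`, while a quiet frame followed by a loud one, or a change in spectral shape at constant power, gives `"short"`.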
Abstract:
[Object] To provide high-quality sound while suppressing the calculation amount. [Solution Means] A sound processing device includes a first calculation unit configured to calculate a suppression gain of noise by using respective input signals input from a plurality of microphones; an integration unit configured to obtain an integration gain by using a suppression gain of an acoustic echo and the suppression gain of the noise; an application unit configured to apply the integration gain to one input signal among the plurality of input signals; and a second calculation unit configured to calculate the suppression gain of the acoustic echo by using signals to which the integration gain is applied, output signals that are output to a replay device, and the one input signal.
Abstract:
Disclosed is a voice encoding method in which an input signal is divided into frames of a fixed length and subjected to linear prediction analysis frame by frame, a synthesis filter is implemented using the resulting linear prediction coefficients, a reproduced signal is generated by driving the synthesis filter with a periodicity signal output from an adaptive codebook and a pulsed signal output from an algebraic codebook, and encoding is performed so that the error between the input signal and the reproduced signal is minimized. The method provides an encoding mode 1 that uses a pitch lag obtained from the input signal of the present frame and an encoding mode 2 that uses a pitch lag obtained from the input signal of a past frame. Encoding is performed in both encoding mode 1 and encoding mode 2, the mode that encodes the input signal more precisely is decided frame by frame, and encoding is carried out in the decided mode.
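The frame-by-frame mode decision amounts to encoding in every mode and keeping the closest result. The sketch below assumes that each mode is available as a callable that returns the reproduced signal for a frame, and that "more precisely" means a smaller squared error; `encode_frame` and the `encoders` mapping are hypothetical names, and the codebook searches themselves are not modeled.

```python
def squared_error(reference, reproduced):
    """Sum of squared differences between the input and a reproduced signal."""
    return sum((r - p) ** 2 for r, p in zip(reference, reproduced))


def encode_frame(frame, encoders):
    """Encode one frame in every mode and keep the most precise result.

    encoders : dict mapping a mode number to a callable(frame) that returns
               the reproduced signal for that mode (assumed interface).
    Returns (mode, reproduced_signal, error).
    """
    best_mode, best_signal, best_error = None, None, float("inf")
    for mode, encode in sorted(encoders.items()):
        reproduced = encode(frame)
        error = squared_error(frame, reproduced)
        if error < best_error:
            best_mode, best_signal, best_error = mode, reproduced, error
    return best_mode, best_signal, best_error
```

Because the decision is repeated for each frame, a decoder would also need to be told which mode was used; how that is signaled is outside this sketch.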
Abstract:
A voice intensifier reduces abrupt changes in the amplification factor between frames and achieves excellent sound quality with little perceived noise by separating the input voice into a sound source characteristic and a vocal tract characteristic, intensifying the two characteristics individually, and synthesizing them before output. The voice intensifier comprises a signal separation unit for separating the input sound signal into the sound source characteristic and the vocal tract characteristic, a characteristic extraction unit for extracting characteristic information from the vocal tract characteristic, a corrective vocal tract characteristic calculation unit for obtaining vocal tract characteristic correction information from the vocal tract characteristic and the characteristic information, a vocal tract characteristic correction unit for correcting the vocal tract characteristic by using the vocal tract characteristic correction information, and signal synthesizing means for synthesizing the corrected vocal tract characteristic from the vocal tract characteristic correction unit with the sound source characteristic, so that the sound synthesized by the signal synthesizing means is output.
Abstract:
Disclosed is a decoding apparatus for decoding coded data obtained by encoding each of a scale value and a spectrum value of frequency domain audio signal data to output an audio signal. The decoding apparatus includes a unit configured to decode and inversely quantize the coded data to obtain the frequency domain audio signal data, a unit configured to compute from the coded data one of the number of scale bits composed of the number of bits corresponding to the scale value of the coded data and the number of spectrum bits composed of the number of bits corresponding to the spectrum value of the coded data, a unit configured to estimate a quantization error of the frequency domain audio signal data based on one of the number of scale bits and the number of spectrum bits of the coded data, a unit configured to compute a correction amount based on the estimated quantization error and correct the frequency domain audio signal data obtained by the frequency domain data obtaining unit based on the computed correction amount, and a unit configured to convert the corrected frequency domain audio signal data into the audio signal.
Abstract:
Audio quality degradation caused by pre-echo and bit shortage is reduced. An acoustic analysis unit (11) analyzes an audio signal and acquires a perceptual entropy as a parameter expressing the number of bits required for quantization. An encoding bit quantity monitoring unit (12) monitors the number of encoded bits when the audio signal is encoded and acquires a surplus number of bits, that is, the number of bits that can be used in the current frame. According to the combination of the perceptual entropy and the surplus number of bits, a frame division quantity decision unit (13) decides the division quantity N by which one frame of the audio signal is divided into N blocks. An orthogonal conversion unit (14) divides the frame by the decided division quantity and performs orthogonal conversion of the audio signal in units of the divided block length to obtain orthogonal conversion coefficients. A quantization unit (15) quantizes the orthogonal conversion coefficients in units of the block length.
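One way to combine the two inputs of the frame division decision is sketched below. The abstract does not state the actual rule, so this is only an assumed policy: a higher entropy requests more divisions (1, 2, 4 or 8), and the request is halved until the bits available in the current frame cover an assumed per-block overhead. The thresholds, the overhead figure, and the function names are all hypothetical.

```python
def decide_division(perceptual_entropy, available_bits,
                    entropy_thresholds=(1000, 2000, 3000),
                    bits_per_extra_block=200):
    """Return the number of blocks N into which one frame is divided.

    Assumed policy: each threshold the entropy exceeds doubles the requested
    division count, and the count is halved while the available bits cannot
    pay for the extra blocks. All numeric defaults are illustrative.
    """
    wanted = 1
    for threshold in entropy_thresholds:
        if perceptual_entropy > threshold:
            wanted *= 2
    n = wanted
    while n > 1 and available_bits < (n - 1) * bits_per_extra_block:
        n //= 2
    return n


def split_frame(frame, n):
    """Divide one frame into n equal blocks (len(frame) must be a multiple of n)."""
    size = len(frame) // n
    return [frame[i * size:(i + 1) * size] for i in range(n)]
```

Each block returned by `split_frame` would then be transformed and quantized on its own block length, which is what limits the spread of pre-echo within the frame.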