Abstract:
To retrieve AV data efficiently using signal characteristics as retrieval conditions, in a first step a comparison and determination section computes a correlation coefficient (degree of similarity) between a spectrum coefficient of coded audio data and a spectrum coefficient of a sample waveform, extracts the pieces of audio data whose computed correlation coefficient is larger than a threshold value set in the first step, and treats them as retrieval results. In a second step, the comparison and determination section determines whether or not the retrieval result is satisfactory. When it is determined that the number of pieces of audio data retrieved in the first step is equal to or greater than a predetermined threshold value and the retrieval result is therefore not satisfactory, the process proceeds to a third step. In the third step, the comparison and determination section determines whether or not the number of frequency bands of the sample waveform, which is the retrieval condition, is less than its maximum value. When it is determined that the number of frequency bands is less than its maximum value, in a fourth step, the number of frequency bands of the sample waveform is incremented by 1, and the process returns to the first step.
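The four-step loop described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the use of cosine similarity as the "correlation coefficient", the per-band spectrum lists, and the rule that a result is satisfactory when fewer than `max_hits` items are returned are all assumptions made for the example.

```python
import math

def similarity(a, b):
    """Normalized correlation (cosine similarity) of two equal-length sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def retrieve(database, sample, corr_threshold, max_hits, start_bands=1):
    """Refine the search by comparing more frequency bands until few enough hits remain.

    database: hypothetical mapping of item name -> list of per-band spectrum values.
    sample:   per-band spectrum values of the sample waveform.
    """
    max_bands = len(sample)
    bands = start_bands
    while True:
        # Step 1: keep items whose similarity over the first `bands` bands exceeds the threshold.
        hits = [name for name, spectrum in database.items()
                if similarity(spectrum[:bands], sample[:bands]) > corr_threshold]
        # Step 2: assumed criterion, satisfactory when fewer than max_hits items remain.
        if len(hits) < max_hits:
            return hits, bands
        # Step 3: stop if no more bands can be added.
        if bands >= max_bands:
            return hits, bands
        # Step 4: compare one more frequency band and search again.
        bands += 1
```

Starting with few bands keeps each comparison cheap, and bands are added only while the result set is still too large.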
Abstract:
The present invention relates to a decoding apparatus, a decoding method, an encoding apparatus, an encoding method, and programs that can shorten the delay time caused by band extension at the time of decoding and restrain increases in resources on the decoding side. A higher frequency component generating unit (73) generates a pseudo higher frequency spectrum by using a lower frequency spectrum (SP-L) and a higher frequency envelope (ENV-H). A phase randomizing unit (74) randomizes the phase of the pseudo higher frequency spectrum, based on a random flag (RND). An inverse MDCT unit (75) denormalizes the lower frequency spectrum (SP-L) by using a lower frequency envelope (ENV-L), and combines the pseudo higher frequency spectrum supplied from the phase randomizing unit (74) with the denormalized lower frequency spectrum (SP-L). The combination result is used as the spectrum of the entire band. The present invention can be applied to a decoding apparatus that performs band extension decoding, for example.
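A rough sketch of the decoding-side data flow described above, under several assumptions that are not stated in the abstract: the envelopes are given per coefficient, the pseudo higher frequency spectrum is made by copying lower-band coefficients upward, and phase randomization of real-valued MDCT coefficients is approximated by random sign flips. The function names are hypothetical, and the final inverse MDCT is omitted.

```python
import random

def pseudo_high_spectrum(sp_low, env_high, rnd_flag, seed=0):
    """Build a pseudo higher frequency spectrum from SP-L and ENV-H."""
    rng = random.Random(seed)
    # Assumed patching rule: reuse lower-band coefficients (wrapping if needed),
    # scaled by the higher frequency envelope.
    pseudo = [sp_low[i % len(sp_low)] * env_high[i] for i in range(len(env_high))]
    if rnd_flag:
        # Assumption: for real coefficients, randomizing the phase is a random sign flip.
        pseudo = [c if rng.random() < 0.5 else -c for c in pseudo]
    return pseudo

def full_band_spectrum(sp_low, env_low, env_high, rnd_flag):
    """Denormalize SP-L with ENV-L and append the pseudo higher frequency spectrum."""
    low = [c * e for c, e in zip(sp_low, env_low)]
    return low + pseudo_high_spectrum(sp_low, env_high, rnd_flag)
```

The combined list stands for the spectrum of the entire band that would then be passed to the inverse transform.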
Abstract:
An information coding apparatus includes a predictive signal generator that generates a predictive signal; a predictive residual signal generator that generates a predictive residual signal; a quantizer that quantizes a quantization input signal generated based on the predictive residual signal; a quantization error signal generator that generates a quantization error signal; a feedback signal generator that generates, based on the quantization error signal, a feedback signal for controlling the frequency characteristic of the quantization noise after decoding; and a quantization input signal generator that generates the quantization input signal. The feedback signal generator is configured as a pole-zero filter that includes a filter coefficient of an all-pole filter based on spectral envelope information estimated from the input audio signal, a parameter for adjusting a peak level in the frequency characteristic of the quantization noise caused by the all-pole filter, and the predictive filter coefficient.
Abstract:
A decoding device includes an acquisition unit configured to acquire a first frequency signal including a narrowband signal and a wideband signal, a direct inverse orthogonal transform unit configured to perform a direct matrix operation with respect to the narrowband signal of the first frequency signal so as to perform inverse orthogonal transform, and a high-speed inverse orthogonal transform unit configured to perform inverse orthogonal transform employing a high-speed operation method with respect to the wideband signal of the first frequency signal.
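The split between a direct matrix operation and a high-speed method can be illustrated as below. This is only a sketch: the abstract does not name the transform, so an inverse DFT stands in for it, a radix-2 recursion stands in for the "high-speed operation method", and choosing the path by signal length (`direct_max_len`) is an assumption made for the example.

```python
import cmath

def idft_direct(X):
    """Inverse DFT as a direct matrix-vector product, O(n^2)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def _ifft_unscaled(X):
    """Radix-2 decimation-in-time recursion for the inverse DFT, without the 1/n scale."""
    n = len(X)
    if n == 1:
        return list(X)
    even = _ifft_unscaled(X[0::2])
    odd = _ifft_unscaled(X[1::2])
    out = [0j] * n
    for t in range(n // 2):
        tw = cmath.exp(2j * cmath.pi * t / n) * odd[t]
        out[t] = even[t] + tw
        out[t + n // 2] = even[t] - tw
    return out

def idft_fast(X):
    """Fast inverse DFT, O(n log n); the length must be a power of two."""
    n = len(X)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    return [v / n for v in _ifft_unscaled(X)]

def inverse_transform(X, direct_max_len=8):
    """Assumed dispatch: short (narrowband) signals use the direct matrix form."""
    return idft_direct(X) if len(X) <= direct_max_len else idft_fast(X)
```

For short inputs the direct product avoids the setup and recursion overhead of the fast algorithm, which is one plausible reason to treat the narrowband part separately.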
Abstract:
A speaker of encoded speech data recorded in a semiconductor storage device in an IC recorder is to be retrieved easily. An information receiving unit 10 in a speaker retrieval apparatus 1 reads out the encoded speech data recorded in a semiconductor storage device 107 in an IC recorder 100. A speech decoding unit 12 decodes the encoded speech data. A speaker frequency detection unit 13 discriminates the speaker based on a feature of the speech waveform decoded to find the frequency of conversation (frequency of occurrence) of the speaker in a preset time interval. A speaker frequency graph displaying unit 14 displays the speaker frequency on a picture as a two-dimensional graph having time and the frequency as two axes. A speech reproducing unit 16 reads out the portion of the encoded speech data corresponding to a time position or a time range specified by a reproducing position input unit 15 based on this two-dimensional graph from the storage device 11 and decodes the read-out data to output the decoded data to a speech outputting unit 17.
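The speaker frequency that is plotted against time could be computed along these lines. The sketch assumes that speaker discrimination has already produced one label per decoded frame; the function name and the fixed interval length in frames are hypothetical.

```python
from collections import Counter

def speaker_frequency(frame_speakers, frames_per_interval):
    """Return, for each time interval, the share of frames attributed to each speaker."""
    table = []
    for start in range(0, len(frame_speakers), frames_per_interval):
        window = frame_speakers[start:start + frames_per_interval]
        counts = Counter(window)
        table.append({speaker: n / len(window) for speaker, n in counts.items()})
    return table
```

Each entry of the returned list corresponds to one position on the time axis of the two-dimensional graph, and the values give the frequency axis for each speaker.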
Abstract:
A process of identifying a speaker in coded speech data and a process of searching for the speaker are efficiently performed with fewer computations and with a smaller storage capacity. In an information search apparatus, an LSP decoding section extracts and decodes only LSP information from coded speech data which is read for each block. An LPC conversion section converts the LSP information into LPC information. A Cepstrum conversion section converts the obtained LPC information into an LPC Cepstrum which represents features of speech. A vector quantization section performs vector quantization on the LPC Cepstrum. A speaker identification section identifies a speaker on the basis of the result of the vector quantization. Furthermore, the identified speaker is compared with a search condition in a condition comparison section, and based on the result, the search result is output.
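The later stages of this pipeline can be sketched as follows, starting from LPC coefficients (the LSP-to-LPC conversion is omitted). The cepstrum recursion is the standard one for an all-pole model 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The per-speaker codebooks and the rule of choosing the speaker with the smallest average quantization distortion are assumptions for illustration.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients [a_1..a_p] into LPC cepstrum coefficients c_1..c_n_ceps."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Sum over k of k * c_k * a_(n-k), restricted to valid indices of a.
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        c_n = -acc / n
        if n <= p:
            c_n -= a[n - 1]
        c.append(c_n)
    return c

def vq_distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest code vector."""
    total = 0.0
    for v in vectors:
        total += min(sum((x - y) ** 2 for x, y in zip(v, code)) for code in codebook)
    return total / len(vectors)

def identify_speaker(lpc_frames, codebooks, n_ceps=4):
    """Pick the speaker whose (hypothetical) codebook quantizes the features best."""
    feats = [lpc_to_cepstrum(a, n_ceps) for a in lpc_frames]
    return min(codebooks, key=lambda name: vq_distortion(feats, codebooks[name]))
```

Because only the spectral-envelope parameters are decoded, no waveform synthesis is needed before the features are computed, which is where the savings in computation come from.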
Abstract:
A decoding device includes a decoding unit which decodes encoded data; an inverse orthogonal transformation unit which performs inverse orthogonal transformation for the encoded data and obtains a time series waveform element in units of blocks; a correlation calculation unit which obtains a correlation between the time series waveform element of the block immediately before an error block, which is a block in which an error has occurred during decoding by the decoding unit, and the time series waveform element of a block a predetermined number of blocks before that block; a cycle calculation unit which obtains a basic cycle, in block units, of the error block based on the correlation obtained by the correlation calculation unit; and a generation unit which generates a substitute signal for the time series waveform element of the error block.
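A simplified sketch of this kind of concealment is shown below. It works on samples rather than on whole blocks, searches a range of candidate lags instead of a fixed block offset, and fills the error block by repeating the most recent cycle; these simplifications, and all names, are assumptions made for the example.

```python
import math

def estimate_period(history, block_len, min_lag, max_lag):
    """Find the lag at which the last block correlates best with earlier samples."""
    last = history[-block_len:]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        earlier = history[-block_len - lag:-lag]
        if len(earlier) < block_len:
            break  # not enough decoded history for this lag
        num = sum(x * y for x, y in zip(last, earlier))
        den = math.sqrt(sum(x * x for x in last) * sum(y * y for y in earlier))
        score = num / den if den else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def conceal(history, block_len, min_lag, max_lag):
    """Generate a substitute block by repeating the last estimated cycle."""
    period = estimate_period(history, block_len, min_lag, max_lag)
    cycle = history[-period:]
    return [cycle[i % period] for i in range(block_len)]
```

Repeating a whole cycle keeps the substitute signal in phase with the waveform that preceded the error block.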
Abstract:
A method and apparatus for encoding audio data and a method and apparatus for decoding audio data are provided, which can generate and decode, respectively, scalable lossless streams and which can shorten the time necessary to generate and decode lossless streams. A lossy-core encoder unit performs lossy compression on an input audio signal, generating a core stream. A simplified lossy-core decoding unit decodes only spectral signals of a specified band, e.g., a lower frequency band, to generate a lossy decoded audio signal. A subtracter subtracts the lossy decoded audio signal from the delayed input audio signal to generate a residual signal. A rounding-off unit performs a process of rounding off the number of bits constituting the residual signal by eliminating the residual sign bit without loss of precision. A lossless-enhance encoder unit performs lossless compression on the residual signal to generate an enhanced stream. A stream-combining unit combines the core stream and the enhanced stream to generate a scalable lossless stream.
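The core-plus-residual structure can be illustrated with a toy example. Here a coarse scalar quantizer stands in for the lossy core codec, and `zlib` stands in for the lossless-enhance encoder; both are assumptions, as are the function names and the step size. The band-limited simplified decoding and the rounding-off unit are not modeled.

```python
import struct
import zlib

STEP = 16  # assumed quantizer step of the stand-in lossy core

def lossy_encode(pcm):
    """Stand-in lossy core: coarse scalar quantization of integer samples."""
    return [round(s / STEP) for s in pcm]

def lossy_decode(core):
    return [q * STEP for q in core]

def encode(pcm):
    """Return (core stream, enhanced stream) for a list of 16-bit integer samples."""
    core = lossy_encode(pcm)
    # The residual is the difference between the input and the locally decoded core.
    residual = [s - d for s, d in zip(pcm, lossy_decode(core))]
    enhance = zlib.compress(struct.pack(f"<{len(residual)}h", *residual))
    return core, enhance

def decode(core, enhance=None):
    """Decode lossy output from the core alone, or lossless output with the enhanced stream."""
    decoded = lossy_decode(core)
    if enhance is None:
        return decoded
    raw = zlib.decompress(enhance)
    residual = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [d + r for d, r in zip(decoded, residual)]
```

A decoder that receives only the core stream still produces usable audio, while one that also receives the enhanced stream recovers the input exactly, which is what makes the stream scalable.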
Abstract:
The present invention relates to an encoding device and an encoding method, a decoding device and a decoding method, and a program that reduce deterioration of sound quality due to encoding of audio signals. An envelope emphasis part (51) emphasizes an envelope (ENV). A noise shaping part (52) divides an emphasized envelope (D), formed by emphasis of the envelope (ENV), by a value larger than 1, and subtracts noise shaping (G) specified by information (NS) from the result of the division. A quantization part (14) sets the result of the subtraction as a quantization bit count (WL), and quantizes a normalized spectrum (S1), formed by normalization of a spectrum (S0), based on the quantization bit count (WL). A multiplexing part (53) multiplexes the information (NS), a quantized spectrum (QS) formed by quantization of the normalized spectrum (S1), and the envelope (ENV). The present invention can be applied to an encoding device encoding audio signals, for example.
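The bit-count rule described above, WL = D / divisor - G, can be sketched as follows. The integer division, the clamping range, the default divisor of 2, and the symmetric quantizer are assumptions for illustration; the envelope emphasis step itself is not modeled, so the emphasized envelope D is taken as an input.

```python
def quantization_bits(emphasized_env, noise_shaping, divisor=2, max_bits=16):
    """WL = D / divisor - G per band, clamped to the range [0, max_bits]."""
    return [max(0, min(max_bits, d // divisor - g))
            for d, g in zip(emphasized_env, noise_shaping)]

def quantize(normalized, bits):
    """Quantize normalized values in [-1, 1] with WL bits each (assumed symmetric quantizer)."""
    out = []
    for s, wl in zip(normalized, bits):
        if wl == 0:
            out.append(0)  # no bits allocated to this band
            continue
        levels = (1 << (wl - 1)) - 1  # largest magnitude representable with wl signed bits
        out.append(round(s * levels))
    return out
```

Dividing the emphasized envelope by a value larger than 1 compresses the differences in bit count between loud and quiet bands, and the subtracted term G lowers the overall allocation so that the bit budget can be met.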