Abstract:
A method and system for encoding and decoding an input signal, wherein the input signal is divided into a higher frequency band and a lower frequency band in the encoding and decoding processes, and wherein the decoding of the higher frequency band is carried out by using an artificial signal along with speech related parameters obtained from the lower frequency band. In particular, the artificial signal is scaled before it is transformed into an artificial wideband signal containing colored noise in both the lower and the higher frequency band. Additionally, voice activity information is used to define speech periods and non-speech periods of the input signal. Based on the voice activity information, different weighting factors are used to scale the artificial signal in speech periods and non-speech periods.
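The voice-activity-dependent scaling described above can be sketched as follows. This is a minimal illustration, not the patented method: the function name, the `lower_band_gain` parameter, and the two weighting values are assumptions chosen only to show that speech and non-speech frames receive different weighting factors.

```python
import random

def scale_artificial_signal(noise, lower_band_gain, is_speech,
                            speech_weight=0.8, nonspeech_weight=0.4):
    """Scale one frame of artificial noise with a VAD-dependent weight.

    speech_weight and nonspeech_weight are hypothetical values; the
    abstract only states that the two factors differ.
    """
    weight = speech_weight if is_speech else nonspeech_weight
    return [weight * lower_band_gain * x for x in noise]

# Illustrative frame of random noise standing in for the artificial signal.
rng = random.Random(0)
frame = [rng.uniform(-1.0, 1.0) for _ in range(8)]
speech_frame = scale_artificial_signal(frame, 2.0, is_speech=True)
noise_frame = scale_artificial_signal(frame, 2.0, is_speech=False)
```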
Abstract:
A method and system for providing comfort noise in the non-speech periods in speech communication. The comfort noise is generated based on whether the background noise in the speech input is stationary or non-stationary. If the background noise is non-stationary, a random component is inserted in the comfort noise using a dithering process. If the background noise is stationary, the dithering process is not used.
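The stationary versus non-stationary decision above might look like the sketch below. The name `comfort_noise_energy`, the `dither_range` parameter, and the choice to dither the frame energy are assumptions for illustration; the abstract does not say which comfort-noise parameter receives the random component.

```python
import random

def comfort_noise_energy(avg_energy, stationary, rng, dither_range=0.25):
    """Return the energy used for one comfort-noise frame.

    Stationary background: the averaged energy is used unchanged.
    Non-stationary background: a random component (dither) is added.
    dither_range is a hypothetical tuning value.
    """
    if stationary:
        return avg_energy
    dither = rng.uniform(-dither_range, dither_range)
    return avg_energy * (1.0 + dither)

rng = random.Random(1)
steady = comfort_noise_energy(10.0, stationary=True, rng=rng)
varying = [comfort_noise_energy(10.0, stationary=False, rng=rng) for _ in range(5)]
```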
Abstract:
Focused error correction and/or focused error detection is used in an information coding system. In a speech encoding method, the number of speech parameter bits on which error correction coding and/or error detection coding focuses is automatically adjusted in relation to the total number of speech parameter bits as a function of the quality of the information transfer connection. There is no need to reduce the number of bits used for speech encoding, so the voice quality of the speech remains high. The error correction and/or error detection is focused on the bits most important for the voice quality, e.g., as a function of the C/I (Channel to Interference) parameter describing the quality of the information transfer connection. The muting of speech synthesis that occurs in prior systems on a poor information transfer connection is reduced by using focused error detection.
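One way to picture the adjustment is a lookup from link quality to the number of bits that receive focused protection. Everything specific here is an assumption: the table values, the function names, and in particular the direction of the mapping (fewer focused bits on a poorer link), which the abstract does not state.

```python
def focused_bit_count(ci_db, table, default):
    """Pick how many bits get focused protection for a given C/I in dB.

    table is a hypothetical list of (minimum C/I in dB, bit count)
    pairs, ordered from best link quality to worst.
    """
    for min_ci_db, n_bits in table:
        if ci_db >= min_ci_db:
            return n_bits
    return default

def split_bits(bits_by_importance, n_focused):
    """Split bits, already sorted most important first, into the
    focused (protected) part and the remainder."""
    return bits_by_importance[:n_focused], bits_by_importance[n_focused:]

# Illustrative table only; real thresholds would come from the codec design.
TABLE = [(15.0, 8), (9.0, 6)]
bits = list(range(10))  # stand-in for speech parameter bits, most important first
focused, rest = split_bits(bits, focused_bit_count(10.0, TABLE, default=4))
```

The total number of speech bits stays the same in every case; only the split between focused and remaining bits moves.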
Abstract:
The present invention relates to processing speech coding parameters in a telecommunication system. The speech coding parameters of a speech frame, produced by a speech encoder, are divided into groups, i.e. so-called virtual channels, in which speech parameter error correction, channel coding and processing of error-free or erroneous speech parameters are performed independently. At the receiving end, the processing (505) of erroneous and error-free speech parameters can thus be controlled independently on each virtual transmission channel (502) according to the quality of each virtual transmission channel. The speech parameters of the high-quality virtual channels of a speech frame can thus be processed as error-free, replacing the speech coding parameters of the low-quality virtual channels only. The independently processed (505) speech parameters of the virtual channels are then reassembled (507) into a speech frame, which is applied to decoding. Since part of the information in erroneous speech frames is also utilized, the use of speech information received from a transmission channel can be increased in speech decoding, which reduces, for instance, interruptions occurring in speech as compared with a situation where all speech frames erroneous even to a slight degree are discarded. The increased and more focused error indication also reduces the number of undetected errors and thus significantly reduces the worst audible disturbances.
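The per-channel handling and reassembly can be sketched as below. The function name and the choice of the previous frame's parameters as the replacement are assumptions; the abstract only says that the parameters of low-quality virtual channels are replaced.

```python
def reassemble_frame(current_groups, group_ok, previous_groups):
    """Rebuild a speech frame from independently checked parameter groups.

    Groups whose virtual channel passed its error check are kept as
    received; failed groups are replaced, here with the same group from
    the previous frame (a hypothetical substitution rule).
    """
    return [cur if ok else prev
            for cur, ok, prev in zip(current_groups, group_ok, previous_groups)]

previous = [[10, 11], [20, 21], [30, 31]]
current = [[12, 13], [99, 99], [32, 33]]   # middle group arrived corrupted
frame = reassemble_frame(current, [True, False, True], previous)
```

Only the failed group is substituted, so the two good groups of the current frame still reach the decoder.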
Abstract:
A speech decoder comprises a decoder (103) for converting a linear prediction encoded speech signal into a first sample stream having a first sampling rate and representing a first frequency band. Additionally it comprises a vocoder (105) for converting an input signal into a second sample stream having a second sampling rate and representing a second frequency band, and combination means (107) for combining the first and second sample streams in processed form. It also comprises means (301) for generating a second linear prediction filter, to be used by the vocoder (105) on the second frequency band, on the basis of a first linear prediction filter used by the decoder (103) on the first frequency band. Extrapolation through an infinite impulse response filter is the preferable method of generating the second linear prediction filter.
Abstract:
The present invention provides methods, computer-readable media, and apparatuses for tuning and adjusting the computational complexity of an algorithm that is executed by a signal encoder. The signal encoder may comprise a speech encoder. When a resource shortage on a computer platform is detected, a degree of the resource shortage and a corresponding complexity adjustment for a speech encoder are determined. The speech encoder is then tuned to adjust the computational complexity of an executed speech processing algorithm. The resource shortage may correspond to a computational capability, audio buffer memory, or battery of a mobile device. A speech process being executed by the mobile device is tuned to adjust the computational demands in accordance with a complexity adjustment. A number of iteration rounds may be adjusted while the speech encoder is executing a speech processing algorithm. The iterations may correspond to an algebraic codebook search.
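A minimal sketch of the tuning step, assuming the degree of shortage is normalized to the range 0 to 1 and mapped linearly onto the number of search iterations. The function name, the normalization, and the linear mapping are all illustrative assumptions.

```python
def tuned_iterations(base_rounds, shortage, min_rounds=1):
    """Reduce the number of codebook-search rounds as resources run short.

    shortage is a hypothetical normalized degree of shortage:
    0.0 means no shortage, 1.0 means the resource is exhausted.
    """
    shortage = min(max(shortage, 0.0), 1.0)   # clamp to [0, 1]
    return max(min_rounds, int(base_rounds * (1.0 - shortage)))

full = tuned_iterations(8, 0.0)       # no shortage: all rounds run
reduced = tuned_iterations(8, 0.25)   # mild shortage: fewer rounds
floor = tuned_iterations(8, 1.0)      # severe shortage: minimum rounds
```

Keeping a floor of at least one round means the encoder always produces a result, at reduced search quality.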
Abstract:
The invention relates to supporting concatenative TTS synthesis. In order to generate a speech database as a basis for the TTS synthesis, first, speech processing including a segmental parametric speech encoding of speech data based on a parametric modeling of speech is performed, which results in compressed parameterized speech segments. Then, the compressed parameterized speech segments are assembled in a speech database. In order to synthesize output speech, compressed parameterized speech segments are selected from the speech database based on an available text and decompressed to regain parameterized speech segments. The parameterized speech segments are then concatenated in a parameter domain. The output speech is synthesized based on these concatenated parametric speech segments.
Abstract:
According to an aspect of the invention, an enhanced audible feedback solution is provided for electronic devices using an input device facilitating navigation through a plurality of available user interface input options and confirmation of a selected input option. The electronic device is arranged to define, as a response to detecting a selection of a character on the basis of a detection of a first input to an input device of the electronic device, an audio segment specific to the character. The electronic device is arranged to output the defined audio segment via the audio output means prior to a confirmation by a second input to the input device, the second input being associated with a function adding the character as part of a character sequence entered by the user.
Abstract:
A method for use by a speech decoder in handling bad frames received over a communications channel, in which the effects of bad frames are concealed by replacing the values of the spectral parameters of the bad frames (a bad frame being either a corrupted frame or a lost frame) with values based on an at least partly adaptive mean of recently received good frames, but, in the case of a corrupted frame (as opposed to a lost frame), using the bad frame itself if the bad frame meets a predetermined criterion. The aim of concealment is to find the most suitable parameters for the bad frame so that the subjective quality of the synthesized speech is as high as possible.
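The concealment rule can be sketched as follows. This is an illustration under stated assumptions: a plain mean over recent good frames stands in for the "at least partly adaptive mean", and the acceptance test for corrupted frames (every parameter within `max_jump` of that mean) is a hypothetical criterion, since the abstract leaves the predetermined criterion unspecified.

```python
def adaptive_mean(good_history):
    """Element-wise mean of the spectral parameters of recent good frames."""
    n = len(good_history)
    width = len(good_history[0])
    return [sum(f[i] for f in good_history) / n for i in range(width)]

def conceal(frame, status, good_history, max_jump=0.5):
    """Return the spectral parameters to use for one frame.

    status is 'good', 'corrupted' or 'lost'. A corrupted frame is used
    as-is when it passes the (hypothetical) max_jump criterion; a lost
    frame, or a corrupted one that fails, is replaced by the mean.
    """
    if status == "good":
        return list(frame)
    mean = adaptive_mean(good_history)
    if status == "corrupted" and all(
            abs(a - b) <= max_jump for a, b in zip(frame, mean)):
        return list(frame)
    return mean

history = [[1.0, 2.0], [3.0, 4.0]]
kept = conceal([2.1, 3.2], "corrupted", history)      # passes the criterion
replaced = conceal([9.0, 3.0], "corrupted", history)  # fails the criterion
lost = conceal(None, "lost", history)                 # nothing received
```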
Abstract:
A method and corresponding apparatus for encoding a sequence of bits for transmission as symbols, some of the bit positions of the symbols having a higher bit error rate than other bit positions. A plurality of sequences of bits is provided using a convolutional encoder, in response to a sequence of input bits, each sequence of bits being defined by a predetermined generator polynomial having a predetermined level of sensitivity to puncturing. Then the bits of each sequence of bits are mapped to symbol positions based on the level of sensitivity of the generator polynomial defining the sequence of bits. With interleaving, the mapping of bits of each sequence of bits to symbol positions can precede a symbol interleaving step, or it can follow a bit interleaving step.
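The mapping step might be sketched as below. It assumes one encoder output sequence per bit position in a symbol, and it assumes that the sequence most sensitive to puncturing goes to the position with the lowest bit error rate; the abstract says only that the mapping is based on sensitivity. The function name and the numeric sensitivity and error-rate values are illustrative.

```python
def map_to_symbols(sequences, sensitivity, position_ber):
    """Map convolutional-encoder output sequences onto symbol bit positions.

    sequences:    equal-length bit lists, one per generator polynomial
    sensitivity:  per-sequence sensitivity to puncturing (higher = more)
    position_ber: bit error rate of each bit position within a symbol
    Returns one list of bits per symbol, indexed by bit position.
    """
    # Most sensitive sequence first, most reliable position first.
    seq_order = sorted(range(len(sequences)), key=lambda i: -sensitivity[i])
    pos_order = sorted(range(len(position_ber)), key=lambda p: position_ber[p])
    n_symbols = len(sequences[0])
    symbols = [[0] * len(position_ber) for _ in range(n_symbols)]
    for seq_idx, pos in zip(seq_order, pos_order):
        for t in range(n_symbols):
            symbols[t][pos] = sequences[seq_idx][t]
    return symbols

seqs = [[1, 1, 1], [0, 0, 0], [1, 0, 1]]
symbols = map_to_symbols(seqs, sensitivity=[0.2, 0.9, 0.5],
                         position_ber=[0.01, 0.02, 0.03])
```

Here the second sequence (sensitivity 0.9) lands in bit position 0, the position with the lowest error rate.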