Abstract:
A system and method for locating a preferable playback start location after a winding or rewinding action in an audio playing device. In response to an adjustment of the playing location for audio content to a desired playing position, the system determines whether at least one non-speech or silent period of at least a predetermined duration exists within the vicinity of the desired playing position. If at least one such non-speech or silent period exists within the vicinity of the desired playing position, the system adjusts the playing position to fall within one of the at least one non-speech period or silent period.
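The adjustment described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `snap_to_silence`, the representation of silent periods as `(start, end)` pairs in seconds, the default thresholds, and the choice of the period's midpoint as the new position are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def snap_to_silence(
    desired: float,
    silences: List[Tuple[float, float]],
    min_duration: float = 0.3,  # assumed "predetermined duration", in seconds
    vicinity: float = 2.0,      # assumed search radius around the desired position
) -> float:
    """Return a position inside a nearby silent period, or `desired` unchanged."""
    best_pos: Optional[float] = None
    best_dist = float("inf")
    for start, end in silences:
        # Ignore periods shorter than the predetermined duration.
        if end - start < min_duration:
            continue
        # The desired position already falls within a qualifying period.
        if start <= desired <= end:
            return desired
        # Distance from the desired position to the nearest edge of the period.
        dist = start - desired if desired < start else desired - end
        if dist <= vicinity and dist < best_dist:
            best_dist = dist
            # Assumption: resume from the middle of the silent period.
            best_pos = (start + end) / 2.0
    return desired if best_pos is None else best_pos
```

If no qualifying period lies within the vicinity, the sketch leaves the playing position where the user put it, which matches the conditional wording of the abstract.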
Abstract:
A method and system for concealing errors in one or more bad frames in a speech sequence as part of an encoded bit stream received in a decoder. When the speech sequence is voiced, the LTP-parameters in the bad frames are replaced by the corresponding parameters in the last frame. When the speech sequence is unvoiced, the LTP-parameters in the bad frames are replaced by values calculated based on the LTP history along with an adaptively-limited random term.
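The voiced/unvoiced branching above can be illustrated with a short sketch for the LTP lag only. The abstract does not say how the history-based value or the adaptive limit is computed, so the median of the lag history, a limit equal to half the spread of that history, and the lag bounds used here are assumptions chosen for illustration; `conceal_ltp_lag` is a hypothetical name.

```python
import random
from statistics import median
from typing import List, Optional


def conceal_ltp_lag(
    history: List[int],
    voiced: bool,
    rng: Optional[random.Random] = None,
    min_lag: int = 20,   # illustrative lower bound on the lag
    max_lag: int = 143,  # illustrative upper bound on the lag
) -> int:
    """Return a replacement LTP lag for a bad frame."""
    if not history:
        raise ValueError("at least one previously received lag is required")
    if voiced:
        # Voiced speech: reuse the corresponding parameter of the last frame.
        return history[-1]
    rng = rng or random.Random()
    # Unvoiced speech: start from a value derived from the LTP history
    # (assumption: the median of the recent lags).
    base = median(history)
    # Assumption: the random term is limited by half the spread of the history,
    # so a stable history yields a small perturbation.
    limit = (max(history) - min(history)) / 2.0
    lag = base + rng.uniform(-limit, limit)
    return int(min(max(round(lag), min_lag), max_lag))
```

Passing a seeded `random.Random` makes the unvoiced branch reproducible, which is convenient when comparing concealment strategies.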
Abstract:
A method of determining a codec mode for encoding a frame in a communications system, the method comprising the steps of: receiving a sequence of signal samples arranged in frames; analysing a current frame to select a codec mode appropriate for the current frame; predicting the characteristics of a subsequent frame using lookahead samples from the subsequent frame; and determining, based on the predicted characteristics, a codec mode for the current frame and the subsequent frame that suits both the current frame and the subsequent frame.
Abstract:
A method of encoding speech in a communications system includes the steps of receiving a speech signal including voice signals and background signals, detecting voice activity, and providing an indicator when no voice activity is detected. The speech signal is encoded to generate a plurality of parameters representing the signal. When the indicator is not present, a first parametric representation of the speech signal, including the plurality of parameters, is output. When the indicator is present, at least one of the plurality of parameters is modified and a second parametric representation of the speech signal, including the modified parameter, is output.
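The two output paths can be sketched as below. The abstract does not name which parameter is modified or how, so the parameter key `"gain"`, the scaling factor, and the function name `parametric_representation` are hypothetical and serve only to show the control flow.

```python
from typing import Dict


def parametric_representation(
    params: Dict[str, float],
    no_voice: bool,
    noise_gain_scale: float = 0.5,  # hypothetical modification factor
) -> Dict[str, float]:
    """Return the first or the second parametric representation of a frame."""
    out = dict(params)  # copy, so the encoder's own parameters stay untouched
    if no_voice:
        # Indicator present: modify at least one parameter.
        # Assumption for illustration: attenuate the "gain" parameter.
        out["gain"] = out["gain"] * noise_gain_scale
    return out
```

Returning a copy keeps the unmodified parameters available to the encoder, for example for updating its own state.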
Abstract:
This invention relates to tandem free operation (TFO) in mobile cellular systems. The invention implements tandem free operation by using a special feedback loop which makes the decoded parameters available, performs the comfort noise insertion (CNI) and bad frame handling (BFH) operations, produces the parameter quantisation indices corresponding to the output of these operations, and synchronises the speech encoders and the speech decoders in the transmission path from the uplink mobile station to the downlink mobile station. This functionality is realised by partly decoding and re-encoding the parameters and by synchronising and resetting the quantiser prediction memories in a specific manner. A basic idea of the invention is that, during the BFH and CNI processes, a re-encoding block produces models of encoded speech parameters from the BFH/CNI-processed speech parameters. These models of encoded speech parameters are then transmitted to the receiving end. The invention thereby provides a solution to the problem that predictive encoders, or more generally non-stateless encoders, create in TFO operation.
Abstract:
The invention relates to a method for transmitting background noise information including a silence descriptor identifier and background noise parameters in a communication system in which the information to be transmitted is formed into data frames. The data frames are subjected to channel coding to form channel-coded frames. The channel-coded frames are interleaved to be transmitted in two or more data transmission frames, and information of two channel-coded frames is transmitted in each data transmission frame. A first silence descriptor frame is formed provided with the silence descriptor identifier. The first silence descriptor frame is subjected to channel coding to form a channel-coded silence descriptor frame. The channel-coded silence descriptor frame is transmitted in two or more data transmission frames, and at least one data transmission frame transmitting part of the channel-coded silence descriptor frame is also used to transmit at least the background noise parameters.
Abstract:
A speech coding method and device for encoding and decoding an input signal and providing synthesized speech, wherein the higher frequency components of the synthesized speech are achieved by high-pass filtering and coloring an artificial signal to provide a processed artificial signal. The processed artificial signal is scaled by a first scaling factor during the active speech periods of the input signal and a second scaling factor during the non-active speech periods, wherein the first scaling factor is characteristic of the higher frequency band of the input signal and the second scaling factor is characteristic of the lower frequency band of the input signal. In particular, the second scaling factor is estimated based on the lower frequency components of the synthesized speech and the coloring of the artificial signal is based on the linear predictive coding coefficients characteristic of the lower frequency of the input signal.
Abstract:
A method of speech coding a sampled speech signal using long term prediction (LTP). An LTP pitch-lag parameter is determined for each frame of the speech signal by first determining the autocorrelation function for the frame within the signal, between predefined maximum and minimum delays. The autocorrelation function is then weighted to emphasize the function for delays in the neighborhood of the pitch-lag parameter determined for the most recent voiced frame. The maximum value of the weighted autocorrelation function is then found, and the corresponding delay is identified as the pitch-lag parameter for the frame.
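The search described above can be sketched in a few lines. The abstract does not give the weighting function, so the weight `1 / (1 + sharpness * |d - prev_lag|)` used here is an assumption chosen only because it peaks at the previous lag; the names `pitch_lag`, `prev_lag` and `sharpness` are likewise hypothetical.

```python
import math
from typing import Optional, Sequence


def pitch_lag(
    frame: Sequence[float],
    min_lag: int,
    max_lag: int,
    prev_lag: Optional[int] = None,  # lag of the most recent voiced frame, if any
    sharpness: float = 0.02,         # assumed steepness of the weighting
) -> int:
    """Return the delay that maximises the weighted autocorrelation."""
    best_lag = min_lag
    best_score = -math.inf
    for d in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation of the frame at delay d.
        r = sum(frame[n] * frame[n - d] for n in range(d, len(frame)))
        # Emphasise delays close to the previous voiced frame's lag.
        w = 1.0 if prev_lag is None else 1.0 / (1.0 + sharpness * abs(d - prev_lag))
        score = r * w
        if score > best_score:
            best_lag, best_score = d, score
    return best_lag
```

When two delays have equally strong autocorrelation, the unweighted search keeps the smaller one, whereas the weighted search prefers the one nearer `prev_lag`. This bias is what discourages the estimate from jumping to a multiple or submultiple of the true pitch period between consecutive frames.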