Abstract:
An adaptive Intra-refresh (IR) technique for digital video encoding adjusts the IR rate based on video content, or on a combination of video content and channel condition. The IR rate may be applied at the frame level or macroblock (MB) level. At the frame level, the IR rate specifies the percentage of MBs to be Intra-coded within the frame. At the MB level, the IR rate defines a statistical probability that a particular MB is to be Intra-coded. The IR rate is adjusted in proportion to a combined metric that weighs estimated channel loss probability, frame-to-frame variation, and texture information. The IR rate can be determined using a closed-form solution that requires relatively low implementation complexity. For example, such a closed-form solution does not require iteration or an exhaustive search. In addition, the IR rate can be determined from parameters that are available before motion estimation and compensation are performed.
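As an illustration only, the frame-level and MB-level uses of an IR rate might be sketched as follows. The abstract does not give the form of the combined metric, so the formula, the weights `w_var` and `w_tex`, the `gain` factor, and all function names here are assumptions.

```python
import random


def ir_rate(loss_prob, frame_variation, texture, w_var=0.5, w_tex=0.5, gain=1.0):
    """Closed-form IR rate clamped to [0, 1].

    The metric form and the weights are hypothetical; the source only says
    that the rate is proportional to a metric combining these three inputs.
    """
    metric = loss_prob * (w_var * frame_variation + w_tex * texture)
    return max(0.0, min(1.0, gain * metric))


def frame_level_intra_count(rate, total_mbs):
    """Frame level: the rate is the fraction of MBs to Intra-code in the frame."""
    return round(rate * total_mbs)


def mb_level_decisions(rate, total_mbs, rng):
    """MB level: each MB is Intra-coded with probability `rate`."""
    return [rng.random() < rate for _ in range(total_mbs)]


rate = ir_rate(loss_prob=0.1, frame_variation=0.8, texture=0.4)
intra_mbs = frame_level_intra_count(rate, total_mbs=99)  # 99 MBs in a QCIF frame
decisions = mb_level_decisions(rate, total_mbs=99, rng=random.Random(0))
```

Note that every input is available without running motion estimation, and the rate is computed in a single pass with no search loop.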
Abstract:
A stereo 3D video frame includes left and right components that are combined to produce a stereo image. For a given amount of distortion, the left and right components may have different impacts on perceptual visual quality of the stereo image due to asymmetry in the distortion response of the human eye. A 3D video encoder adjusts an allocation of coding bits between left and right components of the 3D video based on a frame-level bit budget and a weighting between the left and right components. The video encoder may generate the bit allocation in the rho (ρ) domain. The weighted bit allocation may be derived based on a quality metric that indicates overall quality produced by the left and right components. The weighted bit allocation compensates for the asymmetric distortion response to reduce overall perceptual distortion in the stereo image and thereby enhance or maintain visual quality.
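The simplest possible reading of a weighted allocation is a proportional split of the frame-level bit budget. The sketch below shows only that split; it does not model the ρ domain or the quality metric, and the function name and the `left_weight` parameter are assumptions.

```python
def split_frame_budget(frame_bits, left_weight):
    """Split a frame-level bit budget between left and right components.

    `left_weight` in [0, 1] is a hypothetical weighting; in the abstract it
    would be derived from a quality metric, which is not modeled here.
    """
    if not 0.0 <= left_weight <= 1.0:
        raise ValueError("left_weight must lie in [0, 1]")
    left_bits = round(frame_bits * left_weight)
    # Give the remainder to the right component so the budget is met exactly.
    return left_bits, frame_bits - left_bits


left_bits, right_bits = split_frame_budget(10000, 0.6)
```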
Abstract:
Error concealment is used to hide the effects of errors detected within digital video information. A complex error concealment mode decision is disclosed to determine whether spatial error concealment (SEC) or temporal error concealment (TEC) should be used. The error concealment mode decision system uses different methods depending on whether the damaged frame is an intra-frame or an inter-frame. If the video frame is an intra-frame, a similarity metric is used to determine whether the intra-frame represents a scene change. If the video frame is an inter-frame, a complex multi-termed equation is used to determine whether SEC or TEC should be used. A novel spatial error concealment technique is disclosed for use when the error concealment mode decision determines that spatial error concealment should be used for reconstruction. The novel spatial error concealment technique divides a corrupt macroblock into four different regions: a corner region, a row adjacent to the corner region, a column adjacent to the corner region, and a remainder main region. Those regions are then reconstructed in that order, and information from earlier reconstructed regions may be used in later reconstructed regions. Finally, a macroblock refreshment technique is disclosed for preventing error propagation from harming non-corrupt inter-blocks. Specifically, an inter-macroblock may be 'refreshed' using spatial error concealment if there has been significant error-caused damage that may cause the inter-block to propagate the errors.
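The four-region split can be pictured with a small labeling sketch. The abstract does not say which corner is used or how large the regions are, so the top-left placement, the `corner` size parameter, and the function name are assumptions; the region ids give the reconstruction order.

```python
def sec_region_map(mb_size=16, corner=1):
    """Label each pixel of a corrupt macroblock with its reconstruction order.

    0 = corner region, 1 = row adjacent to the corner,
    2 = column adjacent to the corner, 3 = remainder main region.
    Top-left placement and the `corner` size are hypothetical choices.
    """
    grid = []
    for y in range(mb_size):
        row = []
        for x in range(mb_size):
            if y < corner and x < corner:
                row.append(0)
            elif y < corner:
                row.append(1)
            elif x < corner:
                row.append(2)
            else:
                row.append(3)
        grid.append(row)
    return grid


region_map = sec_region_map()
# Regions would be reconstructed in ascending id order, so later regions
# can draw on pixels already rebuilt in earlier ones.
```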
Abstract:
Systems, methods, and apparatus described include waveform alignment operations in which a single set of evaluated cosines and sines is used to calculate cross-correlations of two periodic waveforms at two different phase shifts.
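One way a single set of cosines and sines can serve two phase shifts is to evaluate the correlation at +φ and −φ together. The sketch below assumes the cross-correlation is expressed through per-harmonic products `p[k]` (for example, one waveform's harmonic coefficient times the conjugate of the other's); that representation, and the choice of the pair ±φ, are assumptions rather than details taken from the abstract.

```python
import cmath
import math


def corr_direct(p, phi):
    """Reference: sum over harmonics k = 1..K of Re(p_k * exp(j*k*phi))."""
    return sum((pk * cmath.exp(1j * k * phi)).real for k, pk in enumerate(p, start=1))


def corr_pair(p, phi):
    """Correlation at +phi and -phi from one set of cos/sin evaluations.

    Re((a + jb) * exp(+jk*phi)) = a*cos(k*phi) - b*sin(k*phi)
    Re((a + jb) * exp(-jk*phi)) = a*cos(k*phi) + b*sin(k*phi)
    """
    plus = minus = 0.0
    for k, pk in enumerate(p, start=1):
        c, s = math.cos(k * phi), math.sin(k * phi)  # evaluated once per harmonic
        even, odd = pk.real * c, pk.imag * s
        plus += even - odd
        minus += even + odd
    return plus, minus


products = [1 + 2j, 0.5 - 1j, -0.3 + 0.1j]  # hypothetical harmonic products
at_plus, at_minus = corr_pair(products, 0.7)
```

Under these assumptions the trigonometric work is halved relative to evaluating the two shifts independently.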
Abstract:
Methods and apparatus are presented for determining the type of acoustic signal and the type of frequency spectrum exhibited by the acoustic signal in order to selectively delete parameter information before vector quantization. The bits that would otherwise be allocated to the deleted parameters can then be re-allocated to the quantization of the remaining parameters, which results in an improvement of the perceptual quality of the synthesized acoustic signal. Alternatively, the bits that would have been allocated to the deleted parameters are dropped, resulting in an overall bit-rate reduction.
Abstract:
A method and apparatus for using coding scheme selection patterns in a predictive speech coder to reduce sensitivity to frame error conditions includes a speech coder configured to select from among various predictive coding modes. After a predefined number of speech frames have been predictively coded, the speech coder codes one frame with a nonpredictive coding mode or a mildly predictive coding mode. The predefined number of frames can be determined in advance from the subjective standpoint of a listener. The predefined number of frames may be varied periodically. An average coding bit rate may be maintained for the speech coder by ensuring that an average coding bit rate is maintained for each successive pattern, or group, of predictively coded speech frames including at least one nonpredictively coded or mildly predictively coded speech frame.
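A fixed selection pattern of this kind can be sketched in a few lines. The mode labels and the `predictive_run` parameter name are assumptions, and the sketch covers only the fixed-count case, not the periodic variation of the count or the bit-rate constraint.

```python
def coding_modes(num_frames, predictive_run):
    """After `predictive_run` predictively coded frames, code one frame
    nonpredictively, then repeat the pattern."""
    period = predictive_run + 1
    return [
        "nonpredictive" if (i + 1) % period == 0 else "predictive"
        for i in range(num_frames)
    ]


pattern = coding_modes(8, predictive_run=3)
# Frames 4 and 8 are nonpredictive, so a frame error cannot propagate
# through more than three predictively coded frames in a row.
```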
Abstract:
A method and apparatus for maintaining a target bit rate in a speech coder includes a speech coder for encoding a frame at a preselected encoding rate, computing a running average bit rate for a predefined number of encoded frames, subtracting the running average bit rate from a predefined target average bit rate, and dividing the difference by the preselected encoding rate. If the quotient value is negative, a predefined number of possible occurrence counts of speech coder performance threshold values that are less than a current performance threshold value is accumulated, the accumulated number being greater than the absolute value of the quotient. The product of a decrement-per-occurrence-count-value and the predefined number of occurrence counts is subtracted from the current performance threshold value to obtain a new performance threshold value. If the quotient value is positive, a predefined number of possible occurrence counts of speech coder performance threshold values that are greater than the current performance threshold value is accumulated, the accumulated number being greater than the quotient. The product of an increment-per-occurrence-count-value and the predefined number of occurrence counts is added to the current performance threshold value to obtain a new performance threshold value.
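The procedure above leaves room for interpretation, so the following is one possible reading rather than the disclosed method. It assumes the occurrence counts form a normalized histogram over evenly spaced threshold bins, and that a single `step` serves as both the increment and the decrement per bin; all names are hypothetical.

```python
def new_threshold(threshold, counts, current_bin, target_avg, running_avg,
                  encoding_rate, step):
    """Move the performance threshold toward the target average bit rate.

    counts[i] is the normalized occurrence count of threshold bin i
    (an assumed representation). Bins are walked away from `current_bin`
    until the accumulated count exceeds |quotient|.
    """
    quotient = (target_avg - running_avg) / encoding_rate
    if quotient == 0:
        return threshold
    direction = -1 if quotient < 0 else 1  # negative quotient lowers the threshold
    accumulated, bins_walked = 0.0, 0
    i = current_bin + direction
    while 0 <= i < len(counts) and accumulated <= abs(quotient):
        accumulated += counts[i]
        bins_walked += 1
        i += direction
    return threshold + direction * step * bins_walked


hist = [0.1, 0.2, 0.3, 0.2, 0.2]  # hypothetical normalized histogram
lowered = new_threshold(2.0, hist, 2, target_avg=4.0, running_avg=5.0,
                        encoding_rate=8.0, step=1.0)
```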
Abstract:
A novel and improved method and apparatus for encoding linear predictive coding (LPC) data in a speech compression system using line spectral square root values is disclosed. A novel and computationally efficient procedure for determining the set of quantization sensitivities for the line spectral square root values is disclosed, which results in a computationally efficient error measure for use in vector quantization of the line spectral square root values. A novel method of weighting the quantization error is disclosed, which accumulates the quantization error in each line spectral square root value and weights that error by the sensitivity of that line spectral square root value.
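A sensitivity-weighted error of this kind could drive a codebook search along the following lines. The abstract does not specify the per-value error, so the squared difference used here is an assumption, as are the function names and the example numbers; the sensitivities are taken as given rather than computed.

```python
def weighted_error(values, candidate, sensitivities):
    """Accumulate per-value quantization error, each term weighted by that
    value's sensitivity. Squared error is an assumed choice."""
    return sum(s * (v - c) ** 2 for v, c, s in zip(values, candidate, sensitivities))


def best_codeword(values, codebook, sensitivities):
    """Index of the codebook entry with the smallest weighted error."""
    return min(
        range(len(codebook)),
        key=lambda i: weighted_error(values, codebook[i], sensitivities),
    )


values = [0.2, 0.5]                      # hypothetical line spectral square root values
codebook = [[0.2, 0.9], [0.4, 0.5]]      # hypothetical two-entry codebook
index = best_codeword(values, codebook, sensitivities=[10.0, 1.0])
```

With a high sensitivity on the first value, the search prefers the entry that matches it exactly even though that entry is further off on the second value.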