Abstract:
A method and apparatus for selecting a quantizer scale for each macroblock to maintain the overall quality of the video image while optimizing the coding rate. A quantizer scale is selected for each macroblock such that the target bit rate for the picture is achieved while an optimal quantization scale ratio is maintained for successive macroblocks to produce a uniform visual quality over the entire picture. One embodiment applies the method at the frame level, while another embodiment applies the method in conjunction with a wavelet transform.
Abstract:
A method and apparatus for determining an optimal quadtree structure for quadtree-based variable block size (VBS) motion estimation. The method computes the motion vectors for the entire quadtree from the largest block size to the smallest block size. Next, the method may optionally select an optimal quantizer scale for each block. The method then compares, from the "bottom up", the sum of the distortion from encoding all sub-blocks or sub-nodes (children) with the distortion from encoding the block or node (parent) from which the sub-nodes are partitioned. If the sum of the distortion from encoding the children is greater than that of the parent, then the node is "merged". Conversely, if the sum of the distortion from encoding the children is less than that of the parent, then the node is "split" and the Lagrangian cost for the parent node is set as the sum of the Lagrangian costs of its children. This step is repeated for all nodes through every level until an optimal quadtree structure is obtained.
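The bottom-up merge/split decision described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Node` class and `prune` function are hypothetical names, the comparison is assumed to use a single Lagrangian cost per node (J = D + lambda * R), and ties are assumed to merge, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Assumed Lagrangian cost of coding this block as one unit.
    cost: float
    # Zero children for a leaf, four for a partitioned block.
    children: List["Node"] = field(default_factory=list)
    # Set by prune(): True if the block should be coded as its children.
    split: bool = False


def prune(node: Node) -> float:
    """Decide merge/split for every node, children first, and return the
    best cost found for the subtree rooted at `node`."""
    if not node.children:
        return node.cost
    # Resolve the children before the parent (bottom-up order).
    children_cost = sum(prune(child) for child in node.children)
    if children_cost < node.cost:
        # Children are cheaper: split, and the parent inherits their cost.
        node.split = True
        node.cost = children_cost
    else:
        # Parent is at least as cheap: merge (tie handling is an assumption).
        node.split = False
    return node.cost


# Hypothetical costs, chosen only to exercise both branches.
inner = Node(50.0, [Node(5.0), Node(5.0), Node(5.0), Node(5.0)])
root = Node(100.0, [Node(20.0), Node(30.0), Node(10.0), inner])
best = prune(root)
```

Because each parent is compared against the already-optimized cost of its children, a single recursive pass visits every node once.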
Abstract:
A PC-type computer has a system bus (e.g., a PCI bus) configured with a main CPU board, a statistical multiplexing (stat-mux) board, and a plurality of video/audio encoder boards, each configured to receive and compress a corresponding video/audio stream. The stat-mux board performs statistical multiplexing on the different compressed bitstreams to transmit multiple bitstreams over individual shared communication channels. Although each of the boards is configured to the system bus, each encoder board has a digital signal processor (DSP) with a synchronized serial interface (SSI) output port that is directly connected to an SSI input port on a DSP on the stat-mux board (which, in one embodiment, has four such DSPs each with six such SSI input ports). As such, (up to 24) compressed video/audio bitstreams generated on the various encoder boards can be transmitted directly to the stat-mux board without having to go through the system bus. In this way, the computer system can provide statistical multiplexing of low-latency video/audio bitstreams without having to suffer the processing delays associated with conventional transmission over PCI system buses.
Abstract:
When two or more different video streams are compressed for concurrent transmission of multiple compressed video bitstreams over a single shared communication channel, control over both (1) the transmission of data over the shared channel and (2) the compression processing that generates the bitstreams is exercised taking into account the differing levels of latency required for the corresponding video applications. For example, interactive video games typically require lower latency than other video applications such as video streaming, web browsing, and electronic mail. A multiplexer and traffic controller takes these differing latency requirements, along with bandwidth and image fidelity requirements, into account when controlling both traffic flow and compression processing. In addition, an off-line profiling tool analyzes typical video applications in order to generate profiles of different types of video applications. These profiles are then accessed in real time by a call admission manager responsible for controlling the admission of new video application sessions as well as the assignment of admitted applications to specific available video encoders. The encoders themselves may differ in video compression processing power as well as in the degree to which they allow external processors (like the multiplexer and traffic controller) to control their internal compression processing.
Abstract:
A method of motion estimation for video encoding constructs a binary pyramid structure having three binary layers. A state update module registers and updates the repeat occurrence of final motion vectors, and a static-state checking module determines whether the method is in a static mode or a normal mode based on the repeat occurrence. In the normal mode, the first binary layer is searched within a ±3 pixel refinement window to determine a first level motion vector. In the second binary layer, a search range is computed based on six motion vector candidates. By checking every point within the search range, a second binary layer search generates a second level motion vector. Finally, a third binary layer search within a ±2 pixel refinement window generates a final motion vector according to the second level motion vector. In the static mode, a fine tuning module performs a search within a ±1 pixel refinement window and generates a final motion vector.
Abstract:
A method for video encoding is disclosed. The method generally includes the steps of (A) generating first sub-pel data for at least one of (i) a motion estimation and (ii) a mode decision by first filtering reference data and (B) generating second sub-pel data for a motion compensation by second filtering the reference data. A first performance of the first filtering may be different from a second performance of the second filtering.
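One way to picture two sub-pel filters of differing performance is a cheap 2-tap bilinear filter for motion estimation alongside a more accurate 6-tap filter for motion compensation. The sketch below is an assumption made for illustration: the abstract does not name either filter, and the function names are hypothetical. The 6-tap coefficients (1, -5, 20, 20, -5, 1)/32 are those of the H.264 luma half-sample filter. Both functions interpolate the half-pel sample between `row[i]` and `row[i + 1]` in one dimension.

```python
def half_pel_bilinear(row, i):
    """Low-cost half-pel sample (assumed here for motion estimation)."""
    return (row[i] + row[i + 1] + 1) >> 1


def half_pel_6tap(row, i):
    """Higher-accuracy half-pel sample (assumed here for motion compensation)."""
    taps = (1, -5, 20, 20, -5, 1)
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(taps):
        # Clamp indices at the edges, i.e. replicate border pixels.
        idx = min(max(i - 2 + k, 0), last)
        acc += tap * row[idx]
    # Round, divide by 32, and clip to the 8-bit sample range.
    return min(max((acc + 16) >> 5, 0), 255)


flat = [100] * 6
edge = [0, 0, 0, 100, 100, 100]
estimate = half_pel_bilinear(edge, 3)   # cheap filter
compensated = half_pel_6tap(edge, 3)    # accurate filter
```

On flat content the two filters agree, while next to an edge the 6-tap filter overshoots slightly, so the two stages can see different sub-pel values for the same reference data.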
Abstract:
An MPEG-4 system with error concealment is provided for video service over networks with packet loss. The MPEG-4 system includes an encoder and a decoder. The encoder uses an intra-refreshment technique to make the coded bitstream more robust against noise in order to stop error propagation. A rate-distortion optimization criterion is also introduced to adaptively update the intra-coded blocks based on the true network condition with minimal overhead. The Lagrange multiplier is modified to achieve the best rate-distortion balance. In addition, a decoder loop is used in the encoder and is synchronized with the true decoder to achieve the best performance and avoid mismatch with the decoder used in the MPEG-4 system. The decoder is able to achieve resilient decoding under any kind of noise and enhances the reconstructed image quality with a hybrid spatial and temporal concealment method. The results show that a further improvement of 3.65 to 9.71 dB in peak signal-to-noise ratio (PSNR) can be achieved in comparison with existing methods that adopt spatial copy and zero motion concealment in decoding.
Abstract:
The present invention proposes a fast motion estimation method using N-queen pixel decimation. After a reference block and a block to be processed are selected in a video sequence, an N×N queens pattern is used for pixel decimation to perform block matching, thereby obtaining a sufficiently accurate block difference value. The present invention combines pixel decimation with fast motion estimation, which reduces the number of search points, to simplify the computational complexity of motion estimation. The present invention can therefore select sufficiently representative pixels without adding computational complexity.
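As a rough illustration of N-queen pixel decimation, the sketch below builds one N-queens placement by backtracking and uses it to pick N pixels from every N×N tile when computing a block difference. The function names, the choice of the sum of absolute differences (SAD) as the difference measure, and the use of the first solution found are all assumptions; the abstract does not say which placement or metric is used.

```python
def n_queens(n):
    """Return one N-queens placement as a list where cols[row] is the
    queen's column, found by simple backtracking."""
    cols = []

    def place(row):
        if row == n:
            return True
        for c in range(n):
            # Safe if no earlier queen shares the column or a diagonal.
            if all(c != pc and abs(c - pc) != row - pr
                   for pr, pc in enumerate(cols)):
                cols.append(c)
                if place(row + 1):
                    return True
                cols.pop()
        return False

    if not place(0):
        raise ValueError(f"no {n}-queens placement exists")
    return cols


def decimated_sad(cur, ref, pattern):
    """SAD over only the pixels selected by the queens pattern, which is
    tiled across the block (N samples per N x N tile instead of N * N)."""
    n = len(pattern)
    total = 0
    for y in range(len(cur)):
        # In row y, sample the pattern's column in every tile.
        for x in range(pattern[y % n], len(cur[0]), n):
            total += abs(cur[y][x] - ref[y][x])
    return total


pattern = n_queens(8)
cur = [[10] * 16 for _ in range(16)]   # hypothetical 16x16 current block
ref = [[7] * 16 for _ in range(16)]    # hypothetical 16x16 reference block
sad = decimated_sad(cur, ref, pattern)
```

With N = 8 this touches 32 of the 256 pixels in a 16×16 block, and the queens property guarantees that every row and every column of each tile contributes exactly one sample.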
Abstract:
The present invention relates to an architecture for stack robust fine granularity scalability (SRFGS), and more particularly to SRFGS that simultaneously provides temporal scalability and SNR scalability. SRFGS first simplifies the RFGS temporal prediction architecture and then generalizes the prediction concept as follows: the quantization error of the previous layer can be inter-predicted by the reconstructed image in the previous time instance of the same layer. With this concept, the RFGS architecture can be extended to multiple layers that form a stack to improve the temporal prediction efficiency. SRFGS can be optimized at several operating points to fit the requirements of various applications while retaining the fine granularity and error robustness of RFGS. The experimental results show that SRFGS can improve the performance of RFGS by 0.4 to 3.0 dB in PSNR.