摘要:
An accelerated video decoding system utilizes a graphics processing unit to perform motion compensation, image reconstruction, and color space conversion processes, while utilizing a central processing unit to perform other decoding processes.
摘要:
A video encoding system and method utilizes a three-dimensional (3-D) wavelet transform and entropy coding that utilize motion information in a way to reduce the sensitivity to motion. In one implementation, the coding process initially estimates motion trajectories of pixels in a video object from frame to frame in a video sequence to account for motion of the video object throughout the frames. After motion estimation, a wavelet transform is applied to produce coefficients within different sub-bands. The wavelet coefficients are coded independently for each sub-band to permit easy separation at a decoder, making resolution scalability and temporal scalability natural and easy. In particular, the coefficients are assigned various contexts based on the significance of neighboring samples in previous, current, and next frame, thereby taking advantage of any motion information between frames.
摘要:
A scalable layered video coding scheme that encodes video data frames into multiple layers, including a base layer of comparatively low quality video and multiple enhancement layers of increasingly higher quality video, adds error resilience to the enhancement layer. Unique resynchronization marks are inserted into the enhancement layer bitstream in headers associated with each video packet, headers associated with each bit plane, and headers associated with each video-of-plane (VOP) segment. Following transmission of the enhancement layer bitstream, the decoder tries to detect errors in the packets. Upon detection, the decoder seeks forward in the bitstream for the next known resynchronization mark. Once this mark is found, the decoder is able to begin decoding the next video packet. With the addition of many resynchronization marks within each frame, the decoder can recover very quickly and with minimal data loss in the event of a packet loss or channel error in the received enhancement layer bitstream. The video coding scheme also facilitates redundant encoding of header information from the higher-level VOP header down into lower level bit plane headers and video packet headers. Header extension codes are added to the bit plane and video packet headers to identify whether the redundant data is included.
摘要:
A video encoding system and method utilizes a three-dimensional (3-D) wavelet transform and entropy coding that utilize motion information in a way to reduce the sensitivity to motion. In one implementation, the coding process initially estimates motion trajectories of pixels in a video object from frame to frame in a video sequence to account for motion of the video object throughout the frames. After motion estimation, a 3-D wavelet transform is applied in two parts. First, a temporal 1-D wavelet transform is applied to the corresponding pixels along the motion trajectories in a time direction. The temporal wavelet transform produces decomposed frames of temporal wavelet transforms, where the spatial correlation within each frame is well preserved. Second, a spatial 2-D wavelet transform is applied to all frames containing the temporal wavelet coefficients. The wavelet transforms produce coefficients within different sub-bands. The process then codes wavelet coefficients. In particular, the coefficients are assigned various contexts based on the significance of neighboring samples in previous, current, and next frame, thereby taking advantage of any motion information between frames. The wavelet coefficients are coded independently for each sub-band to permit easy separation at a decoder, making resolution scalability and temporal scalability natural and easy. During the coding, bits are allocated among sub-bands according to a technique that optimizes rate-distortion characteristics.
摘要:
A motion-compensated video encoding scheme employs progressive fine-granularity layered coding to encode macroblocks of video data into frames having multiple layers, including a base layer of comparatively low quality video and multiple enhancement layers of increasingly higher quality video. Some of the enhancement layers in a current frame are predicted from different quality layers in reference frames. The video encoding scheme estimates drifting errors during the encoding and chooses a coding mode for each macroblock in the enhancement layer to maximize high coding efficiency while minimizing drifting errors.
摘要:
Automatic video object extraction that defines substantially precise objects is disclosed. In one embodiment, color segmentation and motion segmentation are performed on a source video. The color segmentation segments the video by substantially uniform color regions thereof. The motion segmentation segments the video by moving regions thereof. The color regions and the moving regions are then combined to define the video objects. In varying embodiments, pre-processing and post-processing is performed to further clean the source video and the video objects defined, respectively.
摘要:
A seamless bitstream switching schema is presented. The schema takes advantage of both the high coding efficiency of non-scalable bitstreams and the flexibility of scalable bitstreams. Small bandwidth fluctuations are accommodated by the scalability of the bitstreams, while large bandwidth fluctuations are tolerated by switching among scalable bitstreams. This seamless bitstream switching schema significantly improves the efficiency of scalable video coding over a broad range of bit rates.
摘要:
A scalable layered video coding scheme that encodes video data frames into multiple layers, including a base layer of comparatively low quality video and multiple enhancement layers of increasingly higher quality video, adds error resilience to the enhancement layer. Unique resynchronization marks are inserted into the enhancement layer bitstream in headers associated with each video packet, headers associated with each bit plane, and headers associated with each video-of-plane (VOP) segment. Following transmission of the enhancement layer bitstream, the decoder tries to detect errors in the packets. Upon detection, the decoder seeks forward in the bitstream for the next known resynchronization mark. Once this mark is found, the decoder is able to begin decoding the next video packet. With the addition of many resynchronization marks within each frame, the decoder can recover very quickly and with minimal data loss in the event of a packet loss or channel error in the received enhancement layer bitstream. The video coding scheme also facilitates redundant encoding of header information from the higher-level VOP header down into lower level bit plane headers and video packet headers. Header extension codes are added to the bit plane and video packet headers to identify whether the redundant data is included.
摘要:
An image distribution system has a source that encodes digital images and transmits them over an error-prone channel to a destination. The source has an image coder that processes the digital images using vector transformation followed by vector quantization. This produces groups of vectors and quantized values that are representative of the images. The image coder orders the vectors in the codebooks and assigns vector indexes to the vectors such that a bit error occurring at a less significant bit in a vector index results in less distortion than a bit error occurring at a more significant bit. Depending upon the format and the capabilities of the source and destination, the image coder may allocate different numbers of bits to different groups of vectors according to a bit allocation map for this allocation process. The source also has a UEP (Unequal Error Protection) coder that layers the vector indexes according to their significance. Two possible approaches include frequency-based UEP and bit-plane based UEP. The source transmits a bitstream that includes the image values, a bit allocation map, and the layered vector indexes. The destination receives the bitstream and recovers the vectors using the vector indexes and bit allocation map. The destination then reconstructs the image from the image values and the vectors.
摘要:
A seamless bitstream switching schema is presented. The schema takes advantage of both the high coding efficiency of non-scalable bitstreams and the flexibility of scalable bitstreams. Small bandwidth fluctuations are accommodated by the scalability of the bitstreams, while large bandwidth fluctuations are tolerated by switching among scalable bitstreams. This seamless bitstream switching schema is significantly improves the efficiency of scalable video coding over a broad range of bit rates.