Abstract:
A power-scalable hybrid technique to reduce blocking and ringing artifacts in low bit-rate block-based video coding is employed in connection with a modified decoder structure. Fast inverse motion compensation is applied directly in the compressed domain, so that the transform (e.g., DCT) coefficients of the current frame can be reconstructed from those of the previous frame. The spatial characteristics of each block are calculated from the DCT coefficients, and each block is classified as either low-activity or high-activity. For each low-activity block, its DC coefficient value and the DC coefficient values of the surrounding eight neighbor blocks are exploited to predict low-frequency AC coefficients which reflect the original coefficients before quantization in the encoding stage. The predicted AC coefficients are inserted into the low-activity blocks where blocking artifacts are most noticeable. Subject to available resources, this may be followed by spatial domain post-processing, in which two kinds of low-pass filters are adaptively applied, on a block-by-block basis, according to the classification of the particular block. Strong low-pass filtering is applied in low-activity blocks where the blocking artifacts are most noticeable, whereas weak low-pass filtering is applied in high-activity blocks where ringing noise as well as blocking artifacts may exist. In low-activity blocks, the blocking artifacts are reduced by one-dimensional horizontal and vertical low-pass filters which are selectively applied in the horizontal and/or vertical direction depending on the locations and absolute values of the predicted AC coefficients. In high-activity blocks, de-blocking and de-ringing are conducted by 2- or 3-tap filters, applied horizontally and/or vertically, which keeps the architecture simple.
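The classification and adaptive filtering steps described above can be illustrated with a minimal Python sketch. It is not the patented method: the activity measure (sum of absolute AC coefficients), the threshold value, and the 5-tap "strong" kernel are all assumptions chosen for illustration. Only the idea of a stronger filter for low-activity blocks and a short 3-tap filter for high-activity blocks comes from the abstract.

```python
# Hypothetical kernels: the abstract only says "strong" vs. "weak" (2- or 3-tap).
STRONG_TAPS = (1, 2, 2, 2, 1)  # assumed 5-tap low-pass for low-activity blocks
WEAK_TAPS = (1, 2, 1)          # 3-tap low-pass for high-activity blocks


def classify_block(coeffs, threshold=20.0):
    """Label a block of DCT coefficients as 'low' or 'high' activity.

    Assumption: activity is the sum of absolute AC coefficients, compared
    against a hypothetical threshold.
    """
    ac_energy = sum(
        abs(c)
        for r, row in enumerate(coeffs)
        for k, c in enumerate(row)
        if (r, k) != (0, 0)  # skip the DC coefficient
    )
    return "low" if ac_energy < threshold else "high"


def taps_for(label):
    """Pick the filter kernel according to the block classification."""
    return STRONG_TAPS if label == "low" else WEAK_TAPS


def smooth_row(row, taps):
    """Apply a normalized 1-D low-pass filter, replicating edge samples."""
    half = len(taps) // 2
    norm = sum(taps)
    out = []
    for i in range(len(row)):
        acc = 0
        for k, t in enumerate(taps):
            j = min(max(i + k - half, 0), len(row) - 1)
            acc += t * row[j]
        out.append(acc / norm)
    return out
```

In this sketch a block containing only a DC coefficient is labeled low-activity and would receive the stronger kernel, while a block with significant AC content receives the 3-tap kernel. The same `smooth_row` helper can be run over columns to filter vertically.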
Abstract:
A layered presentation system (LAPE) includes a server that performs compressed-domain image processing on image data received from multiple clients including a master client and other clients to generate a composite image that incorporates imagery from the other clients with a master image from the master client for viewing on a shared display. The system's clients can add imagery in the form of questions, comments, and graphics to a currently displayed image. The added imagery is processed along with the master image to generate the composite image that then appears on the shared display and perhaps also on each client's individual display. The processing includes scaling the master image/added imagery, as required, and blending and/or overlaying the added imagery onto the master image so as to augment but not obscure it. A network protocol is included for sending image data in the compressed domain back and forth between the server and each of the clients.
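One reason blending can be done in the compressed domain is that the DCT is linear: a weighted sum of two pixel blocks has DCT coefficients equal to the same weighted sum of the two coefficient blocks, provided the weight is uniform over the block. The Python sketch below illustrates only that property. The function name and the `alpha` parameter are hypothetical and are not taken from the LAPE system.

```python
def blend_blocks(master, overlay, alpha):
    """Blend two equally sized DCT coefficient blocks.

    alpha is the overlay weight (assumed uniform over the block); values
    below 1 keep the master image visible under the added imagery.
    """
    return [
        [(1 - alpha) * m + alpha * o for m, o in zip(m_row, o_row)]
        for m_row, o_row in zip(master, overlay)
    ]
```

With `alpha=0` the master block is returned unchanged. A small `alpha` adds the overlay so that it augments the master image without obscuring it.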
Abstract:
An original image is sharpened by obtaining a first frequency-domain representation of the original image, selecting one or more elements from this first representation based on one or more criteria such as element magnitude and frequency, scaling the selected elements according to one or more scale factors, and forming a second frequency-domain representation by combining the scaled selected elements with the unselected elements of the first representation. A sharpened reproduction of the original image may be generated by applying an inverse transform to the second frequency-domain representation. A technique for deriving the value of the one or more scale factors is also discussed.
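The select-scale-combine step can be sketched in a few lines of Python. The selection criteria used here (a minimum frequency index `u + v` and a minimum magnitude) and the single scale factor are assumptions for illustration. The abstract does not specify them, and it derives its scale factors by a technique not reproduced here.

```python
def sharpen_coeffs(coeffs, min_freq=1, min_mag=2.0, scale=1.5):
    """Return a second frequency-domain block with selected elements scaled.

    An element at (u, v) is selected when u + v >= min_freq and its
    magnitude is at least min_mag (both thresholds are hypothetical).
    Unselected elements are copied through unchanged.
    """
    out = []
    for u, row in enumerate(coeffs):
        new_row = []
        for v, c in enumerate(row):
            selected = (u + v) >= min_freq and abs(c) >= min_mag
            new_row.append(c * scale if selected else c)
        out.append(new_row)
    return out
```

Applying an inverse transform to the returned block would then give the sharpened reproduction. The DC term is left alone in this sketch so that average brightness is preserved.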
Abstract:
An automatic testing method and device is described that can test a video sequence coder/decoder system and either assess the quality of decoded sequences or rate the fidelity of the coding chain. The device produces synthetic test patterns that induce the appearance of known artifacts, then tracks and evaluates such artifacts. Based on this evaluation, it can rate the system's performance in a way that correlates well with human assessments. In our testing device, the quality estimation module performs this function.
Abstract:
Downsampling and inverse motion compensation are performed on compressed domain representations for video. By directly manipulating the compressed domain representation instead of the spatial domain representation, computational complexity is significantly reduced. For downsampling, the compressed stream is processed in the compressed (DCT) domain without explicit decompression and spatial domain downsampling so that the resulting compressed stream corresponds to a scaled down image, ensuring that the resulting compressed stream conforms to the standard syntax of 8×8 DCT matrices. For typical data sets, this approach of downsampling in the compressed domain results in computation savings around 80% compared with traditional spatial domain methods for downsampling from compressed data. For inverse motion compensation, motion compensated compressed video is converted into a sequence of DCT domain blocks corresponding to the spatial domain blocks in the current picture alone. By performing inverse motion compensation directly in the compressed domain, the reduction in computation complexity is around 68% compared with traditional spatial domain methods for inverse motion compensation from compressed data. The techniques for downsampling and inverse motion compensation can be used in a variety of applications, such as multipoint video conferencing and video editing.
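The following one-dimensional Python sketch shows why downsampling can operate on DCT coefficients directly. Inverse DCT, pair averaging, and forward DCT are all linear, so they can be folded into one precomputed matrix that maps the coefficients of two adjacent 8-point blocks to the coefficients of a single 8-point block. This illustrates the principle only. It is not the patented algorithm, it makes no attempt to reproduce the reported savings, and every name in it is hypothetical.

```python
import math

N = 8


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix T, so that coefficients = T * samples."""
    rows = []
    for k in range(n):
        s = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [s * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


T = dct_matrix()
T_INV = [list(col) for col in zip(*T)]  # orthonormal, so inverse = transpose

# Inverse DCT of two adjacent blocks: a 16x16 block-diagonal matrix.
INV2 = [[0.0] * (2 * N) for _ in range(2 * N)]
for r in range(N):
    for c in range(N):
        INV2[r][c] = T_INV[r][c]
        INV2[r + N][c + N] = T_INV[r][c]

# Spatial 2:1 downsampling by averaging neighboring sample pairs (8x16).
AVG = [[0.5 if c in (2 * r, 2 * r + 1) else 0.0 for c in range(2 * N)] for r in range(N)]

# Single precomputed operator acting on coefficients only.
DOWNSAMPLE = matmul(T, matmul(AVG, INV2))


def downsample_dct(coeffs_a, coeffs_b):
    """Map the DCT coefficients of two adjacent blocks to one output block."""
    return matvec(DOWNSAMPLE, list(coeffs_a) + list(coeffs_b))
```

The output of `downsample_dct` is itself an 8-point DCT block, which mirrors the abstract's requirement that the result conform to the standard block syntax. A two-dimensional version would apply the same operator along rows and columns.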
Abstract:
In one example, a method includes identifying a pixel in an image frame that is a candidate for causing crosstalk between the image frame and a corresponding image frame in a multiview image system. The method further includes, for a pixel identified as a candidate for causing crosstalk, applying crosstalk correction to the pixel. The method further includes applying a location-based adjustment to the pixel, wherein the location-based adjustment is based at least in part on which of two or more portions of the image frame the pixel is in.
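As a rough illustration of the three steps (identify a candidate, correct it, adjust by location), the Python sketch below processes one row of a frame together with the corresponding row of the other view. Everything specific in it is an assumption: the candidate test (a large difference between corresponding pixels), the linear leakage model, and the per-portion gains that scale the correction are hypothetical and are not taken from the abstract.

```python
def correct_row(row, other_row, diff_threshold=50, leak=0.1,
                portion_gains=(1.2, 1.0, 1.2)):
    """Crosstalk-correct one row of pixel intensities (0..255).

    Assumptions: a pixel is a candidate when it differs from the
    corresponding pixel in the other view by more than diff_threshold;
    correction subtracts a fraction `leak` of the other view's pixel; the
    location-based adjustment scales that correction by a gain chosen by
    which horizontal portion of the row the pixel lies in.
    """
    width = len(row)
    portions = len(portion_gains)
    out = []
    for x, (p, q) in enumerate(zip(row, other_row)):
        if abs(p - q) > diff_threshold:
            portion = min(x * portions // width, portions - 1)
            p = p - leak * portion_gains[portion] * q
            p = max(0.0, min(255.0, p))  # clamp to the valid range
        out.append(p)
    return out
```

Pixels that are not candidates pass through unchanged, so the correction cost is only paid where crosstalk is likely to be visible.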
Abstract:
Systems and methods for compressing video data are provided. The method includes segmenting a video frame, selecting a coding mode, and encoding. The segmenting includes segmenting the video frame of the video data into a sequence of coding blocks. The selecting includes selecting the coding mode from a plurality of coding modes. The selecting of the coding mode is based on an allowable bit budget and occurs for each coding block. The encoding includes encoding each coding block based on the coding mode. The allowable bit budget varies according to a bit utilization of prior encoded coding blocks and varies such that the video frame does not exceed a specified compression ratio.
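The per-block mode selection under a varying bit budget can be sketched as follows. This is a simplified Python illustration under stated assumptions: the mode names and bit costs are hypothetical, the allowable budget is taken to be the remaining frame budget divided evenly over the remaining blocks, and the frame-level limit is only guaranteed if the cheapest mode of every block fits.

```python
def choose_modes(block_options, frame_budget):
    """Pick one coding mode per block without exceeding frame_budget.

    block_options: one list per coding block of (mode_name, bits) pairs,
    ordered from most to least preferred (hypothetical data).
    Returns (chosen_mode_names, total_bits_used).
    """
    used = 0
    chosen = []
    remaining = len(block_options)
    for options in block_options:
        # The allowable budget adapts to the bits spent on earlier blocks.
        allowable = (frame_budget - used) / remaining
        fitting = [m for m in options if m[1] <= allowable]
        pick = fitting[0] if fitting else min(options, key=lambda m: m[1])
        chosen.append(pick[0])
        used += pick[1]
        remaining -= 1
    return chosen, used
```

Blocks coded cheaply early in the frame leave more budget for later blocks, which is how the allowable budget in this sketch tracks the bit utilization of prior blocks.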
Abstract:
In one example, a method includes identifying a first set of pixels in co-located pairs in a corresponding pair of multiview image frames for which the co-located pairs have a disparity between the pixels that is greater than a selected disparity threshold. The method further includes identifying a second set of pixels in at least one of the image frames that are within a selected distance of an intensity transition greater than a selected intensity transition threshold. The method further includes applying crosstalk correction to pixels that are identified as being in at least one of the first set and the second set.
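The two selection rules can be sketched on a single row of pixels in Python. The sketch assumes that a per-pixel disparity value for each co-located pair is already available, and the threshold values and the definition of "within a selected distance" (measured in pixels from either side of the transition) are illustrative assumptions.

```python
def crosstalk_candidates(frame_row, disparity, disp_thresh=4,
                         edge_thresh=60, distance=1):
    """Return the indices of pixels that should receive crosstalk correction.

    First set: pixels whose disparity magnitude exceeds disp_thresh.
    Second set: pixels within `distance` of an intensity transition larger
    than edge_thresh. The result is the union of both sets.
    """
    first = {i for i, d in enumerate(disparity) if abs(d) > disp_thresh}

    second = set()
    last = len(frame_row) - 1
    for e in range(last):
        if abs(frame_row[e + 1] - frame_row[e]) > edge_thresh:
            # The transition lies between pixels e and e + 1.
            lo = max(e - distance, 0)
            hi = min(e + 1 + distance, last)
            second.update(range(lo, hi + 1))

    return first | second
```

Correction would then be applied only to the returned indices, which covers pixels in at least one of the two sets.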
Abstract:
An apparatus and method are described for filtering noise internally within a video encoding framework. In various embodiments of the invention, an in-loop noise filter is integrated within an encoding device or framework that reduces noise along a motion trajectory within a digital video signal. This integration of in-loop noise reduction allows both noise filtering parameters and encoding parameters to be more easily related and adjusted. The in-loop noise filter leverages characteristics of digital video encoding processes to reduce noise on a video signal and improve encoding efficiencies of a codec.
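Filtering along a motion trajectory can be pictured as blending each pixel with the pixel it maps to in the previously reconstructed frame. The one-dimensional Python sketch below is an assumption-laden illustration rather than the described apparatus: the blend weight `strength`, the `max_diff` guard that skips filtering when the match looks poor, and the per-pixel motion values are all hypothetical.

```python
def temporal_filter(current, reference, motion, strength=0.5, max_diff=20):
    """Blend each pixel with its motion-compensated reference pixel.

    motion[i] is the displacement such that current[i] corresponds to
    reference[i + motion[i]] (clamped to the row bounds). Pixels whose
    reference differs by more than max_diff are left unfiltered, on the
    assumption that the difference is real detail rather than noise.
    """
    out = []
    for i, p in enumerate(current):
        j = min(max(i + motion[i], 0), len(reference) - 1)
        r = reference[j]
        if abs(p - r) <= max_diff:
            out.append((1 - strength) * p + strength * r)
        else:
            out.append(float(p))
    return out
```

Because an encoder already estimates motion for prediction, a filter of this kind can reuse those vectors inside the coding loop, which is the integration the abstract points to.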