Abstract:
Methods and apparatus are provided for reducing vector quantization error through patch shifting. A method generates, from an input video sequence, one or more high resolution replacement patches for replacing one or more low resolution patches during a reconstruction of the input video sequence. The generating step uses data corresponding to a patch spatial shifting process, which reduces jittery artifacts caused by a motion-induced vector quantization error in the one or more high resolution replacement patches. The data is used at least to derive a patch size for the one or more high resolution replacement patches, such that they are generated with a patch size greater than that of the one or more low resolution patches and are therefore suitable for use in the patch spatial shifting process.
Abstract:
A method and apparatus for recovering a pruned version of a picture in a video sequence are disclosed. The apparatus includes a divider for dividing the pruned version of the picture into a plurality of non-overlapping blocks. The apparatus also includes a metadata decoder for decoding metadata for use in recovering the pruned version of the picture. The apparatus further includes a patch library creator for creating a patch library from a reconstructed version of the picture. The patch library includes a plurality of high resolution replacement patches for replacing one or more pruned blocks during a recovery of the pruned version of the picture. The apparatus additionally includes a search and replacement device for performing a searching process using the metadata to find a corresponding patch for a respective one of the one or more pruned blocks from among the plurality of non-overlapping blocks and replace the respective one of the one or more pruned blocks with the corresponding patch. A signature is respectively created for each of the one or more pruned blocks. The pruned version of the picture is recovered by comparing respective distance metrics from signatures for each of the plurality of high resolution patches to signatures for each of the one or more pruned blocks, and sorting the respective distance metrics to obtain a rank list for each of the one or more pruned blocks. A rank number in the rank list for a particular one of the one or more pruned blocks is used to retrieve a corresponding one of the plurality of high resolution patches in the patch library to replace that pruned block. A patch dependency graph having a plurality of nodes and a plurality of edges is used to recover the pruned version of the picture. Each of the plurality of nodes represents a respective one of the plurality of overlapping blocks, and each of the plurality of edges represents a respective dependency of at least the respective one of the plurality of overlapping blocks.
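The rank-list lookup described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the signature format or the distance metric, so the flat numeric signatures, the squared Euclidean distance, and the function names `squared_distance`, `rank_list`, and `retrieve_patch` are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two signatures (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_list(block_signature, patch_signatures):
    """Return patch indices sorted by ascending distance to the block signature.

    Ties are broken by patch index so encoder and decoder agree on the order.
    """
    scored = [
        (squared_distance(block_signature, sig), idx)
        for idx, sig in enumerate(patch_signatures)
    ]
    scored.sort()
    return [idx for _, idx in scored]


def retrieve_patch(block_signature, patch_signatures, patch_library, rank_number):
    """Use a rank number (e.g. carried in metadata) to pick the replacement patch."""
    ranks = rank_list(block_signature, patch_signatures)
    return patch_library[ranks[rank_number]]


# Hypothetical library of three patches with two-element signatures.
patch_signatures = [[0, 0], [10, 10], [1, 1]]
patch_library = ["patch_A", "patch_B", "patch_C"]
pruned_block_signature = [1, 2]

best = retrieve_patch(pruned_block_signature, patch_signatures, patch_library, 0)
```

Because the decoder can rebuild the same rank list from the reconstructed picture, only the small rank number needs to be conveyed rather than a full patch index or position.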
Abstract:
Techniques are provided for objectively determining perceived video/image quality. The techniques include receiving a degraded bit-stream comprising encoded video/image data and subsequently parsing the bit-stream to extract one or more video/image coding components. The coding components may include intra-prediction modes, discrete cosine transform (DCT) coefficients, motion information, or combinations thereof, and may be used as a basis for objectively predicting a Quality of Experience (QoE) score or Mean Opinion Score (MOS) of the degraded bit-stream.
Abstract:
A method is disclosed for analyzing video to detect far-view scenes in sports video to determine when certain image processing algorithms should be applied. The method comprises analyzing and classifying the fields of view of images from a video signal, creating and classifying the fields of view of sets of sequential images, and selectively applying image processing algorithms to sets of sequential images representing a particular type of field of view.
Abstract:
An embodiment is configured to calculate a perceptual masking factor at a pixel location at a block boundary of the image, calculate a parameter for a filter at the pixel location at the block boundary, and filter the image around the pixel location at the block boundary employing the filter with the calculated parameter. The perceptual masking factor is formed as a product of a background activity masking map and a luminance masking map. The filter includes a parameter that is selected in view of a quality of experience performance measure for the image at the pixel location at the block boundary of the image.
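As a rough illustration of the masking factor described above, the sketch below forms it as the element-wise product of two maps. How the background activity and luminance masking maps are computed is not stated in the abstract, so the input values and the function name `perceptual_masking_factor` are hypothetical.

```python
def perceptual_masking_factor(activity_map, luminance_map):
    """Element-wise product of a background activity masking map and a
    luminance masking map, both given as equally sized 2D lists."""
    if len(activity_map) != len(luminance_map):
        raise ValueError("maps must have the same number of rows")
    result = []
    for activity_row, luminance_row in zip(activity_map, luminance_map):
        if len(activity_row) != len(luminance_row):
            raise ValueError("maps must have the same number of columns")
        result.append([a * l for a, l in zip(activity_row, luminance_row)])
    return result


# Hypothetical 2x2 maps; real maps would cover the pixels at block boundaries.
activity = [[1.0, 2.0], [0.5, 4.0]]
luminance = [[2.0, 0.5], [2.0, 0.25]]
masking = perceptual_masking_factor(activity, luminance)
```

The resulting per-pixel factor would then feed the selection of the filter parameter at each block-boundary location; that selection rule is not sketched here because the abstract does not define it.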
Abstract:
Methods and apparatus are provided for decoding video signals using motion compensated example-based super-resolution for video compression. An apparatus includes an example-based super-resolution processor for receiving one or more high resolution replacement patch pictures generated from a static version of an input video sequence having motion, and performing example-based super-resolution to generate a reconstructed version of the static version of the input video sequence from the one or more high resolution replacement patch pictures. The reconstructed version of the static version of the input video sequence includes a plurality of pictures. The apparatus further includes an inverse image warper for receiving motion parameters for the input video sequence, and performing an inverse picture warping process based on the motion parameters to transform one or more of the plurality of pictures to generate a reconstruction of the input video sequence having the motion.
Abstract:
A method and apparatus are provided for efficient reference data encoding for video compression by image content based search and ranking. An apparatus includes a rank transformer for respectively transforming reference data for each of a plurality of candidate reference blocks with respect to a current block to be encoded into a respective rank number therefor based on a context feature of the current block with respect to the context feature of each of the plurality of candidate reference blocks. The apparatus further includes an entropy encoder for respectively entropy encoding the respective rank number for each of the plurality of candidate reference blocks with respect to the current block in place of, and representative of, the reference data for each of the plurality of candidate reference blocks with respect to the current block.
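The rank transformation described above might look like the following sketch, which replaces a candidate's index with its rank under a context-feature distance. The abstract does not define the context feature or the distance, so the short numeric feature vectors, the sum of absolute differences, and the function names are assumptions; entropy coding of the rank is omitted.

```python
def context_distance(a, b):
    """Sum of absolute differences between two context features (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_candidates(current_context, candidate_contexts):
    """Candidate indices ordered from most to least similar context."""
    return sorted(
        range(len(candidate_contexts)),
        key=lambda i: (context_distance(current_context, candidate_contexts[i]), i),
    )


def reference_to_rank(chosen_index, current_context, candidate_contexts):
    """Encoder side: turn the chosen candidate's index into its rank number."""
    return rank_candidates(current_context, candidate_contexts).index(chosen_index)


def rank_to_reference(rank, current_context, candidate_contexts):
    """Decoder side: recover the candidate index from the rank number."""
    return rank_candidates(current_context, candidate_contexts)[rank]


# Hypothetical context features for the current block and three candidates.
current = [5, 5]
candidates = [[0, 0], [5, 6], [9, 9]]

rank = reference_to_rank(1, current, candidates)
recovered = rank_to_reference(rank, current, candidates)
```

If the best reference usually has a similar context, small rank numbers dominate, which is what makes the rank cheaper to entropy encode than the raw reference data.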
Abstract:
Methods and apparatus are provided for encoding video signals using example-based data pruning for improved video compression efficiency. An apparatus for encoding a picture in a video sequence includes a patch library creator for creating a first patch library from an original version of the picture and a second patch library from a reconstructed version of the picture. Each of the first patch library and the second patch library includes a plurality of high resolution replacement patches for replacing one or more pruned blocks during a recovery of a pruned version of the picture. The apparatus also includes a pruner for generating the pruned version of the picture from the first patch library, and a metadata generator for generating metadata from the second patch library. The metadata is for recovering the pruned version of the picture. The apparatus further includes an encoder for encoding the pruned version of the picture and the metadata.
Abstract:
A caption detection system wherein all detected caption boxes over time for one caption area are identical, thereby reducing temporal instability and inconsistency. This is achieved by grouping candidate pixels in the 3D spatiotemporal space and generating a 3D bounding box for one caption area. 2D bounding boxes are obtained by slicing the 3D bounding boxes, thereby reducing temporal instability as all 2D bounding boxes corresponding to a caption area are sliced from one 3D bounding box and are therefore identical over time.
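The slicing idea above can be sketched as follows, assuming the candidate pixels of one caption area have already been grouped and are given as `(x, y, t)` tuples. The grouping step itself is not shown, and the function names `bounding_box_3d` and `slice_box` are hypothetical.

```python
def bounding_box_3d(pixels):
    """Axis-aligned 3D bounding box of (x, y, t) candidate pixels.

    Returns (x_min, y_min, t_min, x_max, y_max, t_max).
    """
    xs, ys, ts = zip(*pixels)
    return (min(xs), min(ys), min(ts), max(xs), max(ys), max(ts))


def slice_box(box, t):
    """2D caption box for frame t, or None if t lies outside the 3D box."""
    x_min, y_min, t_min, x_max, y_max, t_max = box
    if not t_min <= t <= t_max:
        return None
    return (x_min, y_min, x_max, y_max)


# Hypothetical candidate pixels of one caption area across three frames.
caption_pixels = [(10, 20, 0), (12, 25, 1), (30, 22, 2)]
box = bounding_box_3d(caption_pixels)
frame_boxes = [slice_box(box, t) for t in range(3)]
```

Every slice comes from the same 3D box, so the 2D boxes are identical across frames even though the detected pixels differ from frame to frame.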
Abstract:
A new high dynamic range image synthesis method that can handle local object motion is provided. An interactive graphical user interface lets the end user specify the source image for each separate part of the final high dynamic range image, either by creating an image mask or by scribbling on the image. The synthesis includes the following steps: capturing low dynamic range images with different exposures; registering the low dynamic range images; estimating a camera response function; converting the low dynamic range images to temporary radiance images using the estimated camera response function; and fusing the temporary radiance images into a single high dynamic range (HDR) image by employing a method of layered masking.
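The last two steps (radiance conversion and mask-based fusion) can be sketched under strong simplifying assumptions. The abstract does not define the camera response function or the layered masking method, so this sketch assumes a linear response by default, already registered single-channel images, and a per-pixel mask that simply names the source image to use. The function names `to_radiance` and `fuse_with_mask` are hypothetical.

```python
def to_radiance(image, exposure_time, inverse_response=lambda v: v):
    """Convert a low dynamic range image to a temporary radiance image.

    inverse_response maps a pixel value back to exposure; the default
    identity function stands in for an estimated camera response (assumption).
    """
    return [[inverse_response(v) / exposure_time for v in row] for row in image]


def fuse_with_mask(radiance_images, source_mask):
    """For each pixel, take the radiance from the image the mask selects."""
    return [
        [radiance_images[source][r][c] for c, source in enumerate(mask_row)]
        for r, mask_row in enumerate(source_mask)
    ]


# Hypothetical 1x2 images: a short and a long exposure of the same scene.
short_exposure = to_radiance([[100, 200]], exposure_time=0.5)
long_exposure = to_radiance([[240, 255]], exposure_time=2.0)

# User-specified mask: left pixel from image 1 (long), right pixel from image 0 (short).
mask = [[1, 0]]
hdr = fuse_with_mask([short_exposure, long_exposure], mask)
```

Taking each region from a single user-chosen source avoids blending pixels of a moving object across exposures, which is one plausible reading of how masking handles local object motion.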