Abstract:
A method and system for determining a quality metric score for image processing are described. The method includes accepting a reference image, performing a pyramid transformation on the accepted reference image to produce a predetermined number of scales, and applying image division to each scale to produce reference image patches. A distorted image is accepted and processed in the same way: a pyramid transformation produces the predetermined number of scales, and image division of each scale produces distorted image patches. A local distortion calculation is performed for corresponding reference and distorted image patches, and the local distortion calculation results are summed over the image patch pairs. The results of this summation are multiplied by a positive weight for each scale, the results of the multiplication are summed, and a sigmoid function is applied to the results of the second summation to produce the quality metric score.
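The pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the claimed method: the abstract does not specify the pyramid transformation, the patch layout, or the local distortion measure, so 2x2 averaging, non-overlapping square patches, and mean squared error are used as stand-ins. All function names and the `patch_size` parameter are hypothetical.

```python
import math

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (assumed stand-in for one pyramid step)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x+1] +
              img[2*y+1][2*x] + img[2*y+1][2*x+1]) / 4.0
             for x in range(w)] for y in range(h)]

def pyramid(img, num_scales):
    """Return a list of num_scales images, from full resolution downwards."""
    scales = [img]
    for _ in range(num_scales - 1):
        scales.append(downsample(scales[-1]))
    return scales

def patches(img, size):
    """Divide an image into non-overlapping size x size patches (flattened lists)."""
    out = []
    for y in range(0, len(img) - size + 1, size):
        for x in range(0, len(img[0]) - size + 1, size):
            out.append([img[y+dy][x+dx] for dy in range(size) for dx in range(size)])
    return out

def local_distortion(ref_patch, dist_patch):
    """Mean squared error between two patches (an assumed distortion measure)."""
    return sum((a - b) ** 2 for a, b in zip(ref_patch, dist_patch)) / len(ref_patch)

def quality_score(ref, dist, weights, patch_size=2):
    """Weighted multi-scale sum of local distortions, squashed by a sigmoid."""
    ref_scales = pyramid(ref, len(weights))
    dist_scales = pyramid(dist, len(weights))
    total = 0.0
    for w, r, d in zip(weights, ref_scales, dist_scales):
        # First summation: over all patch pairs at this scale.
        scale_sum = sum(local_distortion(p, q)
                        for p, q in zip(patches(r, patch_size), patches(d, patch_size)))
        # Multiply by the positive per-scale weight; second summation: over scales.
        total += w * scale_sum
    return 1.0 / (1.0 + math.exp(-total))
```

With this choice of distortion measure, identical images give a score of 0.5 and the score rises towards 1 as distortion grows; whether a higher score should mean better or worse quality is not stated in the abstract.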
Abstract:
A particular implementation decomposes an image into a structure component and a texture component. An edge strength map is calculated for the structure component, and a texture strength map is calculated for the texture component. Using the edge strength and the texture strength, texture masking weights are calculated. The stronger the texture strength is, or the weaker the edge strength is, the more distortion can be tolerated by human eyes, and thus, the smaller the texture masking weight is. The local distortions are then weighted by the texture masking weights to generate an overall distortion level or an overall quality metric.
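The weighting step can be illustrated as follows. This sketch assumes the edge strength and texture strength maps have already been computed and flattened to lists; the decomposition into structure and texture components is not shown. The weight formula is a hypothetical choice that merely has the stated behaviour (the weight shrinks as texture strength grows or edge strength falls), and the function names are illustrative.

```python
def masking_weight(edge_strength, texture_strength):
    """Hypothetical weight in (0, 1]: decreases with texture, increases with edges."""
    return (1.0 + edge_strength) / (1.0 + edge_strength + texture_strength)

def overall_distortion(local_distortions, edge_map, texture_map):
    """Average of local distortions, each scaled by its texture masking weight."""
    weights = [masking_weight(e, t) for e, t in zip(edge_map, texture_map)]
    weighted = sum(w * d for w, d in zip(weights, local_distortions))
    return weighted / len(local_distortions)
```

For example, two locations with the same local distortion of 4.0 contribute 4.0 and 2.0 respectively when the second one has edge strength 1.0 and texture strength 2.0, because distortion there is assumed to be partly masked.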
Abstract:
A particular implementation forms an initial reconstructed image block from inverse quantization and inverse transform, and further refines the reconstructed image block using pixels from neighboring reconstructed blocks. The image block may be refined using a bilateral filter, whose space parameter and range parameter are adaptive to the quantization parameter. The particular implementation can be used in both encoding and decoding when reconstructing an image block. When used in encoding, the particular implementation can be used jointly with coefficient truncation, where some non-zero transform coefficients are set to zero. The number of remaining non-zero transform coefficients after coefficient truncation may be adaptive to the quantization parameter, the variance of the image block, the number of non-zero transform coefficients of the image block, and the index of the last non-zero transform coefficient in a zigzag scanning order.
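A bilateral filter whose parameters follow the quantization parameter might look like the sketch below. The linear mapping in `filter_params` and its constants are assumptions made for illustration; the abstract only states that the space and range parameters adapt to the quantization parameter. For simplicity the filter is applied to a whole 2-D array, which is assumed to hold the block together with its reconstructed neighbors.

```python
import math

def filter_params(qp):
    """Hypothetical mapping from QP to (sigma_space, sigma_range); constants are assumed."""
    return 0.5 + 0.05 * qp, 1.0 + 0.5 * qp

def bilateral(img, qp, radius=1):
    """Bilateral-filter a 2-D list of pixel values with QP-dependent parameters."""
    sigma_s, sigma_r = filter_params(qp)
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        diff = img[yy][xx] - img[y][x]
                        # Weight falls off with spatial distance and intensity difference.
                        wgt = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2)
                                       - diff * diff / (2 * sigma_r ** 2))
                        num += wgt * img[yy][xx]
                        den += wgt
            out[y][x] = num / den
    return out
```

Under this assumed mapping, a larger quantization parameter widens both Gaussians, so coarsely quantized blocks are smoothed more strongly.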
Abstract:
The invention proposes modifying quantized coefficients to signal a post-processing method. To this end, a method is proposed for lossy compress-encoding data comprising at least one of image data and audio data. Said method comprises determining quantized coefficients using a quantization of a discrete cosine transformed residual of a prediction of said data. Said method further comprises modifying said quantized coefficients to minimize rate-distortion cost, wherein distortion is determined using a post-processed reconstruction of the data, the post-processed reconstruction being post-processed according to a post-processing method, and compress-encoding said modified coefficients. In said proposed method, the post-processing method is the one of n>1 different predetermined post-processing method candidates whose position in a predetermined order of arrangement of the post-processing method candidates equals the remainder of division, by n, of the sum of the modified coefficients. Doing so removes the overhead of flags in the bit stream.
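The implicit signalling rule can be sketched as follows. The function names are hypothetical, positions are assumed to be zero-based, and `force_index` is a deliberately naive illustration of how an encoder could steer the sum towards a wanted candidate; it is not the rate-distortion optimization the abstract describes.

```python
def select_postprocessor(coeffs, n):
    """Return the position of the signalled candidate: sum of coefficients modulo n.

    Python's % yields a value in [0, n) for positive n, even when the sum is negative.
    """
    return sum(coeffs) % n

def force_index(coeffs, n, target):
    """Naive illustration: raise the last coefficient so the sum signals `target`."""
    delta = (target - sum(coeffs)) % n
    out = list(coeffs)
    out[-1] += delta
    return out
```

A decoder that knows the ordered candidate list only needs the decoded coefficients to recover the choice, which is why no explicit flag has to be transmitted.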
Abstract:
Methods and apparatuses for image regularization are described. The image regularization is performed through total variation (TV) lθ regularization, wherein 0 θ lθ regularization is then performed for each image block using the calculated gradient blocks. The process is performed iteratively to alleviate the problem of non-regularized neighboring blocks affecting the regularization of the current block. The TV lθ regularization is applicable to applications such as image denoising, video compression, and exposure fusion.
Abstract:
Various implementations relate to providing a pictorial summary, also referred to as a comic book or a narrative abstraction. In one particular implementation, a first portion in a video is accessed, and a second portion in the video is accessed. A weight for the first portion is determined, and a weight for the second portion is determined. A first number and a second number are determined. The first number identifies how many pictures from the first portion are to be used in a pictorial summary of the video. The first number is one or more, and is determined based on the weight for the first portion. The second number identifies how many pictures from the second portion are to be used in the pictorial summary of the video. The second number is one or more, and is determined based on the weight for the second portion.
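One way to turn portion weights into picture counts is sketched below. The overall picture `budget` and the proportional rounding rule are assumptions introduced for illustration; the abstract only requires that each count be one or more and depend on the portion's weight.

```python
def picture_counts(weights, budget):
    """Allocate pictures to portions in proportion to weight, with at least one each.

    `budget` is a hypothetical target size for the whole summary; because of
    rounding and the minimum of one, the counts may not sum exactly to it.
    """
    total = sum(weights)
    return [max(1, round(budget * w / total)) for w in weights]
```

For instance, weights of 3.0 and 1.0 with a budget of 8 give 6 and 2 pictures, while a portion with a very small weight still contributes one picture.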
Abstract:
Accuracy and efficiency of video quality measurement are major problems to be solved. According to the invention, a method (506) for accurately predicting video quality uses a rational function of the quantization parameter QP, which is corrected by a correction function that depends on content unpredictability CU. Exemplarily, the correction function is a power function of the CU. Both QP and CU can be computed (511) from the video elementary stream, without fully decoding the video. This ensures high efficiency.
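The shape of such a predictor can be sketched as below. Every constant here is a hypothetical placeholder, and combining the two terms by multiplication is an assumption; the abstract states only that a rational function of QP is corrected by a power function of CU.

```python
def predict_score(qp, cu, a1=1.0, a0=0.0, b0=50.0, c=1.0, d=0.5):
    """Rational function of QP times a power function of CU (all constants assumed)."""
    base = (a1 * qp + a0) / (qp + b0)   # rational function of the quantization parameter
    correction = c * cu ** d            # power-function correction for content unpredictability
    return base * correction
```

In practice the constants would have to be fitted against subjective quality data; the values above exist only to make the sketch runnable.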
Abstract:
Because neighboring frames may affect how a current frame is perceived, we examine different neighborhoods of the current frame and select a neighborhood that impacts the perceived temporal distortion (i.e., when frames are viewed continuously) of the current frame most significantly. Based on spatial distortion (i.e., when a frame is viewed independently of other frames in a video sequence) of frames in the selected neighborhood, we can estimate initial temporal distortion. To refine the initial temporal distortion, we also consider the distribution of distortion in the selected neighborhood, for example, the distance between the current frame and a closest frame with large distortion, or whether distortion occurs in consecutive frames.
Abstract:
Spatial distortion (i.e., when a frame is viewed independently of other frames in a video sequence) may be quite different from temporal distortion (i.e., when frames are viewed continuously). To estimate temporal distortion, a sliding window approach is used. Specifically, multiple sliding windows around a current frame are considered. Within each sliding window, a large distortion density is calculated and a sliding window with the highest large distortion density is selected. A distance between the current frame and the closest frame with large distortion in the selected window is calculated. Subsequently, the temporal distortion is estimated as a function of the highest large distortion ratio, the spatial distortion for the current frame, and the distance. In another embodiment, a median of spatial distortion values is calculated for each sliding window and the maximum of median spatial distortion values is used to estimate the temporal distortion.
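The first embodiment can be sketched as follows. The window length, the threshold that defines "large" distortion, and in particular the final combining formula are assumptions; the abstract says only that temporal distortion is a function of the highest density, the current frame's spatial distortion, and the distance. Function and parameter names are hypothetical.

```python
def temporal_distortion(spatial, n, win_len, threshold):
    """Estimate temporal distortion for frame n from per-frame spatial distortions.

    Returns (estimate, highest_density, distance). The combining formula is a
    hypothetical placeholder.
    """
    best_density, best_window = -1.0, None
    # Consider every window of win_len frames that contains frame n.
    for start in range(n - win_len + 1, n + 1):
        lo, hi = max(0, start), min(len(spatial), start + win_len)
        window = range(lo, hi)
        # Large distortion density: share of frames in the window above the threshold.
        density = sum(1 for i in window if spatial[i] > threshold) / win_len
        if density > best_density:
            best_density, best_window = density, window
    large = [i for i in best_window if spatial[i] > threshold]
    if not large:
        return 0.0, best_density, None
    # Distance from the current frame to the closest frame with large distortion.
    distance = min(abs(i - n) for i in large)
    estimate = spatial[n] * best_density / (1 + distance)
    return estimate, best_density, distance
```

With per-frame distortions `[0, 0, 5, 5, 1, 0, 0]`, current frame 4, a window of 3 frames, and a threshold of 2, the selected window covers frames 2 to 4, with two of its three frames above the threshold and the closest one a single frame away.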
Abstract:
Various implementations relate to providing a pictorial summary, also referred to as a comic book or a narrative abstraction. In one particular implementation, one or more parameters from a configuration guide are accessed. The configuration guide includes one or more parameters for configuring a pictorial summary of a video. The video is accessed. The pictorial summary for the video is generated. The pictorial summary conforms to the one or more accessed parameters from the configuration guide.