Abstract:
A quality of a virtual image for a synthetic viewpoint in a 3D scene is determined. The 3D scene is acquired by texture images, and each texture image is associated with a depth image acquired by a camera arranged at a real viewpoint. A texture noise power is based on the acquired texture images and reconstructed texture images corresponding to a virtual texture image. A depth noise power is based on the depth images and reconstructed depth images corresponding to a virtual depth image. The quality of the virtual image is based on a combination of the texture noise power and the depth noise power, and the virtual image is rendered from the reconstructed texture images and the reconstructed depth images.
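The abstract does not give the formulas for the two noise powers or say how they are combined. As a rough illustration only, the sketch below assumes each noise power is a mean squared error between an acquired image and its reconstruction, that the combination is a weighted sum, and that quality is reported as a PSNR. The names `noise_power`, `virtual_view_psnr`, and `depth_weight` are hypothetical.

```python
import math

def noise_power(original, reconstructed):
    """Mean squared error between two equally sized images (lists of rows).

    Assumption: MSE stands in for the noise power named in the abstract.
    """
    diffs = [(o - r) ** 2
             for row_o, row_r in zip(original, reconstructed)
             for o, r in zip(row_o, row_r)]
    return sum(diffs) / len(diffs)

def virtual_view_psnr(texture_np, depth_np, depth_weight=1.0, peak=255.0):
    """Combine the two noise powers and express the result as a PSNR in dB.

    Assumption: a weighted sum; depth_weight is a hypothetical factor that
    maps depth noise to its effect on the rendered virtual image.
    """
    total = texture_np + depth_weight * depth_np
    if total == 0:
        return math.inf  # no noise at all
    return 10.0 * math.log10(peak ** 2 / total)
```

In practice the effect of depth noise on a rendered view depends on the scene and the camera geometry, so a single constant weight is a simplification.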
Abstract:
A bitstream includes a sequence of frames. Each frame is partitioned into encoded blocks. For each block, a set of paths is determined at a transform angle determined from a transform index in the bitstream. Transform coefficients are obtained from the bitstream. The transform coefficients include one DC coefficient for each path. An inverse transform is applied to the transform coefficients to produce a decoded video.
Abstract:
A method acquires a plurality of input videos. The frames of each input video are acquired at a fixed sampling rate. Joint analysis is applied concurrently and in parallel to the input videos to determine a variable and non-uniform temporal sampling rate for each input video so that a combined distortion is minimized and a combined frame rate constraint is satisfied. Each input video is then sampled at the associated variable and non-uniform temporal sampling rate to produce output videos having variable temporal resolutions.
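The abstract does not describe the joint analysis itself. One way to picture the idea is the minimal sketch below, which assumes that skipping a frame incurs a known, independent distortion cost and that the combined frame rate constraint is a total budget of kept frames across all videos. Under those assumptions, keeping the frames with the highest costs minimizes the summed distortion. The function name and the `drop_costs` input are hypothetical.

```python
def joint_frame_selection(drop_costs, budget):
    """Choose which frames to keep across several videos.

    drop_costs[v][f] is the (assumed, independent) distortion incurred if
    frame f of video v is skipped. budget is the total number of frames
    that may be kept over all videos combined.
    Returns one sorted list of kept frame indices per video.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    candidates = [(cost, v, f)
                  for v, costs in enumerate(drop_costs)
                  for f, cost in enumerate(costs)]
    # Most costly-to-drop frames first; the sort is stable, so ties keep
    # their original (video, frame) order.
    candidates.sort(key=lambda c: -c[0])
    kept = [[] for _ in drop_costs]
    for _, v, f in candidates[:budget]:
        kept[v].append(f)
    for frames in kept:
        frames.sort()
    return kept
```

Because the frames of all videos compete for the same budget, a video with more activity ends up with more kept frames, and the kept frames within each video are generally not evenly spaced.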
Abstract:
A method and system reduce the spatial resolution of a compressed bitstream of a sequence of frames of a video signal by first decoding the frames, and storing the decoded frames in a first frame buffer. While performing the decoding, motion compensation is performed with full resolution motion vectors of the stored decoded frames. The decoded frames are then down-sampled to a reduced resolution, and stored in a second frame buffer. The reduced resolution frames are partially encoded to produce a reduced resolution compressed bitstream of the video. While performing the partial encoding, motion compensation is performed with reduced resolution motion vectors of the stored reduced resolution frames.
Abstract:
A model stored in a memory accessible by a video transcoder includes a first rate-distortion function modeling a requantization of an input video. A second rate-distortion function models a resynchronization marker insertion rate for the transcoded video, and a third rate-distortion function models an intra-block insertion rate for the transcoded video.
Abstract:
A method classifies pixels in an image by first partitioning the image into blocks. A variance of an intensity is determined for each pixel, and for each block the pixel with the maximum variance is identified. Then, the blocks are classified into classes according to the maximum variance.
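The steps in this abstract can be sketched as follows. The abstract does not say how the per-pixel variance is computed or where the class boundaries lie, so the sketch assumes a 3x3 neighborhood (clipped at the image border) and two hypothetical thresholds that give three classes.

```python
from statistics import pvariance

def pixel_variance(img, r, c):
    """Intensity variance over the 3x3 neighborhood of pixel (r, c).

    Assumption: a 3x3 window, clipped at the image borders.
    """
    h, w = len(img), len(img[0])
    window = [img[i][j]
              for i in range(max(0, r - 1), min(h, r + 2))
              for j in range(max(0, c - 1), min(w, c + 2))]
    return pvariance(window)

def classify_blocks(img, block=4, thresholds=(10.0, 100.0)):
    """Return one class index per block, based on the block's maximum pixel variance.

    The class index is the number of thresholds the maximum variance reaches,
    so two thresholds give classes 0, 1, and 2. Threshold values are assumptions.
    """
    h, w = len(img), len(img[0])
    classes = []
    for r0 in range(0, h, block):
        row = []
        for c0 in range(0, w, block):
            max_var = max(pixel_variance(img, r, c)
                          for r in range(r0, min(h, r0 + block))
                          for c in range(c0, min(w, c0 + block)))
            row.append(sum(max_var >= t for t in thresholds))
        classes.append(row)
    return classes
```

For example, a flat block yields a maximum variance of zero and lands in class 0, while a block containing a sharp edge lands in the highest class.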
Abstract:
A method extracts high-level features from a video including a sequence of frames. Low-level features are extracted from each frame of the video. Each frame of the video is labeled according to the extracted low-level features to generate sequences of labels. Each sequence of labels is associated with one of the extracted low-level features. The sequences of labels are analyzed using machine learning techniques to extract high-level features of the video.
Abstract:
A method determines distortion in a video by measuring a spatial distortion in coded frames, and by measuring a temporal distortion and spatial distortion in uncoded frames. The spatial distortion of the coded frames is combined with the temporal distortion and the spatial distortion of the uncoded frames to determine a total average distortion in the video.
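The abstract does not state how the distortions are combined. A minimal sketch, assuming that each uncoded (skipped) frame contributes the sum of its temporal and spatial distortion and that the total is a plain average over all frames, might look like this. The function name and argument layout are hypothetical.

```python
def total_average_distortion(coded_spatial, uncoded_spatial, uncoded_temporal):
    """Average distortion over all frames of a video.

    coded_spatial:    spatial distortion of each coded frame
    uncoded_spatial:  spatial distortion of each uncoded frame
    uncoded_temporal: temporal distortion of each uncoded frame
    Assumption: the distortion of an uncoded frame is the sum of its two terms.
    """
    if len(uncoded_spatial) != len(uncoded_temporal):
        raise ValueError("uncoded frames need one spatial and one temporal value each")
    frame_count = len(coded_spatial) + len(uncoded_spatial)
    if frame_count == 0:
        raise ValueError("at least one frame is required")
    total = sum(coded_spatial) + sum(
        s + t for s, t in zip(uncoded_spatial, uncoded_temporal))
    return total / frame_count
```

With two coded frames of distortion 2.0 and 4.0 and one uncoded frame with spatial distortion 1.0 and temporal distortion 5.0, the total is 12.0 over three frames, giving an average of 4.0.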
Abstract:
A method estimates rate and distortion characteristics of a video object. First and second object shape features are respectively extracted at a first and second resolution of the video object. First and second rate-distortion characteristics of the video object are respectively determined from the extracted first and second object shape features according to first and second modeling parameters. The extracted object shape features can be discrete, such as states of binary shape patterns of the video object, or continuous, such as a set of statistical moments representing a probability density function of the video object.