Abstract:
Techniques for objectively determining perceived video/image quality, the techniques including receiving a degraded bit-stream comprising encoded video/image data, and subsequently parsing the bit-stream to extract one or more video/image coding components. The video coding components may include intra-prediction modes, discrete cosine transform (DCT) coefficients, motion information, or combinations thereof, and may be used as a basis for objectively predicting a Quality of Experience (QoE) or Mean Opinion Score (MOS) of the degraded bit-stream.
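As a rough illustration of how parsed coding components might feed an objective quality estimate, the Python sketch below maps per-frame bit-stream statistics to a MOS-like score. The feature names, the linear weights, and the 1-5 scaling are illustrative assumptions, not the claimed model; a real system would obtain these statistics by parsing an actual encoded bit-stream.

# Minimal sketch of bitstream-feature-based MOS prediction. The feature names,
# weights, and the linear model below are illustrative assumptions, not the
# patented method; a real system would parse an actual encoded bit-stream.
from dataclasses import dataclass
import statistics

@dataclass
class ParsedFrame:
    intra_mode_ratio: float       # fraction of blocks coded with intra prediction
    mean_abs_dct_coeff: float     # average magnitude of decoded DCT coefficients
    mean_motion_magnitude: float  # average motion-vector length in pixels

def predict_mos(frames: list[ParsedFrame]) -> float:
    """Map parsed coding statistics to a 1-5 MOS-like score (toy linear model)."""
    intra = statistics.mean(f.intra_mode_ratio for f in frames)
    dct = statistics.mean(f.mean_abs_dct_coeff for f in frames)
    motion = statistics.mean(f.mean_motion_magnitude for f in frames)
    # Illustrative weights: heavy quantization (small DCT energy) and frequent
    # intra refreshes after losses both pull the predicted score down.
    score = 1.0 + 4.0 * min(1.0, dct / 50.0) - 1.5 * intra - 0.02 * motion
    return max(1.0, min(5.0, score))

if __name__ == "__main__":
    sample = [ParsedFrame(0.2, 35.0, 8.0), ParsedFrame(0.25, 30.0, 10.0)]
    print(f"Predicted MOS: {predict_mos(sample):.2f}")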
Abstract:
A system and method for spatiotemporal depth extraction of images with forward and backward depth prediction are provided. The system and method of the present disclosure provide for acquiring a plurality of frames, generating a first depth map of a current frame in the plurality of frames based on a depth map of a previous frame in the plurality of frames, generating a second depth map of the current frame in the plurality of frames based on a depth map of a subsequent frame in the plurality of frames, and processing the first depth map and the second depth map to produce a third depth map for the current frame.
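The fusion of the forward-predicted and backward-predicted depth maps into a final depth map could, for example, be a per-pixel confidence-weighted blend, as in the Python sketch below; the confidence maps and the blending rule are assumptions made for illustration and are not prescribed by the abstract.

# A minimal sketch of fusing forward- and backward-predicted depth maps into a
# final depth map for the current frame. The confidence-weighted average below
# is an assumed fusion rule chosen for illustration.
import numpy as np

def fuse_depth_maps(depth_fwd: np.ndarray, depth_bwd: np.ndarray,
                    conf_fwd: np.ndarray, conf_bwd: np.ndarray) -> np.ndarray:
    """Blend per-pixel depth predictions by their (assumed) confidence maps."""
    total = conf_fwd + conf_bwd + 1e-8          # avoid division by zero
    return (conf_fwd * depth_fwd + conf_bwd * depth_bwd) / total

if __name__ == "__main__":
    h, w = 4, 4
    d_prev = np.full((h, w), 2.0)   # depth propagated from the previous frame
    d_next = np.full((h, w), 3.0)   # depth propagated from the subsequent frame
    c_prev = np.full((h, w), 0.75)  # higher confidence in the forward prediction
    c_next = np.full((h, w), 0.25)
    print(fuse_depth_maps(d_prev, d_next, c_prev, c_next))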
Abstract:
Methods and systems for a multimedia queue management solution that maintains graceful Quality of Experience (QoE) degradation are provided. The method selects a frame from all weighted queues based on a gradient function indicating a network performance rate change and a distortion rate caused by the frame and its related frames in the queue, drops the selected frame and all its related frames, and continues to drop similarly chosen frames until the network performance rate change caused by the dropped frames and their related frames meets a predetermined performance metric. A frame gradient is a distortion rate divided by a network performance rate change caused by the frame and its related frames, and a distortion rate is based on a sum of each individual frame distortion rate when the frame and its related frames are replaced by other frames derived from the remaining frames based on a replacement method.
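The gradient-guided dropping loop can be sketched as follows: compute, for each candidate frame, the distortion rate of the frame and its related frames divided by the network rate change their removal would produce, repeatedly drop the group with the smallest gradient, and stop once the accumulated rate change meets the target. The frame sizes, distortion values, and dependency lists in this Python sketch are illustrative assumptions.

# Sketch of the gradient-guided dropping loop: repeatedly drop the frame (plus
# its related frames) whose distortion-per-rate-saved gradient is smallest until
# the target rate reduction is met. All numbers below are illustrative.
from dataclasses import dataclass, field

@dataclass
class QueuedFrame:
    name: str
    rate: float                       # bits freed if this frame is dropped
    distortion: float                 # distortion added if dropped (with concealment)
    dependents: list["QueuedFrame"] = field(default_factory=list)

def gradient(frame: QueuedFrame) -> float:
    """Distortion rate divided by the network rate change for the frame group."""
    group = [frame] + frame.dependents
    rate_change = sum(f.rate for f in group)
    dist = sum(f.distortion for f in group)
    return dist / rate_change if rate_change else float("inf")

def drop_until(queue: list[QueuedFrame], rate_target: float) -> list[QueuedFrame]:
    freed = 0.0
    dropped: list[QueuedFrame] = []
    while freed < rate_target and queue:
        victim = min(queue, key=gradient)            # smallest distortion per bit saved
        group = [victim] + victim.dependents
        freed += sum(f.rate for f in group)
        dropped.extend(group)
        queue[:] = [f for f in queue if f not in group]
    return dropped

if __name__ == "__main__":
    b1 = QueuedFrame("B1", rate=800, distortion=5)
    p1 = QueuedFrame("P1", rate=2000, distortion=40, dependents=[b1])
    b2 = QueuedFrame("B2", rate=900, distortion=6)
    print([f.name for f in drop_until([p1, b1, b2], rate_target=1500)])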
Abstract:
Methods and systems are provided for converting a two-dimensional image sequence into a three-dimensional image. In one embodiment, a method for converting a two-dimensional image sequence into a three-dimensional image includes determining camera motion parameters between consecutive images in a monoscopic sequence of reference 2D images, wherein the consecutive images comprise a current reference image and an adjacent image, determining a horizontal disparity map for a target image using the camera motion parameters, determining a disparity probability value for each disparity vector of the disparity map, and determining a target image as a weighted average of pixel values in the current reference image using the disparity probability values, such that the target image and current reference image comprise a stereoscopic image pair.
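The final synthesis step, forming the target image as a probability-weighted average of reference pixels, might look like the Python sketch below; the candidate disparities and their per-pixel probabilities are treated as given inputs here, since estimating them from camera motion parameters is outside the scope of this snippet.

# Illustrative sketch of synthesizing a target view as a probability-weighted
# average of horizontally shifted reference pixels. The candidate disparities
# and their probabilities are assumed inputs.
import numpy as np

def synthesize_target(reference: np.ndarray, disparities: list[int],
                      probs: np.ndarray) -> np.ndarray:
    """reference: (H, W) image; probs: (len(disparities), H, W) weights summing to 1."""
    target = np.zeros_like(reference, dtype=float)
    for k, d in enumerate(disparities):
        shifted = np.roll(reference, shift=d, axis=1)  # horizontal shift by disparity d
        target += probs[k] * shifted
    return target

if __name__ == "__main__":
    ref = np.tile(np.arange(8, dtype=float), (4, 1))      # simple gradient image
    disps = [1, 2]
    p = np.stack([np.full(ref.shape, 0.7), np.full(ref.shape, 0.3)])
    print(synthesize_target(ref, disps, p))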
Abstract:
A method of object-aware video coding is provided that comprises the steps of: receiving a video sequence having a plurality of frames; selecting at least two frames; determining the total area of at least one object of interest in each of the at least two frames; comparing the total area to a threshold area; classifying each of the at least two frames as being a low object weighted frame or a high object weighted frame, low object weighted frames being frames having the total area exceeding the threshold area and high object weighted frames being frames having the total area not exceeding the threshold area; and encoding each low object weighted frame according to one encoding mode and encoding each high object weighted frame according to a different encoding mode.
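The classification step can be illustrated with a short Python sketch: sum the areas of the objects of interest in a frame, compare the total against the threshold, and select one of two encoding modes. The mode names and area units below are assumptions for illustration only.

# A minimal sketch of the frame classification step. The mode names and the
# quantization comments are illustrative assumptions, not the disclosed modes.
from dataclasses import dataclass

@dataclass
class ObjectOfInterest:
    width: int
    height: int

    @property
    def area(self) -> int:
        return self.width * self.height

def classify_frame(objects: list[ObjectOfInterest], threshold_area: int) -> str:
    """Return an assumed encoding mode: per the abstract, frames whose total
    object area exceeds the threshold are 'low object weighted' and get one
    mode; the remaining frames get a different mode."""
    total_area = sum(obj.area for obj in objects)
    if total_area > threshold_area:
        return "mode_low_object_weighted"     # e.g. coarser quantization (assumption)
    return "mode_high_object_weighted"        # e.g. finer quantization (assumption)

if __name__ == "__main__":
    frame_objects = [ObjectOfInterest(64, 64), ObjectOfInterest(32, 32)]
    print(classify_frame(frame_objects, threshold_area=4000))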
Abstract:
A system and method for recovering three-dimensional (3D) particle systems from two-dimensional (2D) images are provided. The system and method of the present invention provide for identifying a fuzzy object in a two-dimensional (2D) image; selecting a particle system from a plurality of predetermined particle systems, the selected particle system relating to a predefined fuzzy object; generating at least one particle of the selected particle system; simulating the at least one particle to update states of the at least one particle; rendering the selected particle system; comparing the rendered particle system to the identified fuzzy object in the 2D image; and storing the selected particle system if the comparison result is within an acceptable threshold, wherein the stored particle system represents the recovered geometry of the fuzzy object.
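One way to picture the compare-and-store step is an analysis-by-synthesis loop: render the candidate particle system, measure its difference from the identified fuzzy region, and keep the particle system only if the error falls within the threshold. The crude splat renderer and the mean-absolute-error metric in this Python sketch are stand-ins, not the disclosed renderer or comparison.

# Sketch of the render-compare-store loop. The splat renderer and error metric
# are simplified stand-ins chosen for illustration.
import numpy as np

def render_particles(positions: np.ndarray, shape: tuple[int, int]) -> np.ndarray:
    """Splat particles onto an image grid as a crude renderer (assumption)."""
    img = np.zeros(shape)
    for x, y in positions:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < shape[0] and 0 <= xi < shape[1]:
            img[yi, xi] += 1.0
    return img / (img.max() or 1.0)

def matches_fuzzy_object(rendered: np.ndarray, fuzzy_mask: np.ndarray,
                         threshold: float = 0.2) -> bool:
    """Accept the particle system if mean absolute error is within the threshold."""
    return float(np.mean(np.abs(rendered - fuzzy_mask))) <= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    particles = rng.uniform(0, 16, size=(200, 2))      # simulated particle states
    mask = np.ones((16, 16)) * 0.5                     # identified fuzzy region
    # False here would mean: adjust the particle parameters and iterate again.
    print(matches_fuzzy_object(render_particles(particles, (16, 16)), mask))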
Abstract:
A method, apparatus and system for synchronizing between two recording modes includes identifying a common event in the two recording modes. The event in time is recognized for a higher accuracy mode of the two modes. The event is predicted in a lower accuracy mode of the two modes by determining a time when the event occurred between frames in the lower accuracy mode. The event in the higher accuracy mode is synchronized to the lower accuracy mode to provide sub-frame accuracy alignment between the two modes. In one embodiment of the invention, the common event includes the closing of a clap slate, and the two modes include audio and video recording modes.
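A simplified Python sketch of the sub-frame alignment: locate the common event (the clap) precisely in the higher-accuracy audio mode, then express that time as a fractional frame index in the lower-accuracy video mode, which places the event between two video frames. The peak-based clap detector below is an assumed stand-in for a real detector.

# Sketch of sub-frame alignment between audio (higher accuracy) and video
# (lower accuracy). The peak-picking clap detector is a simplifying assumption.
import numpy as np

def detect_clap_time(audio: np.ndarray, sample_rate: int) -> float:
    """Return the time (seconds) of the loudest sample, a stand-in clap detector."""
    return int(np.argmax(np.abs(audio))) / sample_rate

def event_frame_position(event_time_s: float, video_fps: float) -> float:
    """Fractional frame index at which the event occurred in the video mode."""
    return event_time_s * video_fps

if __name__ == "__main__":
    sr = 48000
    audio = np.zeros(sr)                 # one second of silence ...
    audio[30123] = 1.0                   # ... with a sharp 'clap' transient
    t = detect_clap_time(audio, sr)
    pos = event_frame_position(t, video_fps=24.0)
    print(f"clap at {t:.5f}s -> between video frames {int(pos)} and {int(pos) + 1}"
          f" (offset {pos - int(pos):.3f} frame)")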
Abstract:
Method and apparatus for recovering a pruned version of a picture in a video sequence are disclosed. The apparatus includes a divider for dividing the pruned version of the picture into a plurality of non-overlapping blocks. The apparatus also includes a metadata decoder for decoding metadata for use in recovering the pruned version of the picture. The apparatus further includes a patch library creator for creating a patch library from a reconstructed version of the picture. The patch library includes a plurality of high resolution replacement patches for replacing one or more pruned blocks during a recovery of the pruned version of the picture. The apparatus additionally includes a search and replacement device for performing a searching process using the metadata to find a corresponding patch for a respective one of the one or more pruned blocks from among the plurality of non-overlapping blocks and replace the respective one of the one or more pruned blocks with the corresponding patch. A signature is respectively created for each of the one or more pruned blocks, and the pruned version of the picture is recovered by comparing respective distance metrics from signatures for each of the plurality of high resolution replacement patches to signatures for each of the one or more pruned blocks, sorting the respective distance metrics to obtain a rank list for each of the one or more pruned blocks, wherein a rank number in the rank list for a particular one of the one or more pruned blocks is used to retrieve a corresponding one of the plurality of high resolution replacement patches in the patch library to be used to replace the particular one of the one or more pruned blocks. A patch dependency graph having a plurality of nodes and a plurality of edges is used to recover the pruned version of the picture. Each of the plurality of nodes represents a respective one of the plurality of overlapping blocks, and each of the plurality of edges represents a respective dependency of at least the respective one of the plurality of overlapping blocks.
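The signature-and-rank retrieval can be sketched in a few lines of Python: compute a signature for the pruned block and for each library patch, sort the patches by signature distance, and use the rank number decoded from the metadata to select the replacement. The block mean/standard-deviation signature and the flat data layout below are simplifying assumptions.

# Sketch of signature-based patch retrieval using a rank number from metadata.
# The signature and data layout are simplifying assumptions.
import numpy as np

def signature(block: np.ndarray) -> np.ndarray:
    """Toy signature: mean and standard deviation of the block's pixels."""
    return np.array([block.mean(), block.std()])

def recover_block(pruned_block: np.ndarray, patch_library: list[np.ndarray],
                  rank_number: int) -> np.ndarray:
    """Replace a pruned block with the rank_number-th closest library patch."""
    sig = signature(pruned_block)
    distances = [float(np.linalg.norm(signature(p) - sig)) for p in patch_library]
    rank_list = np.argsort(distances)            # ascending signature distance
    return patch_library[int(rank_list[rank_number])]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    library = [rng.integers(0, 256, (8, 8)).astype(float) for _ in range(16)]
    degraded = np.full((8, 8), 128.0)            # a flat (pruned) block
    restored = recover_block(degraded, library, rank_number=0)  # rank 0 from metadata
    print(restored.shape, restored.mean())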
Abstract:
A system and method for an efficient semantic similarity search of images with a classification structure are provided. The system and method provide for building a semantic classification-search tree for the plurality of images, the classification tree including at least two categories of images, each category of images representing a subset of the plurality of images, receiving a query image, classifying the query image to select one of the at least two categories of images, and restricting the search for the image of interest using the query image to the selected one of the at least two categories of images.
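A minimal Python sketch of the restricted search: classify the query image into one of the categories of the classification structure, then run the similarity search only over that category's images. The nearest-centroid classifier and the two-level in-memory 'tree' are assumptions chosen to keep the example short.

# Sketch of classification-restricted similarity search. The feature vectors,
# nearest-centroid classifier, and two-level structure are assumptions.
import numpy as np

def classify(query_feat: np.ndarray, category_centroids: dict[str, np.ndarray]) -> str:
    """Pick the category whose centroid is closest to the query feature."""
    return min(category_centroids,
               key=lambda c: float(np.linalg.norm(category_centroids[c] - query_feat)))

def search_within(query_feat: np.ndarray, category_images: list[np.ndarray]) -> int:
    """Return the index of the most similar image inside the chosen category."""
    dists = [float(np.linalg.norm(f - query_feat)) for f in category_images]
    return int(np.argmin(dists))

if __name__ == "__main__":
    centroids = {"outdoor": np.array([1.0, 0.0]), "indoor": np.array([0.0, 1.0])}
    images = {"outdoor": [np.array([0.9, 0.1]), np.array([1.2, -0.1])],
              "indoor":  [np.array([0.1, 0.9])]}
    q = np.array([1.05, 0.05])                   # query image feature (assumed)
    cat = classify(q, centroids)                 # restrict search to this category
    print(cat, search_within(q, images[cat]))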