Abstract:
A method and an apparatus for encoding a picture in a video sequence are disclosed. An apparatus includes a library creator for creating a first library from an original version of the picture and a second library from a reconstructed version of the picture. Each library includes high resolution replacement patches for replacing pruned blocks during a recovery of a pruned version of the picture. A pruner generates the pruned version from the first library. A metadata generator generates metadata from the second library for recovering the pruned version. An encoder encodes the pruned version and the metadata. The first library includes patch clusters. The pruned version is generated by dividing the original version into overlapping blocks and searching for candidate patch clusters for each block. A patch dependency graph having nodes and edges is used for the searching. Each node represents a respective block, and each edge represents a respective dependency of the respective block.
Abstract:
A caption detection system is provided in which all caption boxes detected over time for one caption area are identical, thereby reducing temporal instability and inconsistency. This is achieved by grouping candidate pixels in the 3D spatiotemporal space and generating a 3D bounding box for each caption area. 2D bounding boxes are obtained by slicing the 3D bounding box; because all 2D bounding boxes corresponding to a caption area are sliced from one 3D bounding box, they are identical over time.
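The slicing idea can be illustrated with a minimal Python sketch. It assumes the candidate pixels of one caption area have already been grouped into a list of (x, y, t) triples; the grouping step itself is not shown, and the function names `bounding_box_3d` and `slice_box` are hypothetical, not taken from the abstract.

```python
from typing import Iterable, Optional, Tuple

Voxel = Tuple[int, int, int]                 # (x, y, t) of one candidate pixel
Box3D = Tuple[int, int, int, int, int, int]  # x_min, y_min, t_min, x_max, y_max, t_max
Box2D = Tuple[int, int, int, int]            # x_min, y_min, x_max, y_max


def bounding_box_3d(voxels: Iterable[Voxel]) -> Box3D:
    """Axis-aligned 3D box around one already-grouped caption area."""
    xs, ys, ts = zip(*voxels)
    return (min(xs), min(ys), min(ts), max(xs), max(ys), max(ts))


def slice_box(box: Box3D, t: int) -> Optional[Box2D]:
    """2D box for frame t, or None if the caption is not present in that frame."""
    x0, y0, t0, x1, y1, t1 = box
    if t0 <= t <= t1:
        return (x0, y0, x1, y1)
    return None


# Hypothetical candidate pixels for one caption spanning frames 0..2.
voxels = [(10, 5, 0), (12, 6, 1), (30, 9, 2)]
box = bounding_box_3d(voxels)
```

Every frame inside the temporal extent of the box receives the same 2D rectangle, which is the property the abstract relies on.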
Abstract:
A system and method for spatiotemporal depth extraction of images with forward and backward depth prediction are provided. The system and method of the present disclosure provide for acquiring a plurality of frames, generating a first depth map of a current frame in the plurality of frames based on a depth map of a previous frame in the plurality of frames, generating a second depth map of the current frame in the plurality of frames based on a depth map of a subsequent frame in the plurality of frames, and processing the first depth map and the second depth map to produce a third depth map for the current frame.
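The abstract does not say how the first (forward-predicted) and second (backward-predicted) depth maps are processed into the third. As one possible illustration only, the Python sketch below assumes a per-pixel weighted average; the function name `fuse_depth_maps` and the `weight` parameter are hypothetical.

```python
from typing import List

DepthMap = List[List[float]]  # row-major grid of depth values


def fuse_depth_maps(forward: DepthMap, backward: DepthMap,
                    weight: float = 0.5) -> DepthMap:
    """Blend two depth estimates of the same frame.

    ASSUMPTION: a simple weighted average stands in for the unspecified
    "processing" step; weight is the share given to the forward estimate.
    """
    if len(forward) != len(backward):
        raise ValueError("depth maps must have the same number of rows")
    fused = []
    for f_row, b_row in zip(forward, backward):
        if len(f_row) != len(b_row):
            raise ValueError("depth maps must have the same row length")
        fused.append([weight * f + (1.0 - weight) * b
                      for f, b in zip(f_row, b_row)])
    return fused


third = fuse_depth_maps([[1.0, 2.0]], [[3.0, 4.0]])
```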
Abstract:
Methods and systems for a multimedia queue management solution that maintains graceful Quality of Experience (QoE) degradation are provided. The method selects a frame from all weighted queues based on a gradient function indicating a network performance rate change and a distortion rate caused by the frame and its related frames in the queue, drops the selected frame and all its related frames, and continues to drop similarly chosen frames until the network performance rate change caused by the dropped frames and their related frames meets a predetermined performance metric. A frame gradient is a distortion rate divided by the network performance rate change caused by the frame and its related frames, and a distortion rate is based on a sum of the individual frame distortion rates when the frame and its related frames are replaced by other frames derived from the remaining frames based on a replacement method.
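The selection loop can be sketched in Python as follows. This is a simplified illustration under stated assumptions: each candidate is pre-bundled with its related frames, the frame with the smallest gradient is dropped first (the abstract does not state the selection direction), and gradients are not recomputed after each drop. The names `FrameGroup` and `choose_drops` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameGroup:
    name: str
    distortion: float  # summed distortion if this frame and its related frames are dropped
    rate_saved: float  # network rate change gained by dropping them (must be > 0)


def gradient(group: FrameGroup) -> float:
    """Frame gradient: distortion rate divided by the rate change."""
    return group.distortion / group.rate_saved


def choose_drops(groups: List[FrameGroup],
                 target_rate_saved: float) -> Tuple[List[str], float]:
    """Drop lowest-gradient groups until the rate target is met."""
    dropped: List[str] = []
    saved = 0.0
    for group in sorted(groups, key=gradient):  # ASSUMPTION: smallest gradient first
        if saved >= target_rate_saved:
            break
        dropped.append(group.name)
        saved += group.rate_saved
    return dropped, saved


queue = [
    FrameGroup("B1", distortion=1.0, rate_saved=4.0),  # gradient 0.25
    FrameGroup("P2", distortion=6.0, rate_saved=6.0),  # gradient 1.0
    FrameGroup("B3", distortion=2.0, rate_saved=4.0),  # gradient 0.5
]
dropped, saved = choose_drops(queue, target_rate_saved=7.0)
```

A fuller version would recompute distortion after each drop, since the replacement frames available to the remaining candidates change.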
Abstract:
Methods and systems are provided for converting a two-dimensional image sequence into a three-dimensional image. In one embodiment, a method for converting a two-dimensional image sequence into a three-dimensional image includes determining camera motion parameters between consecutive images in a monoscopic sequence of reference 2D images, wherein the consecutive images comprise a current reference image and an adjacent image; determining a horizontal disparity map for a target image using the camera motion parameters; determining a disparity probability value for each disparity vector of the disparity map; and determining the target image as a weighted average of pixel values in the current reference image using the disparity probability values, such that the target image and the current reference image comprise a stereoscopic image pair.
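The final step, forming the target image as a probability-weighted average of reference pixels, might look like the following Python sketch for a single scanline. The data layout (one dictionary of disparity-to-probability per pixel), the sign convention for disparities, the border fallback, and the name `synthesize_row` are all assumptions made for illustration.

```python
from typing import Dict, List


def synthesize_row(ref_row: List[float],
                   disparity_probs: List[Dict[int, float]]) -> List[float]:
    """Build one target scanline from a reference scanline.

    disparity_probs[x] maps each candidate horizontal disparity d to its
    probability; target pixel x averages ref_row[x + d] with those weights.
    ASSUMPTION: candidates falling outside the row are ignored, and the
    reference pixel is copied when no candidate remains.
    """
    width = len(ref_row)
    out = []
    for x in range(width):
        total, weight = 0.0, 0.0
        for d, p in disparity_probs[x].items():
            src = x + d
            if 0 <= src < width:
                total += p * ref_row[src]
                weight += p
        out.append(total / weight if weight > 0 else ref_row[x])
    return out


ref = [10.0, 20.0, 30.0]
probs = [{0: 1.0}, {0: 0.5, 1: 0.5}, {1: 1.0}]
target = synthesize_row(ref, probs)
```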
Abstract:
A system and method for recovering three-dimensional (3D) particle systems from two-dimensional (2D) images are provided. The system and method of the present invention provide for identifying a fuzzy object in a two-dimensional (2D) image; selecting a particle system from a plurality of predetermined particle systems, the selected particle system relating to a predefined fuzzy object; generating at least one particle of the selected particle system; simulating the at least one particle to update states of the at least one particle; rendering the selected particle system; comparing the rendered particle system to the identified fuzzy object in the 2D image; and storing the selected particle system if the comparison result is within an acceptable threshold, wherein the stored particle system represents the recovered geometry of the fuzzy object.
Abstract:
A method, apparatus and system for synchronizing between two recording modes include identifying a common event in the two recording modes. The time of the event is recognized in the higher accuracy mode of the two modes. The event is predicted in the lower accuracy mode of the two modes by determining a time when the event occurred between frames in the lower accuracy mode. The event in the higher accuracy mode is synchronized to the lower accuracy mode to provide sub-frame accuracy alignment between the two modes. In one embodiment of the invention, the common event includes the closing of a clap slate, and the two modes include audio and video recording modes.
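For the clap-slate embodiment, one way to estimate when the slate closed between video frames is to extrapolate the clap-stick angle from the last two frames in which it is still open. The Python sketch below assumes a constant closing speed and that the angles have already been measured; the abstract does not specify the prediction method, and the names `predict_close_time` and `av_offset` are hypothetical.

```python
def predict_close_time(angle_prev: float, angle_last: float,
                       t_last: float, frame_period: float) -> float:
    """Estimate the sub-frame time at which the clap stick reaches angle 0.

    angle_prev and angle_last are the stick angles (degrees) in the last
    two frames where the slate is still open; t_last is the timestamp of
    the later frame. ASSUMPTION: the stick closes at constant speed.
    """
    closing_speed = (angle_prev - angle_last) / frame_period  # degrees per second
    if closing_speed <= 0:
        raise ValueError("the slate is not closing between these frames")
    return t_last + angle_last / closing_speed


def av_offset(audio_clap_time: float, video_close_time: float) -> float:
    """Shift to add to video timestamps so that they line up with the audio."""
    return audio_clap_time - video_close_time


# 25 fps video: the stick is at 30 degrees, then 10 degrees at t = 1.0 s.
close_time = predict_close_time(30.0, 10.0, t_last=1.0, frame_period=0.04)
offset = av_offset(audio_clap_time=1.5, video_close_time=close_time)
```

In this example the slate is predicted to close half a frame after the last open frame, which is the kind of sub-frame alignment the abstract describes.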
Abstract:
A system and method for an efficient semantic similarity search of images with a classification structure are provided. The system and method provide for building a semantic classification-search tree for the plurality of images, the classification tree including at least two categories of images, each category of images representing a subset of the plurality of images, receiving a query image, classifying the query image to select one of the at least two categories of images, and restricting the search for the image of interest using the query image to the selected one of the at least two categories of images.
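The classify-then-search idea can be sketched in Python with a one-level tree. The sketch assumes each image is represented by a feature vector, uses a nearest-centroid rule as the classifier, and uses Euclidean distance for similarity; none of these choices come from the abstract, and the name `restricted_search` is hypothetical.

```python
import math
from typing import Dict, Tuple

Vector = Tuple[float, ...]


def restricted_search(query: Vector,
                      categories: Dict[str, dict]) -> Tuple[str, str]:
    """Classify the query, then search only inside the chosen category.

    categories maps a category name to
    {"centroid": Vector, "images": {image_id: Vector}}.
    ASSUMPTION: nearest-centroid classification with Euclidean distance.
    """
    # Step 1: pick the category whose centroid is closest to the query.
    name = min(categories,
               key=lambda c: math.dist(query, categories[c]["centroid"]))
    # Step 2: compare the query only against images in that category.
    images = categories[name]["images"]
    image_id = min(images, key=lambda i: math.dist(query, images[i]))
    return name, image_id


categories = {
    "cats": {"centroid": (0.0, 0.0),
             "images": {"c1": (0.1, 0.0), "c2": (0.0, 0.4)}},
    "cars": {"centroid": (5.0, 5.0),
             "images": {"k1": (5.0, 5.1)}},
}
result = restricted_search((0.0, 0.3), categories)
```

Images in the categories that were not selected are never compared against the query, which is where the efficiency gain comes from.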
Abstract:
A system and method for color correction of 3D images including at least two separate image streams captured for a same scene include determining three-dimensional properties of at least a portion of a selected image stream, the three-dimensional properties including light properties, surface color, surface reflectance properties, scene geometry and the like. A look of the portion of the selected image stream is then modified by altering the value of at least one of the determined three-dimensional properties and, in one embodiment, applying image formation theory. The modifications are then rendered in an output 3D picture automatically and/or according to user inputs. In various embodiments, corrections made to the selected one of the at least two image streams can be automatically applied to the other of the image streams.
Abstract:
Various implementations relate to improving depth maps. This may be done, for example, by identifying bad depth values and modifying those values. The values may represent, for example, holes and/or noise. According to a general aspect, a segmentation is determined based on an intensity image. The intensity image is associated with a corresponding depth image that includes depth values for corresponding locations in the intensity image. The segmentation is applied to the depth image to segment the depth image into multiple regions. A depth value is modified in the depth image based on the segmentation. A two-stage iterative procedure may be used to improve the segmentation and then modify bad depth values in the improved segmentation, iterating until a desired level of smoothness is achieved. Both stages may be based, for example, on average depth values in a segment.
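The depth-modification stage, replacing bad values with the average depth of their segment, can be sketched in Python as follows. The sketch assumes the segmentation labels are already available from the intensity image, that bad depth values are marked with a sentinel (0 here), and that images are flattened to one-dimensional lists; the name `fill_bad_depths` is hypothetical, and the segmentation-refinement stage is not shown.

```python
from typing import Dict, List


def fill_bad_depths(depth: List[float], labels: List[int],
                    bad_value: float = 0) -> List[float]:
    """Replace bad depth values with the mean valid depth of their segment.

    depth and labels are flattened images of equal length; labels[i] is the
    segment id of pixel i. ASSUMPTION: bad pixels carry the sentinel bad_value.
    Segments containing no valid depth are left unchanged.
    """
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for d, seg in zip(depth, labels):
        if d != bad_value:
            sums[seg] = sums.get(seg, 0.0) + d
            counts[seg] = counts.get(seg, 0) + 1
    return [sums[seg] / counts[seg] if d == bad_value and seg in counts else d
            for d, seg in zip(depth, labels)]


depth = [4.0, 0, 6.0, 9.0, 0]
labels = [0, 0, 0, 1, 1]
filled = fill_bad_depths(depth, labels)
```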