Abstract:
A method and a system for self-calibrating a wide field-of-view camera (such as a catadioptric camera) using a sequence of omni-directional images of a scene obtained from the camera. The present invention uses the consistency of pairwise features tracked across at least a portion of the image collection and uses these tracked features to determine unknown calibration parameters based on the characteristics of catadioptric imaging. More specifically, the self-calibration method of the present invention generates a sequence of omni-directional images representing a scene and tracks features across the image sequence. An objective function is defined in terms of the tracked features and an error metric (an image-based error metric in a preferred embodiment). The catadioptric imaging characteristics are defined by calibration parameters, and determination of optimal calibration parameters is accomplished by minimizing the objective function using an optimizing technique. Moreover, the present invention also includes a technique for reformulating a projection equation such that the projection equation is equivalent to that of a rectilinear perspective camera. This technique allows analyses (such as structure from motion) to be applied (subsequent to calibration of the catadioptric camera) in the same direct manner as for rectilinear image sequences.
Abstract:
A 3-D effect is added to a single image by adding depth to the single image. Depth can be added to the single image by selecting an arbitrary region or a number of pixels. A user interface simultaneously displays the single image and novel views of the single original image taken from virtual camera positions rotated relative to the original field of view. Depths given to the original image allow pixels to be reprojected onto the novel views to allow the user to observe the depth changes as they are being added. Functions are provided to edit gaps or voids generated in the process of adding depth to the single image. The gaps occur because of depth discontinuities between regions to which depth has been added and the voids are due to the uncovering of previously occluded surfaces in the original image.
Abstract:
In a coder for producing a bitstream representative of a sequence of video images, a previous image is registered with a current image using spline-based registration to produce estimated motion vectors. The estimated motion vectors are used to match blocks of the previous image and the current image to produce translation vectors. The translation vectors compensate for motion while encoding the sequence as a bitstream.
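The block-matching step can be illustrated with a minimal sketch. It assumes grayscale images stored as lists of rows, a sum-of-absolute-differences (SAD) cost, and a small search window centred on the estimated motion vector; none of these choices is stated in the abstract, and the function names `sad` and `refine_block_motion` are hypothetical. The spline-based registration that would supply the seed vector is not shown.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between the size x size block of `prev`
    at (px, py) and the block of `cur` at (cx, cy)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(prev[py + dy][px + dx] - cur[cy + dy][cx + dx])
    return total


def refine_block_motion(prev, cur, bx, by, size, seed, radius=1):
    """Search a small window around the seed vector for the best translation.

    `seed` stands in for the estimated motion vector (assumed to come from
    the registration step). Returns ((dx, dy), cost) such that the block of
    `cur` at (bx, by) best matches the block of `prev` at (bx + dx, by + dy).
    Returns (None, None) if every candidate falls outside `prev`.
    """
    height, width = len(prev), len(prev[0])
    best, best_cost = None, None
    for dy in range(seed[1] - radius, seed[1] + radius + 1):
        for dx in range(seed[0] - radius, seed[0] + radius + 1):
            px, py = bx + dx, by + dy
            if px < 0 or py < 0 or px + size > width or py + size > height:
                continue  # candidate block falls outside the previous image
            cost = sad(prev, cur, px, py, bx, by, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

Seeding the search with an estimated vector keeps the window small: only (2 * radius + 1) squared candidates are evaluated per block instead of a full exhaustive search.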
Abstract:
A computerized method and related computer system synthesize video from a plurality of sources of image data. The sources include a variety of image data types such as a collection of image stills, a sequence of video frames, and 3-D models of objects. Each source provides image data associated with an object. One source provides image data associated with a first object, and a second source provides image data associated with a second object. The image data of the first and second objects are combined to generate composite images of the first and second objects. From the composite images, an output image of the first and second objects as viewed from an arbitrary viewpoint is generated. Gaps of pixels with unspecified pixel values may appear in the output image. Accordingly, a pixel value for each of these “missing pixels” is obtained by using an epipolar search process to determine which one of the sources of image data should provide the pixel value for that missing pixel.
Abstract:
In a computerized method, the three-dimensional structure of an object is recovered from a closed-loop sequence of two-dimensional images taken by a camera undergoing some arbitrary motion. In one type of motion, the camera is held fixed, while the object completes a full 360° rotation about an arbitrary axis. Alternatively, the camera can make a complete rotation about the object. In the sequence of images, feature tracking points are selected using pair-wise image registration. Ellipses are fitted to the feature tracking points to estimate the tilt of the axis of rotation. A set of variables is held at fixed values while an image-based objective function is minimized to extract a first set of structure and motion parameters. The set of variables is then freed while minimization of the objective function continues, to extract a second set of structure and motion parameters that are substantially the same as the first set of structure and motion parameters.
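The link between a fitted ellipse and the tilt of the rotation axis can be sketched as follows. This is an illustration under several assumptions that the abstract does not state: orthographic projection, a feature tracked at uniformly spaced rotation angles over the full closed loop, and tilt defined as the angle by which the axis leans out of the image plane toward the camera, so that the minor-to-major axis ratio equals the sine of the tilt. Instead of a least-squares ellipse fit, the sketch substitutes a simpler second-moment estimate of the axis ratio; the function names are hypothetical.

```python
import math


def axis_ratio_from_track(points):
    """Estimate minor/major axis ratio of the ellipse traced by one feature.

    Uses the 2x2 covariance of the track: for samples spaced uniformly in
    rotation angle, its eigenvalues are a^2/2 and b^2/2 for semi-axes a, b.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2
    diff = math.hypot((sxx - syy) / 2, sxy)
    lam_max, lam_min = mean + diff, max(mean - diff, 0.0)
    if lam_max == 0:
        raise ValueError("track has no spatial extent")
    return math.sqrt(lam_min / lam_max)


def tilt_degrees(points):
    """Tilt of the rotation axis out of the image plane, in degrees
    (assumed definition: axis ratio = sin(tilt) under orthographic projection)."""
    return math.degrees(math.asin(min(1.0, axis_ratio_from_track(points))))
```

With real tracks, averaging the estimate over many features, or replacing the moment estimate with a proper conic fit, would likely be more robust to noise and uneven sampling.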
Abstract:
A process for compressing and decompressing non-keyframes in sequential sets of contemporaneous video frames making up multiple video streams where the video frames in a set depict substantially the same scene from different viewpoints. Each set of contemporaneous video frames has a plurality of frames designated as keyframes with the remaining being non-keyframes. In one embodiment, the non-keyframes are compressed using a multi-directional spatial prediction technique. In another embodiment, the non-keyframes of each set of contemporaneous video frames are compressed using a combined chaining and spatial prediction compression technique. The spatial prediction compression technique employed can be a single direction technique where just one reference frame, and so one chain, is used to predict each non-keyframe, or it can be a multi-directional technique where two or more reference frames, and so chains, are used to predict each non-keyframe.
Abstract:
The stereo movie editing technique described herein combines knowledge of both multi-view stereo algorithms and human depth perception. The technique creates a digital editor, specifically for stereographic cinema. The technique employs an interface that allows intuitive manipulation of the different parameters in a stereo movie setup, such as camera locations and screen position. Using the technique it is possible to reduce or enhance well-known stereo movie effects such as cardboarding and miniaturization. The technique also provides new editing techniques such as directing the user's attention and easier transitions between scenes.
Abstract:
A low light noise reduction mechanism may perform denoising prior to demosaicing, and may also use parameters determined during the denoising operation for performing demosaicing. The denoising operation may attempt to find several patches of an image that are similar to a first patch, and use a weighted average based on similarity to determine an appropriate value for denoising a raw digital image. The same weighted average and similar patches may be used for demosaicing the same image after the denoising operation.
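The similarity-weighted average can be sketched in the style of non-local means. The sketch below is a simplification under stated assumptions: it works on a single-channel image stored as a list of rows rather than on a raw colour-filter mosaic, it weights patches with a Gaussian of the mean squared patch difference, and the names `patch_weights`, `weighted_value`, and the parameters `r`, `search`, and `h` are hypothetical. The pixel (x, y) must lie at least `r` pixels from the image border.

```python
import math


def patch(img, x, y, r):
    """Flatten the (2r+1) x (2r+1) patch centred on (x, y)."""
    return [img[y + j][x + i] for j in range(-r, r + 1) for i in range(-r, r + 1)]


def patch_weights(img, x, y, r, search, h):
    """Return {(cx, cy): weight} for patches within `search` pixels of (x, y).

    The weight decays with the mean squared difference between each candidate
    patch and the reference patch; `h` controls how fast it decays.
    """
    height, width = len(img), len(img[0])
    ref = patch(img, x, y, r)
    weights = {}
    for cy in range(max(r, y - search), min(height - r, y + search + 1)):
        for cx in range(max(r, x - search), min(width - r, x + search + 1)):
            cand = patch(img, cx, cy, r)
            dist = sum((a - b) ** 2 for a, b in zip(ref, cand)) / len(ref)
            weights[(cx, cy)] = math.exp(-dist / (h * h))
    return weights


def weighted_value(img, weights):
    """Similarity-weighted average of the centre pixels of the patches."""
    total = sum(weights.values())
    return sum(w * img[cy][cx] for (cx, cy), w in weights.items()) / total
```

Returning the weights separately from the averaged value reflects the reuse described above: the same dictionary could be passed to a later interpolation step instead of being recomputed.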
Abstract:
Systems and methods for video completion by motion field transfer are described. In one aspect, a spatio-temporal target patch of an input video data sequence is filled in or replaced by motion field transfer from a spatio-temporal source patch of the input video data sequence. Color is propagated to corresponding portions of the spatio-temporal target patch by treating the transferred motion information as directed edges. These motion field transfer and color propagation operations result in a video completed spatio-temporal target patch. The systems and methods present the video data sequence, which now includes the video completed spatio-temporal target patch, to a user for viewing.
Abstract:
An image enhancement system may match images to a matrix having various enhancements of images for groups of users. The matrix may define image enhancement settings for the particular images and groups of users, and the matching may apply enhancements to a new image that closely matches a user's preferences. After the matrix is initially populated, new users and new images may be added to increase the matrix's accuracy. The image enhancement system may be deployed as a cloud service, where images may be enhanced as a standalone application or as part of a social network or image sharing website. In some embodiments, the image enhancement system may be deployed on a personal computer or as a component of an image capture device.