Abstract:
A system and process for rendering and displaying an interactive viewpoint video is presented in which a user can watch a dynamic scene while manipulating (freezing, slowing down, or reversing) time and changing the viewpoint at will. The ability to interactively control viewpoint while watching a video is an exciting new application for image-based rendering. Because any intermediate view can be synthesized at any time, with the potential for space-time manipulation, this type of video has been dubbed interactive viewpoint video.
Abstract:
A system and process for generating a two-layer, 3D representation of a digital or digitized image from the image and a pixel disparity map of the image is presented. The two-layer representation includes a main layer having pixels exhibiting background colors and background disparities associated with correspondingly located pixels of depth discontinuity areas in the image, as well as pixels exhibiting colors and disparities associated with correspondingly located pixels of the image not found in these depth discontinuity areas. The other layer is a boundary layer made up of pixels exhibiting foreground colors, foreground disparities and alpha values associated with the correspondingly located pixels of the depth discontinuity areas. The depth discontinuity areas are areas of a prescribed size surrounding the depth discontinuities found in the image using its disparity map.
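As a minimal sketch of the first step only, the depth discontinuity areas can be located by thresholding disparity jumps between neighbouring pixels and growing the result by a fixed radius. The function name `discontinuity_mask`, the `threshold` and `radius` parameters, and the square growth window are illustrative assumptions, not details taken from the abstract. The matting step that would assign foreground colors and alpha values to the boundary layer is omitted.

```python
def discontinuity_mask(disparity, threshold=4, radius=1):
    """Return a boolean grid marking pixels near a disparity jump.

    Assumption: a "depth discontinuity" is any pair of horizontally or
    vertically adjacent pixels whose disparities differ by more than
    `threshold`; the surrounding area is a square of half-width `radius`.
    """
    h, w = len(disparity), len(disparity[0])

    # Step 1: collect the pixels on either side of each disparity jump.
    edge_pixels = set()
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    if abs(disparity[y][x] - disparity[ny][nx]) > threshold:
                        edge_pixels.add((y, x))
                        edge_pixels.add((ny, nx))

    # Step 2: grow each edge pixel into a (2 * radius + 1)-wide square.
    mask = [[False] * w for _ in range(h)]
    for ey, ex in edge_pixels:
        for qy in range(max(0, ey - radius), min(h, ey + radius + 1)):
            for qx in range(max(0, ex - radius), min(w, ex + radius + 1)):
                mask[qy][qx] = True
    return mask


# Masked pixels contribute to both layers (background to the main layer,
# foreground plus alpha to the boundary layer); unmasked pixels belong to
# the main layer only.
row = [[1, 1, 1, 9, 9, 9]]
print(discontinuity_mask(row, threshold=4, radius=1))
```

Keeping the mask separate from the layer construction means the prescribed area size can be tuned without touching the rest of the pipeline.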
Abstract:
A system and process for reconstructing optimal texture maps from multiple views of a scene is described. In essence, this reconstruction is based on the optimal synthesis of textures from multiple sources. This is generally accomplished using basic image processing theory to derive the correct weights for blending the multiple views. Namely, the steps of reconstructing, warping, prefiltering, and resampling are followed in order to warp reference textures to a desired location, and to compute spatially-variant weights for optimal blending. These weights take into consideration the anisotropy in the texture projection and changes in sampling frequency due to foreshortening. The weights are combined and the computation of the optimal texture is treated as a restoration problem, which involves solving a linear system of equations. This approach can be incorporated in a variety of applications, such as texturing of 3D models, analysis by synthesis methods, super-resolution techniques, and view-dependent texture mapping.
Abstract:
A system and process for generating High Dynamic Range (HDR) video is presented which involves first capturing a video image sequence while varying the exposure so as to alternate between frames having a shorter and longer exposure. The exposure for each frame is set prior to it being captured as a function of the pixel brightness distribution in preceding frames. Next, for each frame of the video, the corresponding pixels between the frame under consideration and both preceding and subsequent frames are identified. For each corresponding pixel set, at least one pixel is identified as representing a trustworthy pixel. The pixel color information associated with the trustworthy pixels is then employed to compute a radiance value for each pixel set to form a radiance map. A tone mapping procedure can then be performed to convert the radiance map into an 8-bit representation of the HDR frame.
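The radiance computation for a single set of corresponding pixels might look like the sketch below. It assumes a linear sensor response, 8-bit pixel values, and a simple "not too dark, not saturated" test for trustworthiness. The function name `radiance_from_pixel_set`, the `low` and `high` thresholds, the plain averaging, and the fallback rule are all hypothetical choices made for illustration, not the method described in the abstract.

```python
def radiance_from_pixel_set(samples, low=10, high=245):
    """Estimate radiance for one set of corresponding pixels.

    `samples` is a list of (pixel_value, exposure_time) pairs taken from the
    frame under consideration and its neighbouring frames.

    Assumptions: the sensor response is linear, so radiance is proportional
    to pixel_value / exposure_time, and a pixel is trustworthy when its
    value lies inside [low, high].
    """
    trusted = [(v, t) for v, t in samples if low <= v <= high]
    if not trusted:
        # Fallback (assumption): keep the sample closest to mid-range so
        # that every pixel set has at least one trustworthy pixel.
        trusted = [min(samples, key=lambda s: abs(s[0] - 127.5))]
    # Average the per-sample radiance estimates of the trusted pixels.
    return sum(v / t for v, t in trusted) / len(trusted)


# A short exposure (1/64 s) paired with a saturated long exposure (1/16 s):
# only the short-exposure sample is used.
print(radiance_from_pixel_set([(100, 1 / 64), (255, 1 / 16)]))
```

Applying this function to every pixel set of a frame would yield the radiance map that is later tone mapped.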
Abstract:
Methods and apparatus for storing, accessing, and processing information representing images through the use of row and column pointers are described. By manipulating and/or generating new sets of row and column pointers, many image processing operations can be performed without the need to access or copy the original image data. Padding, enlargement and reduction operations are examples of image processing operations that can be performed virtually. A logical image is created as the result of a virtual image processing operation. In order to permit the fast and efficient access of the image data which represents the logical image, the logical image is divided into safe and unsafe logical image regions. In a safe logical image region, data representing the image is regularly spaced in memory and may be accessed using a first relatively fast and efficient memory access technique. In unsafe logical image regions, the data representing the logical image is not regularly spaced in memory and is accessed using a second memory access technique that uses both the row and column pointers associated with the unsafe image region. The methods and apparatus of the present invention allow many image processing operations to be performed using less memory and/or by performing fewer computations than conventional image processing techniques.
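The idea of performing operations by rewriting pointers can be sketched as follows. Here a "pointer" is modelled as an integer offset into a flat pixel list, so a logical pixel lives at `data[row_ptrs[r] + col_ptrs[c]]`. The class name `VirtualImage` and its methods are hypothetical and chosen for illustration; the safe/unsafe region split described above is not modelled.

```python
class VirtualImage:
    """Logical image defined by row and column offsets into shared data."""

    def __init__(self, data, row_ptrs, col_ptrs):
        self.data = data          # flat pixel storage, never copied
        self.row_ptrs = row_ptrs  # offset of each logical row's start
        self.col_ptrs = col_ptrs  # offset of each logical column in a row

    @classmethod
    def from_rows(cls, rows):
        width = len(rows[0])
        data = [value for row in rows for value in row]
        row_ptrs = [r * width for r in range(len(rows))]
        return cls(data, row_ptrs, list(range(width)))

    def pixel(self, r, c):
        return self.data[self.row_ptrs[r] + self.col_ptrs[c]]

    def to_rows(self):
        return [[self.data[rp + cp] for cp in self.col_ptrs]
                for rp in self.row_ptrs]

    def pad_replicate(self, n):
        """Pad by n pixels on every side by repeating the edge pointers."""
        rp = [self.row_ptrs[0]] * n + self.row_ptrs + [self.row_ptrs[-1]] * n
        cp = [self.col_ptrs[0]] * n + self.col_ptrs + [self.col_ptrs[-1]] * n
        return VirtualImage(self.data, rp, cp)

    def enlarge(self, k):
        """Enlarge by integer factor k by repeating each pointer k times."""
        rp = [p for p in self.row_ptrs for _ in range(k)]
        cp = [p for p in self.col_ptrs for _ in range(k)]
        return VirtualImage(self.data, rp, cp)

    def reduce(self, k):
        """Reduce by integer factor k by keeping every k-th pointer."""
        return VirtualImage(self.data, self.row_ptrs[::k], self.col_ptrs[::k])


img = VirtualImage.from_rows([[1, 2], [3, 4]])
big = img.enlarge(2)
print(big.to_rows())         # logical 4x4 view
print(big.data is img.data)  # the pixel storage is shared, not copied
```

Each operation builds only new pointer lists, so its cost scales with the image's width plus height rather than with its pixel count.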
Abstract:
The invention is embodied in a deghosting method and apparatus which locally aligns individual images in a set of overlapping images of a mosaic. This is accomplished by determining, at plural predetermined pixel locations of each one of the images, motions between the one image and other images of the set, combining the motions to produce an estimated motion at each of the plural predetermined pixel locations of the one image, and then warping the one image in accordance with the estimated motions. Preferably, it is first determined which of the images of the set overlap the one image. This determination is made by determining alignment transformations relating the images to a 3-dimensional coordinate system and then inferring mutual overlap between images from the transformations. The images are resampled in accordance with these alignment transformations. The warping of each image is accomplished by constructing a mapping of warped pixel locations from the estimated motions and then resampling the one image at the warped pixel locations. The mapping is preferably a reverse mapping of pixels in an unwarped version of the one image.
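The combine-then-warp steps can be illustrated with the rough sketch below. It assumes that the pairwise motions are combined by simple averaging and that the reverse mapping uses nearest-neighbour sampling with clamping at the image border. The names `combine_motions`, `warp_reverse` and the `motion_at` callable (standing in for interpolation between the predetermined pixel locations) are hypothetical, and the abstract does not specify any of these choices.

```python
def combine_motions(motions):
    """Combine (dx, dy) motions measured against each overlapping image.

    Assumption: the estimated motion is the plain average of the inputs.
    """
    n = len(motions)
    return (sum(m[0] for m in motions) / n, sum(m[1] for m in motions) / n)


def warp_reverse(image, motion_at):
    """Resample `image` using a reverse mapping.

    For every output pixel (x, y), `motion_at(x, y)` returns the estimated
    motion (dx, dy); the output value is read from the unwarped image at
    (x + dx, y + dy), rounded to the nearest pixel and clamped to the border.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            dx, dy = motion_at(x, y)
            sx = min(max(int(round(x + dx)), 0), w - 1)
            sy = min(max(int(round(y + dy)), 0), h - 1)
            out_row.append(image[sy][sx])
        out.append(out_row)
    return out


# Two neighbouring images disagree about the motion at a location;
# the averaged estimate is then applied uniformly to a one-row image.
motion = combine_motions([(2.0, 0.0), (0.0, 0.0)])
print(warp_reverse([[10, 20, 30, 40]], lambda x, y: motion))
```

Reverse mapping visits every output pixel exactly once, so the warped image has no holes, which is a common reason to prefer it over forward mapping.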
Abstract:
A system is described for reducing artifacts produced by a rolling shutter capture technique in the presence of high-frequency motion, e.g., produced by large accelerations or jitter. The system operates by computing low-frequency information based on the motion of points from one frame to the next. The system then uses the low-frequency information to infer the high-frequency motion, e.g., by treating the low-frequency information as known integrals of the unknown underlying high-frequency information. The system then uses the high-frequency information to reduce the presence of artifacts. In effect, the correction aims to re-render video information as though all the pixels in each frame were imaged at the same time using a global shutter technique. An auto-calibration module can estimate the value of a capture parameter, which relates to a time interval between the capture of two subsequent rows of video information.
Abstract:
Multi-spline image blending technique embodiments are presented which, to produce a visually consistent blended image, generally employ a separate low-resolution offset field for every image region being blended, rather than a single (piecewise smooth) offset field for all the regions. Each of the individual offset fields is smoothly varying, and so is represented using a low-dimensional spline. The resulting linear system can be solved rapidly because it involves many fewer variables than the number of pixels being blended.
Abstract:
A process is presented for compressing and decompressing non-keyframes in sequential sets of contemporaneous video frames making up multiple video streams, where the video frames in a set depict substantially the same scene from different viewpoints. Each set of contemporaneous video frames has a plurality of frames designated as keyframes, with the remainder being non-keyframes. In one embodiment, the non-keyframes are compressed using a multi-directional spatial prediction technique. In another embodiment, the non-keyframes of each set of contemporaneous video frames are compressed using a combined chaining and spatial prediction compression technique. The spatial prediction compression technique employed can be a single direction technique where just one reference frame, and so one chain, is used to predict each non-keyframe, or it can be a multi-directional technique where two or more reference frames, and so chains, are used to predict each non-keyframe.