Abstract:
Methods and systems are provided for generating stereoscopic content with granular control over binocular disparity, based on multi-perspective imaging from representations of light fields. The stereoscopic content is computed as piecewise continuous cuts through a representation of a light field, minimizing an energy that reflects prescribed parameters such as depth budget, maximum binocular disparity gradient, and desired stereoscopic baseline. The methods and systems may be used for efficient and flexible stereoscopic post-processing, such as reducing excessive binocular disparity while preserving perceived depth, or retargeting already captured scenes to various viewing settings. Moreover, such methods and systems are highly useful for content creation in the context of multi-view autostereoscopic displays and provide a novel conceptual approach to stereoscopic image processing and post-production.
Abstract:
Techniques are presented for controlling the amount of temporal noise in certain animation sequences. Sketchy animation sequences are received as input in digital form and used to create an altered version of the same animation with temporal coherence enforced down to the stroke level, resulting in a reduction of the perceived noise. The amount of reduction is variable and can be controlled via a single parameter to achieve a desired artistic effect.
Abstract:
A method is provided for an optimized stereoscopic camera with low processing overhead, especially suitable for real-time applications. By constructing a viewer-centric and scene-centric model, the mapping of scene depth to perceived depth may be defined as an optimization problem, for which a solution is analytically derived based on constraints on stereoscopic camera parameters including interaxial separation and convergence distance. The camera parameters may thus be constrained prior to rendering to maintain a desired perceived depth volume around a stereoscopic display, for example to ensure user comfort or provide artistic effects. To compensate for sudden scene depth changes due to unpredictable camera or object movements, as may occur with real-time applications such as video games, the constraints may also be temporally interpolated to maintain a linearly corrected and approximately constant perceived depth range over time.
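The idea of deriving interaxial separation and convergence distance analytically from depth constraints can be illustrated with a minimal sketch. It assumes a simplified parallel-camera (shifted-sensor) model in which screen parallax is `p(z) = k * b * (1/c - 1/z)`, with `b` the interaxial separation, `c` the convergence distance, and `k` a lumped focal-length and magnification factor. The model, the function names, and the parameter `k` are assumptions made for illustration; they are not taken from the abstract and may differ from the method it describes.

```python
def solve_camera(z_near, z_far, p_near, p_far, k=1.0):
    """Return (interaxial, convergence) so that scene depths z_near and z_far
    map to screen parallaxes p_near and p_far under the assumed model
    p(z) = k * b * (1/c - 1/z)."""
    if not 0 < z_near < z_far:
        raise ValueError("expected 0 < z_near < z_far")
    # Subtracting the two constraints eliminates the convergence term.
    interaxial = (p_far - p_near) / (k * (1.0 / z_near - 1.0 / z_far))
    if interaxial <= 0:
        raise ValueError("p_far must exceed p_near")
    # Back-substitute into the near constraint to recover 1/c.
    inv_c = p_near / (k * interaxial) + 1.0 / z_near
    if inv_c <= 0:
        raise ValueError("constraints place convergence at or beyond infinity")
    return interaxial, 1.0 / inv_c


def parallax(z, interaxial, convergence, k=1.0):
    """Screen parallax of a point at depth z under the assumed model."""
    return k * interaxial * (1.0 / convergence - 1.0 / z)


if __name__ == "__main__":
    b, c = solve_camera(z_near=2.0, z_far=10.0, p_near=-0.01, p_far=0.02)
    print(b, c, parallax(2.0, b, c), parallax(10.0, b, c))
```

Because the solution is closed-form, it could be re-evaluated every frame at negligible cost, which is consistent with the real-time emphasis above; temporal interpolation of the constraints is not shown here.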
Abstract:
A computer-implemented method for estimating a pose of an articulated object model (4), wherein the articulated object model (4) is a computer based 3D model (1) of a real world object (14) observed by one or more source cameras (9), and wherein the pose of the articulated object model (4) is defined by the spatial location of joints (2) of the articulated object model (4), comprises the steps of obtaining a source image (10) from a video stream; processing the source image (10) to extract a source image segment (13); maintaining, in a database, a set of reference silhouettes, each being associated with an articulated object model (4) and a corresponding reference pose; comparing the source image segment (13) to the reference silhouettes and selecting reference silhouettes by taking into account, for each reference silhouette, a matching error that indicates how closely the reference silhouette matches the source image segment (13) and/or a coherence error that indicates how much the reference pose is consistent with the pose of the same real world object (14) as estimated from a preceding source image (10); retrieving the corresponding reference poses of the articulated object models (4); and computing an estimate of the pose of the articulated object model (4) from the reference poses of the selected reference silhouettes.
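The selection and estimation steps can be sketched as follows. This is a minimal illustration under stated assumptions: silhouettes are flattened binary masks, the matching error is the fraction of differing pixels, the coherence error is the mean absolute joint displacement from the previous pose, the two are summed with a hypothetical `weight`, and the estimate is the plain average of the `k` best reference poses. None of these specific choices comes from the abstract.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Reference:
    silhouette: Sequence[int]  # flattened binary mask (assumed representation)
    pose: Sequence[float]      # flattened joint coordinates


def matching_error(segment, silhouette):
    """Fraction of pixels where the segment and the silhouette disagree."""
    return sum(a != b for a, b in zip(segment, silhouette)) / len(segment)


def coherence_error(pose, previous_pose):
    """Mean absolute joint displacement from the previously estimated pose."""
    return sum(abs(p - q) for p, q in zip(pose, previous_pose)) / len(pose)


def estimate_pose(segment, references: List[Reference], previous_pose,
                  k=2, weight=0.5):
    """Select the k references with the lowest combined error and average
    their poses."""
    def combined(ref):
        return (matching_error(segment, ref.silhouette)
                + weight * coherence_error(ref.pose, previous_pose))

    best = sorted(references, key=combined)[:k]
    n = len(best[0].pose)
    return [sum(ref.pose[i] for ref in best) / len(best) for i in range(n)]


if __name__ == "__main__":
    refs = [
        Reference([1, 1, 0, 0], [0.0, 0.0]),
        Reference([1, 1, 1, 0], [2.0, 2.0]),
        Reference([0, 0, 1, 1], [10.0, 10.0]),
    ]
    print(estimate_pose([1, 1, 0, 0], refs, previous_pose=[0.0, 0.0]))
```

Setting `weight` to zero reduces the selection to pure silhouette matching, which corresponds to the "and/or" in the description above.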
Abstract:
An animation system can vectorize an image by generating, from an input drawing, a dataset corresponding to vector and digital representations of the input drawing such that a rendering engine could render an image having features in common with the input drawing from the representations, as a collection of strokes and/or objects rather than merely a collection of pixels having pixel color values. A vectorizer might receive an input image, generate a particle clustering data structure from a digitization of the input image, generate a stroke list, wherein strokes in the stroke list correspond to clusters of particles represented in the particle clustering data structure, generate a graph structure that represents connections between strokes on the stroke list, and determine additional characteristics of a stroke beyond the path of the stroke, additional characteristics being stored such that they correspond to strokes. The strokes might be generated using global topology information.
Abstract:
Methods and apparatus for visual saliency estimation for images and video are described. In an embodiment, a process includes decomposing, by a processor, an image into elements, wherein each element includes at least one pixel. The processor then calculates a first image measure indicative of each element's uniqueness in the image on a per element basis, and a second image measure indicative of each element's spatial distribution in the image on a per element basis. A per element saliency measure is provided by combining the first image measure and the second image measure, or by utilizing the first image measure, or by utilizing the second image measure.
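One way the two per-element measures might be computed and combined is sketched below. The formulas are assumptions chosen for illustration: uniqueness is a position-weighted sum of squared color differences, spatial distribution is the color-weighted spatial variance, and the two are combined as `u * exp(-k * d)` after normalization. The Gaussian widths `sigma_p` and `sigma_c` and the factor `k` are hypothetical parameters; the abstract does not specify any of these.

```python
import math


def _sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _normalize(values):
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def _weights(features, i, sigma):
    """Normalized Gaussian weights of every element relative to element i."""
    raw = [math.exp(-_sqdist(features[i], f) / (2 * sigma ** 2))
           for f in features]
    total = sum(raw)
    return [w / total for w in raw]


def saliency(colors, positions, sigma_p=0.25, sigma_c=0.1, k=3.0):
    """Per-element saliency from a uniqueness and a distribution measure."""
    n = len(colors)
    uniqueness, distribution = [], []
    for i in range(n):
        # First measure: how different is this color from nearby elements?
        wp = _weights(positions, i, sigma_p)
        uniqueness.append(
            sum(wp[j] * _sqdist(colors[i], colors[j]) for j in range(n)))
        # Second measure: how widely are similar colors spread in the image?
        wc = _weights(colors, i, sigma_c)
        mean = [sum(wc[j] * positions[j][d] for j in range(n))
                for d in range(2)]
        distribution.append(
            sum(wc[j] * _sqdist(positions[j], mean) for j in range(n)))
    u, d = _normalize(uniqueness), _normalize(distribution)
    # Unique and spatially compact elements receive the highest saliency.
    return [u[i] * math.exp(-k * d[i]) for i in range(n)]


if __name__ == "__main__":
    # 3x3 grid of gray elements with a single red element in the center.
    positions = [(x / 2, y / 2) for y in range(3) for x in range(3)]
    colors = [(0.5, 0.5, 0.5)] * 9
    colors[4] = (1.0, 0.0, 0.0)
    print(saliency(colors, positions))
```

Using only the first or only the second measure, as the description allows, would amount to returning `u` or `exp(-k * d)` alone.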
Abstract:
An integrated system and method for content-aware video retargeting are provided. An interactive framework combines key frame-based constraint editing with numerous automatic algorithms for video analysis. This combination gives content producers a high level of control of the retargeting process. One component of the framework is a non-uniform, pixel-accurate warp to the target resolution that considers automatic as well as interactively-defined features. Automatic features comprise video saliency, edge preservation at the pixel resolution, and scene cut detection to enforce bilateral temporal coherence. Additional high-level constraints can be added by the producer to achieve a consistent scene composition across arbitrary output formats. Advantageously, embodiments of the invention provide a better visual result for retargeted video when compared to using conventional techniques.
Abstract:
Techniques are provided for content-aware video retargeting. An interactive framework combines key frame-based constraint editing with numerous automatic algorithms for video analysis. This combination gives content producers a high level of control of the retargeting process. One component of the framework is a non-uniform, pixel-accurate warp to the target resolution that considers automatic as well as interactively-defined features. Automatic features comprise video saliency, edge preservation at the pixel resolution, and scene cut detection to enforce bilateral temporal coherence. Additional high level constraints can be added by the producer to achieve a consistent scene composition across arbitrary output formats. Advantageously, embodiments of the invention provide a better visual result for retargeted video when compared to using conventional techniques.
Abstract:
Systems, methods and articles of manufacture are disclosed for performing scalable video coding. In one embodiment, non-linear functions are used to predict source video data using retargeted video data. Differences may be determined between the predicted video data and the source video data. The retargeted video data, the non-linear functions, and the differences may be jointly encoded into a scalable bitstream. The scalable bitstream may be transmitted and selectively decoded to produce output video for one of a plurality of predefined target platforms.
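The predict, difference, and selectively decode steps can be sketched on a one-dimensional signal. This is a toy illustration under stated assumptions: the non-linear function is a hypothetical power-law index warp built by `make_warp`, the "bitstream" is a plain dictionary rather than an entropy-coded stream, and the target names `"base"` and `"full"` are invented for the example.

```python
def make_warp(src_len, dst_len, exponent=1.5):
    """Hypothetical non-linear map from source indices to retargeted indices."""
    def warp(i):
        t = i / (src_len - 1)
        return min(dst_len - 1, int(t ** exponent * (dst_len - 1) + 0.5))
    return warp


def predict_source(retargeted, warp, length):
    """Predict the source signal by sampling the retargeted one via the warp."""
    return [retargeted[warp(i)] for i in range(length)]


def encode(source, retargeted, warp):
    """Bundle the base layer with the prediction residual."""
    predicted = predict_source(retargeted, warp, len(source))
    residual = [s - p for s, p in zip(source, predicted)]
    return {"base": list(retargeted), "length": len(source),
            "residual": residual}


def decode(bitstream, warp, target):
    """Decode either the retargeted base layer or the full source signal."""
    if target == "base":
        return list(bitstream["base"])
    predicted = predict_source(bitstream["base"], warp, bitstream["length"])
    return [p + r for p, r in zip(predicted, bitstream["residual"])]


if __name__ == "__main__":
    source = [10, 12, 15, 20, 26, 33, 41, 50]
    retargeted = [10, 20, 33, 50]
    warp = make_warp(len(source), len(retargeted))
    stream = encode(source, retargeted, warp)
    print(decode(stream, warp, "base"))
    print(decode(stream, warp, "full"))
```

A device that only needs the retargeted video reads the base layer and ignores the residual, while a full-resolution device applies the same warp and adds the residual back.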