Abstract:
Methods and apparatus for visual saliency estimation for images and video are described. In an embodiment, a process includes decomposing, by a processor, an image into elements, wherein each element includes at least one pixel. The processor then calculates a first image measure indicative of each element's uniqueness in the image on a per element basis, and a second image measure indicative of each element's spatial distribution in the image on a per element basis. A per element saliency measure is provided by combining the first image measure and the second image measure, or by utilizing the first image measure, or by utilizing the second image measure.
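The two per-element measures and their combination can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how uniqueness, spatial distribution, or their combination are computed, so the mean squared color distance, the color-weighted spatial variance, the product-with-exponential combination, and the parameters `sigma_c` and `k` are all assumptions. Elements are assumed to be given as a color tuple and a 2D position each.

```python
import math


def saliency(colors, positions, sigma_c=0.2, k=6.0):
    """Per-element saliency from uniqueness and spatial distribution.

    colors:    list of color tuples, one per element (hypothetical input format)
    positions: list of (x, y) tuples, one per element
    sigma_c, k: illustrative tuning parameters, not taken from the source
    """
    n = len(colors)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # First measure (assumed form): mean squared color distance to all elements.
    uniqueness = [
        sum(sqdist(colors[i], colors[j]) for j in range(n)) / n for i in range(n)
    ]

    # Second measure (assumed form): spatial variance of similarly colored
    # elements. A low value means the color is spatially compact.
    distribution = []
    for i in range(n):
        w = [
            math.exp(-sqdist(colors[i], colors[j]) / (2 * sigma_c ** 2))
            for j in range(n)
        ]
        total = sum(w)
        w = [x / total for x in w]
        mean = [sum(w[j] * positions[j][d] for j in range(n)) for d in range(2)]
        distribution.append(
            sum(w[j] * sqdist(positions[j], mean) for j in range(n))
        )

    def normalize(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    u, d = normalize(uniqueness), normalize(distribution)
    # Assumed combination: unique and spatially compact elements score high.
    return [u[i] * math.exp(-k * d[i]) for i in range(n)]
```

Using only `u` or only `d` corresponds to the alternatives in which a single measure provides the saliency value.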
Abstract:
Techniques are disclosed for controlling robot pixels to display a visual representation of a real-world video texture. Mobile robots with controllable color may generate visual representations of the real-world video texture to create an effect like fire, sunlight on water, leaves fluttering in sunlight, a wheat field swaying in the wind, crowd flow in a busy city, and clouds in the sky. The robot pixels function as a display device for a given allocation of robot pixels. Techniques are also disclosed for distributed collision avoidance among multiple non-holonomic and holonomic robots to guarantee smooth and collision-free motions.
Abstract:
Techniques are provided for content-aware video retargeting. An interactive framework combines key frame-based constraint editing with numerous automatic algorithms for video analysis. This combination gives content producers a high level of control of the retargeting process. One component of the framework is a non-uniform, pixel-accurate warp to the target resolution that considers automatic as well as interactively-defined features. Automatic features comprise video saliency, edge preservation at the pixel resolution, and scene cut detection to enforce bilateral temporal coherence. Additional high level constraints can be added by the producer to achieve a consistent scene composition across arbitrary output formats. Advantageously, embodiments of the invention provide a better visual result for retargeted video when compared to using conventional techniques.
Abstract:
Techniques are presented for controlling the amount of temporal noise in certain animation sequences. A sketchy animation sequence is received as input in digital form and used to create an altered version of the same animation with temporal coherence enforced down to the stroke level, reducing the perceived noise. The amount of reduction is variable and can be controlled via a single parameter to achieve a desired artistic effect.
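The idea of a single parameter that controls noise reduction can be sketched as a blend between the original stroke and a temporally smoothed copy. This is only an illustration under strong assumptions: the abstract does not describe how coherence is enforced, so the three-frame moving average, the function name `reduce_temporal_noise`, and the input format (one stroke already tracked across frames, with the same number of points in every frame) are all hypothetical.

```python
def reduce_temporal_noise(frames, amount):
    """Blend a tracked stroke with its temporally smoothed version.

    frames: list over time; each entry is a list of (x, y) points of the
            same stroke (assumed to be in point-to-point correspondence)
    amount: 0.0 keeps the original noise, 1.0 applies full smoothing
    """
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be in [0, 1]")

    result = []
    for t, stroke in enumerate(frames):
        # Assumed smoothing: average over the previous, current, and next frame.
        window = frames[max(0, t - 1): t + 2]
        new_stroke = []
        for i, (x, y) in enumerate(stroke):
            sx = sum(f[i][0] for f in window) / len(window)
            sy = sum(f[i][1] for f in window) / len(window)
            new_stroke.append(
                ((1 - amount) * x + amount * sx, (1 - amount) * y + amount * sy)
            )
        result.append(new_stroke)
    return result
```

Intermediate values of `amount` keep some of the hand-drawn jitter, which is the kind of artistic control the abstract refers to.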
Abstract:
Methods and systems for generating stereoscopic content with granular control over binocular disparity based on multi-perspective imaging from representations of light fields are provided. The stereoscopic content is computed as piecewise continuous cuts through a representation of a light field, minimizing an energy reflecting prescribed parameters such as depth budget, maximum binocular disparity gradient, and desired stereoscopic baseline. The methods and systems may be used for efficient and flexible stereoscopic post-processing, such as reducing excessive binocular disparity while preserving perceived depth or retargeting already captured scenes to various view settings. Moreover, such methods and systems are highly useful for content creation in the context of multi-view autostereoscopic displays and provide a novel conceptual approach to stereoscopic image processing and post-production.
Abstract:
A method is provided for an optimized stereoscopic camera with low processing overhead, especially suitable for real-time applications. By constructing a viewer-centric and scene-centric model, the mapping of scene depth to perceived depth may be defined as an optimization problem, for which a solution is analytically derived based on constraints to stereoscopic camera parameters including interaxial separation and convergence distance. The camera parameters may thus be constrained prior to rendering to maintain a desired perceived depth volume around a stereoscopic display, for example to ensure user comfort or provide artistic effects. To compensate for sudden scene depth changes due to unpredictable camera or object movements, as may occur with real-time applications such as video games, the constraints may also be temporally interpolated to maintain a linearly corrected and approximately constant perceived depth range over time.
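To make the mapping from scene depth to perceived depth concrete, the sketch below solves for interaxial separation and convergence distance in closed form so that a scene depth range lands on a chosen perceived depth range. It relies on textbook stereo geometry, which is an assumption here and not the derivation of the described method: screen parallax p = M·f·b·(1/c − 1/Z) for a shifted-sensor parallel rig, and perceived depth Z_p = v·e/(e − p) for a viewer with eye separation e at distance v. All function and parameter names are hypothetical.

```python
def parallax_for_perceived_depth(zp, eye_sep, view_dist):
    """Screen parallax that makes a point appear at depth zp from the viewer."""
    return eye_sep * (1.0 - view_dist / zp)


def solve_camera(z_near, z_far, zp_near, zp_far,
                 eye_sep, view_dist, focal, magnification):
    """Return (interaxial, convergence) mapping [z_near, z_far] in the scene
    onto [zp_near, zp_far] in perceived depth, under the assumed model."""
    a = focal * magnification
    p_near = parallax_for_perceived_depth(zp_near, eye_sep, view_dist)
    p_far = parallax_for_perceived_depth(zp_far, eye_sep, view_dist)
    # Subtracting the two parallax equations eliminates the convergence term.
    interaxial = (p_far - p_near) / (a * (1.0 / z_near - 1.0 / z_far))
    # ratio = interaxial / convergence; zero would mean convergence at infinity.
    ratio = p_near / a + interaxial / z_near
    convergence = interaxial / ratio
    return interaxial, convergence


def perceived_depth(z, interaxial, convergence,
                    eye_sep, view_dist, focal, magnification):
    """Forward model: scene depth z to perceived depth for the given rig."""
    p = focal * magnification * interaxial * (1.0 / convergence - 1.0 / z)
    return view_dist * eye_sep / (eye_sep - p)
```

Because the solution is a pair of closed-form expressions, it can be re-evaluated every frame at negligible cost, and the target range or the resulting parameters could be interpolated over time when the scene depth changes abruptly.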
Abstract:
A computer-implemented method for estimating a pose of an articulated object model (4), wherein the articulated object model (4) is a computer based 3D model (1) of a real world object (14) observed by one or more source cameras (9), and wherein the pose of the articulated object model (4) is defined by the spatial location of joints (2) of the articulated object model (4), comprises the steps of obtaining a source image (10) from a video stream; processing the source image (10) to extract a source image segment (13); maintaining, in a database, a set of reference silhouettes, each being associated with an articulated object model (4) and a corresponding reference pose; comparing the source image segment (13) to the reference silhouettes and selecting reference silhouettes by taking into account, for each reference silhouette, a matching error that indicates how closely the reference silhouette matches the source image segment (13) and/or a coherence error that indicates how much the reference pose is consistent with the pose of the same real world object (14) as estimated from a preceding source image (10); retrieving the corresponding reference poses of the articulated object models (4); and computing an estimate of the pose of the articulated object model (4) from the reference poses of the selected reference silhouettes.
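The selection step can be sketched as scoring each reference silhouette by a matching error plus a weighted coherence error, keeping the best few, and combining their reference poses. The concrete error definitions (fraction of differing mask pixels, mean joint distance), the weighted sum, and the plain averaging of the selected poses are assumptions made for illustration; the abstract does not specify them. The dictionary keys `silhouette` and `pose` are hypothetical.

```python
import math


def matching_error(segment, silhouette):
    """Fraction of pixels that differ between two equally sized binary masks."""
    flat_a = [p for row in segment for p in row]
    flat_b = [p for row in silhouette for p in row]
    return sum(a != b for a, b in zip(flat_a, flat_b)) / len(flat_a)


def coherence_error(pose, previous_pose):
    """Mean Euclidean distance between corresponding joints of two poses."""
    return sum(math.dist(a, b) for a, b in zip(pose, previous_pose)) / len(pose)


def estimate_pose(segment, references, previous_pose, weight=0.5, keep=2):
    """Select the best-scoring references and average their poses.

    references: list of dicts with a binary mask under "silhouette" and a
                list of joint positions under "pose" (hypothetical layout)
    weight:     relative influence of the coherence error (illustrative)
    keep:       number of reference silhouettes to select (illustrative)
    """
    def score(ref):
        return (matching_error(segment, ref["silhouette"])
                + weight * coherence_error(ref["pose"], previous_pose))

    selected = sorted(references, key=score)[:keep]
    n_joints = len(selected[0]["pose"])
    dims = len(selected[0]["pose"][0])
    # Assumed estimate: per-joint mean over the selected reference poses.
    return [
        tuple(sum(r["pose"][j][d] for r in selected) / len(selected)
              for d in range(dims))
        for j in range(n_joints)
    ]
```

Setting `weight` to zero uses the matching error alone, which corresponds to the "and/or" in the abstract.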
Abstract:
An animation system can vectorize an image by generating, from an input drawing, a dataset corresponding to vector and digital representations of the input drawing such that a rendering engine could render an image having features in common with the input drawing from the representations, as a collection of strokes and/or objects rather than merely a collection of pixels having pixel color values. A vectorizer might receive an input image, generate a particle clustering data structure from a digitization of the input image, generate a stroke list, wherein strokes in the stroke list correspond to clusters of particles represented in the particle clustering data structure, generate a graph structure that represents connections between strokes on the stroke list, and determine additional characteristics of a stroke beyond the path of the stroke, additional characteristics being stored such that they correspond to strokes. The strokes might be generated using global topology information.