摘要:
A method and apparatus for tracking multiple objects in a video sequence. The method defines a group of objects as a configuration, selects a configuration for a current video frame, predicts a configuration using a two-level process and computes the likelihood of the configuration. Using this method in an iterative manner on a sequence of frames, tracks the object group through the sequence.
摘要:
An image-based tele-presence system forward warps video images selected from a plurality fixed imagers using local depth maps and merges the warped images to form high quality images that appear as seen from a virtual position. At least two images, from the images produced by the imagers, are selected for creating a virtual image. Depth maps are generated corresponding to each of the selected images. Selected images are warped to the virtual viewpoint using warp parameters calculated using corresponding depth maps. Finally the warped images are merged to create the high quality virtual image as seen from the selected viewpoint. The system employs a video blanket of imagers, which helps both optimize the number of imagers and attain higher resolution. In an exemplary video blanket, cameras are deployed in a geometric pattern on a surface.
摘要:
A method of generating a dynamic depth map for a sequence of images from multiple cameras models a scene as a collection of 3D piecewise planar surface patches induced by color based image segmentation. This representation is continuously estimated using an incremental formulation in which the 3D geometric, motion, and global visibility constraints are enforced over space and time. The proposed algorithm optimizes a cost function that incorporates the spatial color consistency constraint and a smooth scene motion model.
摘要:
A system that tracks one or more moving objects in a sequence of video images employs a dynamic layer representation to represent the objects that are being tracked. The system concurrently estimates three components of the dynamic layer representation—layer segmentation, motion, and appearance—over time in a maximum a posteriori (MAP) framework. In order to enforce a global shape constraint and to maintain the layer segmentation over time, the subject invention imposes a prior constraint on parametric segmentation. In addition, the system uses a generalized Expectation-Maximization (EM) algorithm to compute an optimal solution. The system uses an object state that consists of representations of motion, appearance and ownership masks. The system applies a constant appearance model across multiple images in the video stream.
摘要:
A computer implemented method for detecting the presence of one or more pedestrians in the vicinity of the vehicle is disclosed. Imagery of a scene is received from at least one image capturing device. A depth map is derived from the imagery. A plurality of pedestrian candidate regions of interest (ROIs) is detected from the depth map by matching each of the plurality of ROIs with a 3D human shape model. At least a portion of the candidate ROIs is classified by employing a cascade of classifiers tuned for a plurality of depth bands and trained on a filtered representation of data within the portion of candidate ROIs to determine whether at least one pedestrian is proximal to the vehicle.
摘要:
A computer implemented method for detecting the presence of one or more pedestrians in the vicinity of the vehicle is disclosed. Imagery of a scene is received from at least one image capturing device. A depth map is derived from the imagery. A plurality of pedestrian candidate regions of interest (ROIs) is detected from the depth map by matching each of the plurality of ROIs with a 3D human shape model. At least a portion of the candidate ROIs is classified by employing a cascade of classifiers tuned for a plurality of depth bands and trained on a filtered representation of data within the portion of candidate ROIs to determine whether at least one pedestrian is proximal to the vehicle.
摘要:
A method and apparatus for accurately computing parallax information as captured by imagery of a scene. The method computes the parallax information of each point in an image by computing the parallax within windows that are offset with respect to the point for which the parallax is being computed. Additionally, parallax computations are performed over multiple frames of imagery to ensure accuracy of the parallax computation and to facilitate correction of occluded imagery.
摘要:
A method and apparatus for accurately computing parallax information as captured by imagery of a scene. The method computes the parallax information of each point in an image by computing the parallax within windows that are offset with respect to the point for which the parallax is being computed. Additionally, parallax computations are performed over multiple frames of imagery to ensure accuracy of the parallax computation and to facilitate correction of occluded imagery.
摘要:
The present invention provides a computer implemented process for detecting multi-view multi-pose objects. The process comprises training of a classifier for each intra-class exemplar, training of a strong classifier and combining the individual exemplar-based classifiers with a single objective function. This function is optimized using the two nested AdaBoost loops. The first loop is the outer loop that selects discriminative candidate exemplars. The second loop, the inner loop selects the discriminative candidate features on the selected exemplars to compute all weak classifiers for a specific position such as a view/pose. Then all the computed weak classifiers are automatically combined into a final classifier (strong classifier) which is the object to be detected.
摘要:
A method and apparatus for providing immersive surveillance wherein a remote security guard may monitor a scene using a variety of imagery sources that are rendered upon a model to provide a three-dimensional conceptual view of the scene. Using a view selector, the security guard may dynamically select a camera view to be displayed on his conceptual model, perform a walk through of the scene, identify moving objects and select the best view of those moving objects and so on.