Abstract:
A method and apparatus for processing images. A sequence of images for a scene is received from an imaging system. An object in the scene is detected using the sequence of images. A viewpoint of the imaging system is registered to a model of the scene using a region in the model of the scene in which a behavior of the object is expected to occur.
Abstract:
Described is a system for registering a viewpoint of an imaging sensor with respect to a geospatial model or map. An image of a scene of a geospatial region comprising an object is received as input. The image of the scene is captured by a sensor having a current sensor state x. Observation data related to the object's state is received, wherein the observation data comprises a behavior of the object given the geospatial region. An estimate of the current sensor state is generated using the probability of an observation from the observation data given the current sensor state x. Finally, the image of the scene is registered with a geospatial model or map based on the estimate of the current sensor state.
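The estimation step can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the abstract: the sensor state x is reduced to an integer pixel offset (dx, dy), the geospatial model is a small grid in which 1 marks road cells, the expected behavior is "vehicles drive on roads", and the names ROAD_MAP, likelihood, and estimate_state are hypothetical. The estimate is the candidate state that maximizes the product of per-observation likelihoods p(z | x).

```python
# Hypothetical 5x5 geospatial map: 1 marks road cells where a vehicle
# is expected to drive, 0 marks everything else. Indexed as [row][col].
ROAD_MAP = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

def likelihood(detection, state, p_on=0.9, p_off=0.1):
    """p(z | x): probability of one detection given a candidate sensor state.

    detection is an image-space (col, row) position of the object;
    state is an assumed (dx, dy) offset that maps image space to map space.
    """
    col, row = detection[0] + state[0], detection[1] + state[1]
    inside = 0 <= row < len(ROAD_MAP) and 0 <= col < len(ROAD_MAP[0])
    if inside and ROAD_MAP[row][col] == 1:
        return p_on   # object lands where the behavior is expected
    return p_off      # object lands off-road or outside the map

def estimate_state(detections, candidates):
    """Return the candidate state with the highest joint likelihood."""
    def joint(state):
        p = 1.0
        for detection in detections:
            p *= likelihood(detection, state)
        return p
    return max(candidates, key=joint)

# Detections trace an L-shaped track in image coordinates.
detections = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
candidates = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)]
best_state = estimate_state(detections, candidates)
```

In this toy setup only the offset (0, 2) places all five detections on road cells, so it wins the maximization. A real sensor state would also include rotation, scale, and perspective, and the search would not be an exhaustive grid.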
Abstract:
The present invention relates to a classifier cascade object detection system. The system operates by inputting an image patch into parallel feature generation modules, each of which is operable for extracting features from the image patch. The features are provided to an opportunistic classifier cascade having a series of classifier stages. The opportunistic classifier cascade is executed by progressively evaluating the features in each classifier in the cascade to produce a response, with each response progressively utilized by a decision function to generate a stage response for each classifier stage. If every stage response exceeds its stage threshold, the image patch is classified as a target object; if the stage response from any of the decision functions does not exceed its stage threshold, the image patch is classified as a non-target object.
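The accept/reject logic of such a cascade can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented design: the decision function is assumed to be a running sum of classifier responses, the feature modules are run sequentially rather than in parallel, and the name classify_patch and its parameters are hypothetical.

```python
from typing import Callable, List, Sequence

FeatureModule = Callable[[Sequence[float]], float]
Classifier = Callable[[List[float]], float]

def classify_patch(
    patch: Sequence[float],
    feature_modules: List[FeatureModule],
    stages: List[List[Classifier]],
    thresholds: List[float],
) -> bool:
    """Return True if the patch passes every stage (target object)."""
    # Each feature module extracts one feature from the same patch.
    features = [module(patch) for module in feature_modules]

    for classifiers, threshold in zip(stages, thresholds):
        # Assumed decision function: accumulate classifier responses.
        stage_response = 0.0
        for classifier in classifiers:
            stage_response += classifier(features)
        # Reject as soon as one stage fails to exceed its threshold.
        if stage_response <= threshold:
            return False
    return True

# Toy setup: two feature modules and two stages.
modules = [max, min]
stages = [
    [lambda f: f[0]],                    # stage 1 looks at the max feature
    [lambda f: f[0], lambda f: f[1]],    # stage 2 sums max and min
]
thresholds = [0.5, 1.0]
```

Because later stages run only when earlier ones pass, most non-target patches are rejected cheaply in the first stage, which is the usual motivation for a cascade.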
Abstract:
Described is a system for multi-class classifier threshold-offset estimation for visual object recognition. The system receives an input image with input features for classifying. A pair-wise classifier is trained for each pair of a plurality of object classes. A set of classification responses is generated, and a multi-class receiver-operating-characteristics (ROC) curve is computed for a set of threshold-offsets. An objective function of classification performance is computed from the ROC curve and optimized using particle swarm optimization (PSO) to generate a set of optimized threshold-offsets. The optimized threshold-offsets are then applied to the classification responses. The resulting classification responses are compared to a predetermined value to classify each input feature as belonging to one object class or another. Tuning the threshold-offsets with PSO improves classification performance in a visual object recognition system.
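The offset-tuning idea can be sketched with a small, self-contained PSO loop. Several simplifications are assumptions made here and not part of the abstract: per-class responses stand in for the pair-wise classifiers, plain accuracy replaces the ROC-derived objective, and the swarm constants (inertia 0.7, acceleration 1.5) as well as the names accuracy and pso_offsets are hypothetical choices.

```python
import random

def accuracy(offsets, responses, labels):
    """Fraction of samples whose offset-adjusted argmax matches the label."""
    correct = 0
    for response, label in zip(responses, labels):
        adjusted = [r + o for r, o in zip(response, offsets)]
        if adjusted.index(max(adjusted)) == label:
            correct += 1
    return correct / len(labels)

def pso_offsets(responses, labels, n_particles=20, n_iters=50, seed=0):
    """Search for per-class offsets that maximize accuracy."""
    rng = random.Random(seed)
    dim = len(responses[0])
    # The first particle starts at zero offsets, so the result can never
    # score below the untuned baseline.
    positions = [[0.0] * dim] + [
        [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(n_particles - 1)
    ]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]
    best_val = [accuracy(p, responses, labels) for p in positions]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                # Pull each particle toward its own best and the swarm best.
                velocities[i][d] = (
                    0.7 * velocities[i][d]
                    + 1.5 * rng.random() * (best_pos[i][d] - positions[i][d])
                    + 1.5 * rng.random() * (g_pos[d] - positions[i][d])
                )
                positions[i][d] += velocities[i][d]
            value = accuracy(positions[i], responses, labels)
            if value > best_val[i]:
                best_val[i], best_pos[i] = value, positions[i][:]
                if value > g_val:
                    g_val, g_pos = value, positions[i][:]
    return g_pos, g_val

# Toy responses in which class 0 is systematically over-scored.
responses = [[0.9, 0.5], [0.8, 0.5], [0.7, 0.5],
             [0.6, 0.5], [0.3, 0.5], [0.2, 0.5]]
labels = [0, 0, 1, 1, 1, 1]
baseline = accuracy([0.0, 0.0], responses, labels)
tuned_offsets, tuned_accuracy = pso_offsets(responses, labels)
```

PSO suits this kind of objective because accuracy is piecewise constant in the offsets, so gradient-based optimizers have nothing to follow.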
Abstract:
Described is a method and system for embedding unsupervised learning into three critical processing stages of the spatio-temporal visual stream. The system first receives input video comprising input video pixels representing at least one action and at least one object having a location. Microactions are generated from the input image using a set of motion sensitive filters. A relationship between the input video pixels and the microactions is then learned, and a set of spatio-temporal concepts is learned from the microactions. The system then learns to acquire new knowledge from the spatio-temporal concepts using mental imagery processes. Finally, a visual output is presented to a user based on the learned set of spatio-temporal concepts and the new knowledge to aid the user in visually comprehending the at least one action in the input video.
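The abstract does not specify the motion-sensitive filters, so the following is only a rough sketch of one possibility. It assumes a bank of four direction-tuned correlation filters applied to two consecutive grayscale frames, with the strongest response taken as a coarse "microaction" label; the names DIRECTIONS, direction_responses, and dominant_microaction are hypothetical.

```python
# Each filter is tuned to a one-pixel shift (dx, dy) between frames.
DIRECTIONS = {
    "right": (1, 0),
    "left": (-1, 0),
    "down": (0, 1),
    "up": (0, -1),
}

def direction_responses(prev_frame, curr_frame):
    """Correlate the previous frame with shifted copies of the current one."""
    height, width = len(prev_frame), len(prev_frame[0])
    responses = {}
    for name, (dx, dy) in DIRECTIONS.items():
        total = 0.0
        for y in range(height):
            for x in range(width):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    # High when brightness at (x, y) reappears at (nx, ny).
                    total += prev_frame[y][x] * curr_frame[ny][nx]
        responses[name] = total
    return responses

def dominant_microaction(prev_frame, curr_frame):
    """Label the frame pair with the direction of the strongest filter."""
    responses = direction_responses(prev_frame, curr_frame)
    return max(responses, key=responses.get)

# A single bright pixel moves one step to the right between frames.
frame_a = [[0, 0, 0],
           [1, 0, 0],
           [0, 0, 0]]
frame_b = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
label = dominant_microaction(frame_a, frame_b)
```

In a fuller pipeline, per-region labels like this would be the inputs from which higher-level spatio-temporal concepts are learned.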