Abstract:
An object recognition system is described that incorporates swarming classifiers with attention mechanisms. The object recognition system includes a cognitive map having a one-to-one relationship with an input image domain. The cognitive map records information that software agents utilize to focus a cooperative swarm's attention on regions likely to contain objects of interest. Multiple agents operate as a cooperative swarm to classify an object in the domain. Each agent is a classifier and is assigned a velocity vector to explore a solution space for object solutions. Each agent records the coordinates in multi-dimensional space of the best solution it has observed, and the swarm maintains a global best solution that stores the best location found among all agents. Each velocity vector thereafter changes to allow the swarm to concentrate on the vicinity of the object and classify the object when a classification level exceeds a preset threshold.
Abstract:
A method, an apparatus, and a computer program product for three-dimensional shape estimation using constrained disparity propagation are presented. An act of receiving a stereoscopic pair of images of an area occupied by at least one object is performed. Next, pattern regions and non-pattern regions are detected in the images. An initial estimate of spatial disparities between the pattern regions in the images is generated. The initial estimate is used to generate a subsequent estimate of the spatial disparities between the non-pattern regions. The subsequent estimate is used to generate further subsequent estimates of the spatial disparities using the disparity constraints until there is no change between the results of subsequent iterations, generating a final estimate of the spatial disparities. A disparity map of the area occupied by at least one object is generated from the final estimate of the three-dimensional shape.
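The iterate-until-no-change step can be illustrated with a minimal sketch. It is not the method claimed in the abstract: the neighbor-averaging rule, the `[d_min, d_max]` clamp standing in for the disparity constraints, and the name `propagate_disparity` are all assumptions made for illustration. Pixels in pattern regions carry an initial disparity, and pixels in non-pattern regions start as `None`.

```python
def propagate_disparity(disp, d_min, d_max):
    """Fill unknown disparities (None) from known 4-neighbors.

    Hypothetical stand-in for constrained propagation: each pass assigns
    every unknown pixel the mean of its known neighbors, clamped to
    [d_min, d_max], and stops when a pass changes nothing.
    """
    h, w = len(disp), len(disp[0])
    current = [row[:] for row in disp]
    while True:
        updated = [row[:] for row in current]
        for y in range(h):
            for x in range(w):
                if current[y][x] is not None:
                    continue  # already estimated, keep as is
                nbrs = [current[ny][nx]
                        for ny, nx in ((y - 1, x), (y + 1, x),
                                       (y, x - 1), (y, x + 1))
                        if 0 <= ny < h and 0 <= nx < w
                        and current[ny][nx] is not None]
                if nbrs:
                    est = sum(nbrs) / len(nbrs)
                    updated[y][x] = min(max(est, d_min), d_max)
        if updated == current:  # no change between iterations
            return current
        current = updated


# Two pattern pixels at the ends, two non-pattern pixels between them.
row = propagate_disparity([[4.0, None, None, 8.0]], d_min=0.0, d_max=16.0)
```

Each pass reads only the previous pass's values, so the result does not depend on scan order.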
Abstract:
An object recognition system is described that incorporates swarming classifiers. The swarming classifiers comprise a plurality of software agents configured to operate as a cooperative swarm to classify an object in a domain as seen from multiple viewpoints. Each agent is a complete classifier and is assigned an initial velocity vector to explore a solution space for object solutions. Each agent is configured to perform an iteration, the iteration being a search in the solution space for potential solution optima, where each agent keeps track of the coordinates in multi-dimensional space associated with the best solution it has observed (pbest) and of a global best solution (gbest), which stores the best location found among all agents. Each velocity vector changes towards pbest and gbest, allowing the cooperative swarm to concentrate on the vicinity of the object and classify the object.
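The pbest/gbest update described above matches the general shape of particle swarm optimization. The following is a minimal sketch of a textbook PSO loop, not the system's actual implementation. The `confidence` callable stands in for a classifier's output at a location, and the inertia weight `w`, the coefficients `c1` and `c2`, and the name `pso_search` are assumptions chosen for illustration.

```python
import random


def pso_search(confidence, bounds, n_agents=20, n_iters=100,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `confidence` inside `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Each agent starts at a random position with a small random velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    vel = [[rng.uniform(-1.0, 1.0) * 0.1 * (hi - lo) for lo, hi in bounds]
           for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    pbest_val = [confidence(p) for p in pos]
    best = max(range(n_agents), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[best][:], pbest_val[best]

    for _ in range(n_iters):
        for i in range(n_agents):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity is pulled toward the agent's pbest and the gbest.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = confidence(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Synthetic confidence surface peaking at (30, 40) in a 100 x 100 image.
def peak(p):
    return -((p[0] - 30.0) ** 2 + (p[1] - 40.0) ** 2)


location, score = pso_search(peak, [(0.0, 100.0), (0.0, 100.0)])
```

In a detector, the search dimensions would typically include position and scale of the analysis window, and the object would be declared found once the confidence at gbest passes a chosen threshold.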
Abstract:
A method and system for video-content based retrieval is described. A query video depicting an activity is processed using interest point selection to find locations in the video that are relevant to that activity. A set of spatio-temporal descriptors such as self-similarity and 3-D SIFT are calculated within a local neighborhood of the set of interest points. An indexed video database containing videos similar to the query video is searched using the set of descriptors to obtain a set of candidate videos. The videos in the video database are indexed hierarchically using a vocabulary tree or other hierarchical indexing mechanism.
Abstract:
The present invention relates to an object detection and behavior recognition system using three-dimensional motion data. The system receives three-dimensional (3D) motion data of a scene from at least one sensor, such as a LIDAR sensor. An object is identified in the 3D motion data. Thereafter, an object track is extracted, the object track being indicative of object motion in the scene over time. Through Dynamic Time Warping (DTW) or other comparison techniques, the object track is compared to a database to identify the behavior of the object based on its object track.
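As a rough illustration of the comparison step, the sketch below computes a classic Dynamic Time Warping distance between two object tracks and labels a query track with its nearest stored behavior. The track format (lists of 3D points), the function names, and the nearest-neighbor decision rule are assumptions. The abstract does not specify how the database is organized.

```python
import math


def dtw_distance(track_a, track_b):
    """Classic O(n*m) DTW cost between two sequences of 3D points."""
    n, m = len(track_a), len(track_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(track_a[i - 1], track_b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def classify_track(track, labeled_tracks):
    """Return the label whose reference track has the lowest DTW cost."""
    return min(labeled_tracks,
               key=lambda label: dtw_distance(track, labeled_tracks[label]))


# Hypothetical behavior database with one reference track per label.
database = {
    "straight": [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)],
    "turn": [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 2, 0), (1, 3, 0)],
}
# Same path as "straight", but the object pauses twice along the way.
slow = [(0, 0, 0), (0, 0, 0), (1, 0, 0), (2, 0, 0),
        (2, 0, 0), (3, 0, 0), (4, 0, 0)]
behavior = classify_track(slow, database)
```

DTW tolerates differences in speed along the same path, which is why the paused track still aligns with the "straight" reference at zero cost.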
Abstract:
Described is a system and method for detecting elevated structures, such as bridges and overpasses, in point cloud data. A set of data from a three-dimensional point cloud of a landscape is received by the system. The set of data points comprises inlier data points and outlier data points. The inlier data points in the three-dimensional point cloud data are identified and combined into at least one segment. The segment is converted into an image comprising at least one image level. Each image level is processed with an edge detection algorithm to detect elevated edges. The elevated edges are vectorized to identify an elevated structure of interest in the landscape. The present invention is useful in applications that require three-dimensional sensing systems, such as autonomous navigation and surveillance applications.
Abstract:
Described is a knowledge-enhanced compressive imaging system. The system first initializes a compressive measurement basis set and a measurement matrix using task- and scene-specific prior knowledge. An image captured using the imaging mode of the dual-mode sensor is then sampled to extract context knowledge. The compressive measurement basis set and the measurement matrix are adapted using the extracted context knowledge and the prior knowledge. Task-relevant compressive measurements of the image are performed using the compressive measurement mode of the dual-mode sensor, and compressive reconstruction of the image is performed. Finally, a task and context optimized signal representation of the image is generated.
Abstract:
Described is a behavior recognition system for detecting the behavior of objects in a scene. The system comprises a semantic object stream module for receiving a video stream having at least two frames and detecting objects in the video stream. Also included is a group organization module for utilizing the detected objects from the video stream to detect a behavior of the detected objects. The group organization module further comprises an object group stream module for spatially organizing the detected objects to have relative spatial relationships. The group organization module also comprises a group action stream module for modeling a temporal structure of the detected objects. The temporal structure is an action of the detected objects between the two frames, whereby through detecting, organizing and modeling actions of objects, a user can detect the behavior of the objects.
Abstract:
Described is a system for object recognition in colorized point clouds. The system includes an implicit geometry engine that is configured to receive three-dimensional (3D) colorized cloud point data regarding a 3D object of interest and to convert the cloud point data into implicit representations. The engine also generates geometric features. A geometric grammar block is included to generate object cues and recognize geometric objects using geometric tokens and grammars based on object taxonomy. A visual attention cueing block is included to generate object cues based on 3D geometric properties. Finally, an object recognition block is included to perform a local search for objects using cues from the cueing block and the geometric grammar block and to classify the 3D object of interest as a particular object upon a classifier reaching a predetermined threshold.
Abstract:
A method and system for a directed area search using cognitive swarm vision and cognitive Bayesian reasoning is disclosed. The system comprises a domain knowledge database, a top-down reasoning module, and a bottom-up module. The domain knowledge database is configured to store Bayesian network models comprising visual features and observables associated with various sets of entities. The top-down module is configured to receive a search goal, generate a plan of action using Bayesian network models, and partition the plan into a set of tasks/observables to be located in the imagery. The bottom-up module is configured to select relevant feature/attention models for the observables, and search the visual imagery using a cognitive swarm for the at least one observable. The system further provides for operator feedback and updating of the domain knowledge database to perform better future searches.