Abstract:
An image is processed by a sensed-feature-based classifier to generate a list of objects assigned to classes. The most prominent objects (those whose classification is most likely to be reliable) are selected for range estimation and interpolation. Based on the range estimation and interpolation, the sensed features are converted to physical features for each object. That subset of objects is then run through a physical-feature-based classifier that re-classifies the objects. The objects and their range estimates are then re-run iteratively through range estimation and interpolation, sensed-feature-to-physical-feature conversion, and physical-feature-based classification to progressively increase the reliability of both the classification and the range estimation. The iterations are halted when the reliability reaches a predetermined confidence threshold. In a preferred embodiment, a next subset of objects having the next highest prominence in the same image is selected and the entire iterative process is repeated. This set of iterations includes evaluation of both the first and second subsets of objects. The process can be repeated until all objects have been classified.
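The iterate-until-confident loop in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not say how range is estimated, which physical features are used, or how reliability is measured. The sketch assumes a pinhole-camera conversion from pixel height to metres, takes the range estimator and the physical-feature classifier as caller-supplied functions (`estimate_range` and `classify`, both hypothetical), and uses the lowest per-object confidence as the reliability measure.

```python
def pixel_to_physical(pixel_extent, range_m, focal_length_px):
    """Pinhole-camera conversion of a sensed extent (pixels) to metres (assumed model)."""
    return pixel_extent * range_m / focal_length_px


def iterate_classification(objects, estimate_range, classify, focal_length_px,
                           confidence_threshold, max_iterations=10):
    """Repeat range estimation, feature conversion and re-classification.

    objects        : list of dicts with at least 'pixel_height' and 'label'
    estimate_range : hypothetical callable, obj -> range in metres
    classify       : hypothetical callable, height in metres -> (label, confidence)
    Returns (objects, reliability, iterations_used).
    """
    if not objects:
        return objects, 0.0, 0
    reliability = 0.0
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        confidences = []
        for obj in objects:
            # Range estimate may depend on the object's current label.
            obj["range_m"] = estimate_range(obj)
            # Sensed feature -> physical feature.
            obj["height_m"] = pixel_to_physical(
                obj["pixel_height"], obj["range_m"], focal_length_px)
            # Physical-feature-based re-classification.
            obj["label"], confidence = classify(obj["height_m"])
            confidences.append(confidence)
        # Assumed reliability measure: the least confident object.
        reliability = min(confidences)
        if reliability >= confidence_threshold:
            break
    return objects, reliability, iteration
```

The `max_iterations` cap is a safeguard added for the sketch so that the loop terminates even if the threshold is never reached; the abstract itself only describes halting on the confidence threshold.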
Abstract:
A system and method detect the intent and/or motivation of two or more persons or other animate objects in a video scene. In one embodiment, the system forms a blob of the two or more persons, draws a bounding box around said blob, calculates an entropy value for said blob, and compares that entropy value to a threshold to determine if the two or more persons are involved in a fight or other altercation.
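The blob, bounding box, entropy and threshold steps can be sketched as follows. The abstract does not say which quantity the entropy is computed over or which side of the threshold indicates an altercation, so this sketch assumes Shannon entropy over per-pixel values (for example quantized motion magnitudes) inside the bounding box, and assumes that a value above the threshold flags an altercation. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def bounding_box(mask):
    """Return (top, left, bottom, right) of the non-zero cells of a binary mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask contains no blob pixels")
    return min(rows), min(cols), max(rows), max(cols)


def is_altercation(mask, pixel_values, threshold):
    """Assumed rule: entropy of the values inside the blob's box exceeds threshold."""
    top, left, bottom, right = bounding_box(mask)
    inside = [pixel_values[r][c]
              for r in range(top, bottom + 1)
              for c in range(left, right + 1)]
    return shannon_entropy(inside) > threshold
```

A uniform patch yields zero entropy, while a patch whose values are all different yields the maximum for its size, so the threshold would have to be tuned on labeled footage.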
Abstract:
In an embodiment, one or more sequences of learning video data are provided. The learning video sequences include an action. One or more features of the action are extracted from the one or more sequences of learning video data. Thereafter, a reception of a sequence of operational video data is enabled, and an extraction of the one or more features of the action from the sequence of operational video data is enabled. A comparison is then enabled between the extracted one or more features of the action from the one or more sequences of learning video data and the one or more features of the action from the sequence of operational video data. In an embodiment, this comparison allows the determination of whether the action is present in the operational video data.
Abstract:
A system for tracking objects across an area having a network of cameras with overlapping and non-overlapping fields of view. The system may use a combination of color, shape, texture and/or multi-resolution histograms for object representation or target modeling for the tracking of an object from one camera to another. The system may include user and output interfacing.
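As an illustration of histogram-based target modeling, the sketch below compares two single-channel histograms with histogram intersection. The abstract does not name a similarity measure, so histogram intersection is an assumption, as are the function names and the 8-bit value range.

```python
def normalized_histogram(values, bins, max_value=256):
    """Histogram of integer values in [0, max_value), normalized to sum to 1."""
    if not values:
        raise ValueError("cannot build a histogram from no values")
    hist = [0] * bins
    for v in values:
        hist[min(v * bins // max_value, bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(target_hist, candidate_hists):
    """Index of the candidate whose histogram is most similar to the target."""
    return max(range(len(candidate_hists)),
               key=lambda i: histogram_intersection(target_hist, candidate_hists[i]))
```

In a hand-off between cameras, `target_hist` would be built from the object seen in the first camera and `candidate_hists` from the detections in the second; a multi-resolution variant would repeat the comparison at several bin counts or image scales.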
Abstract:
Systems and methods for transforming two-dimensional image data into a 3D dense range map are disclosed. An illustrative method may include the steps of acquiring at least one image frame from an image sensor, selecting at least one region of interest within the image frame, determining the geo-location of three or more reference points within each selected region of interest, and transforming 2D image domain data from each selected region of interest into a 3D dense range map containing physical features of one or more objects within the image frame. The 3D dense range map can be used to calculate physical feature vectors of objects disposed within each defined region of interest. An illustrative video surveillance system may include an image sensor adapted to acquire images from at least one region of interest, a graphical user interface for displaying images acquired from the image sensor within an image frame, and a processor for determining the geo-location of one or more objects within the image frame. The processor can be configured to run an algorithm or routine adapted to transform two-dimensional data received from the image sensor into a 3D range map containing physical features of one or more objects within the image frame.
Abstract:
Methods and systems for the unsupervised learning of events contained within a video sequence, including apparatus and interfaces for implementing such systems and methods, are disclosed. An illustrative method in accordance with an exemplary embodiment of the present invention may include the steps of providing a behavioral analysis engine, initiating a training phase mode within the behavioral analysis engine and obtaining a feature vector including one or more parameters relating to an object located within an image sequence, and then analyzing the feature vector to determine a number of possible event candidates. The behavioral analysis engine can be configured to prompt the user to confirm whether an event candidate is a new event, an existing event, or an outlier. Once trained, a testing/operational phase mode of the behavioral analysis engine can be further implemented to detect the occurrence of one or more learned events, if desired.
Abstract:
A multi-spectral imaging surveillance system and method in which a plurality of imaging cameras is associated with a data-processing apparatus. A module can be provided, which resides in a memory of said data-processing apparatus. The module performs fusion of a plurality of images respectively generated by varying imaging cameras among said plurality of imaging cameras. Fusion of the images is based on a plurality of parameters indicative of environmental conditions in order to achieve enhanced imaging surveillance thereof. The final fused images are the result of two parts: an image fusion portion and a knowledge representation portion. For the final fusion, many operators can be utilized, which can be applied between the image fusion result and the knowledge representation portion.
Abstract:
A face detection and recognition system having several arrays imaging a scene in the infrared and visible spectrums. The system may use weighted subtracting and thresholding to distinguish human skin in a sensed image. A feature selector may locate a face in the image. The image may be cropped with a frame or border incorporating essentially only the face. The border may be superimposed on images from an infrared imaging array and the visible imaging array. Sub-images containing the face may be extracted from within the border on the infrared and visible images, respectively, and compared with a database of face information to attain recognition of the face. Confidence levels of recognition for infrared and visible imaged faces may be established. A resultant confidence level of recognition may be determined from these confidence levels. Infrared lighting may be used as needed to illuminate the scene.
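The weighted subtracting and thresholding step can be sketched for two co-registered band images. Which bands are subtracted, the weight, and the threshold are not given in the abstract, so `band_a`, `band_b`, `weight` and `threshold` are placeholders chosen for illustration.

```python
def skin_mask(band_a, band_b, weight, threshold):
    """Binary mask: 1 where band_a - weight * band_b exceeds threshold.

    band_a, band_b : equally sized 2D lists of pixel intensities (assumed inputs)
    """
    return [[1 if a - weight * b > threshold else 0
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(band_a, band_b)]
```

For example, `skin_mask([[200, 50]], [[100, 60]], 0.5, 60)` marks only the first pixel, because 200 - 50 = 150 exceeds 60 while 50 - 30 = 20 does not.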
Abstract:
Systems and methods of establishing 3D coordinates from 2D image domain data acquired from an image sensor are disclosed. An illustrative method may include the steps of acquiring at least one image frame from the image sensor, selecting at least one polygon defining a region of interest within the image frame, measuring the distance from an origin of the image sensor to a number of reference points on the polygon, determining the distance between the selected reference points, and then determining 3D reference coordinates for one or more points on the polygon using a data fusion technique in which 2D image data from the image sensor is geometrically converted to 3D coordinates based at least in part on measured values of the reference points. An interpolation technique can be used to determine the 3D coordinates for all of the pixels within the polygon.
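One way to picture the interpolation step is barycentric interpolation inside a triangle whose three vertices have known 3D coordinates. This is an assumption made for illustration: the abstract does not name the interpolation technique, and linear interpolation in image space is only an approximation under perspective projection. The function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("reference points are collinear")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1 - w1 - w2


def interpolate_3d(pixel, reference_pixels, reference_points_3d):
    """Estimate the 3D coordinates of a pixel from three reference points.

    reference_pixels    : three (x, y) image positions
    reference_points_3d : the matching three (X, Y, Z) coordinates
    """
    weights = barycentric_weights(pixel, *reference_pixels)
    return tuple(sum(w * point[k] for w, point in zip(weights, reference_points_3d))
                 for k in range(3))
```

A polygon with more than three reference points could be split into triangles and each pixel interpolated within the triangle that contains it.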
Abstract:
A method for navigation comprises constructing a current map that includes two-dimensional or three-dimensional representations of an area, detecting one or more edge features on the current map, and generating a first fine-edge map based on the edge features. The method further comprises retrieving a historical map that includes two-dimensional or three-dimensional representations of the area, detecting one or more edge features on the historical map, and generating a second fine-edge map based on the edge features. Thereafter, a coarse version of the current map is generated from the first fine-edge map, and a coarse version of the historical map is generated from the second fine-edge map. The coarse versions of the current and historical maps are then correlated to determine a first position and orientation. The first fine-edge map is then correlated with the second fine-edge map to determine a second, more accurate, position and orientation.
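The coarse-then-fine correlation can be sketched on binary 2D edge maps. To stay short, the sketch estimates translation only and ignores orientation, which the method in the abstract also recovers. The block-wise coarsening, the overlap-count score, and all function names are assumptions made for illustration.

```python
def downsample(edge_map, factor):
    """Coarse map: a block is an edge if any fine cell inside it is an edge."""
    h, w = len(edge_map), len(edge_map[0])
    return [[int(any(edge_map[r][c]
                     for r in range(br, min(br + factor, h))
                     for c in range(bc, min(bc + factor, w))))
             for bc in range(0, w, factor)]
            for br in range(0, h, factor)]


def overlap_score(a, b, dr, dc):
    """Number of edge cells of a that land on edge cells of b when a is shifted by (dr, dc)."""
    h, w = len(b), len(b[0])
    score = 0
    for r, row in enumerate(a):
        for c, v in enumerate(row):
            rr, cc = r + dr, c + dc
            if v and 0 <= rr < h and 0 <= cc < w and b[rr][cc]:
                score += 1
    return score


def best_shift(a, b, candidates):
    """Candidate (dr, dc) with the highest overlap score (first one wins ties)."""
    return max(candidates, key=lambda s: overlap_score(a, b, s[0], s[1]))


def coarse_to_fine_shift(current, historical, factor=4, coarse_radius=2):
    """Estimate the shift that aligns the current edge map with the historical one."""
    # Stage 1: correlate the coarse maps over a small window of coarse shifts.
    coarse_cur = downsample(current, factor)
    coarse_hist = downsample(historical, factor)
    window = range(-coarse_radius, coarse_radius + 1)
    cdr, cdc = best_shift(coarse_cur, coarse_hist,
                          [(r, c) for r in window for c in window])
    # Stage 2: refine on the fine maps, searching only around the coarse answer.
    fine = range(-factor, factor + 1)
    return best_shift(current, historical,
                      [(cdr * factor + r, cdc * factor + c)
                       for r in fine for c in fine])
```

The coarse stage narrows the search to a small neighbourhood, so the fine stage evaluates far fewer candidate shifts than an exhaustive search over the full-resolution maps would.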