Abstract:
A system and method for efficiently locating in 3D an object of interest in a target scene using video information captured by a plurality of cameras. The system and method provide for multi-camera visual odometry wherein pose estimates are generated for each camera by all of the cameras in the multi-camera configuration. Furthermore, the system and method can locate and identify salient landmarks in the target scene using any of the cameras in the multi-camera configuration and compare the identified landmark against a database of previously identified landmarks. In addition, the system and method provide for the integration of video-based pose estimations with position measurement data captured by one or more secondary measurement sensors, such as, for example, Inertial Measurement Units (IMUs) and Global Positioning System (GPS) units.
Abstract:
A method and apparatus for providing three-dimensional navigation for a node, comprising: an inertial measurement unit for providing gyroscope, acceleration and velocity information (collectively, IMU information); a ranging unit for providing distance information relative to at least one reference node; at least one visual sensor for providing images of an environment surrounding the node; a preprocessor, coupled to the inertial measurement unit, the ranging unit and the at least one visual sensor, for generating error states for the IMU information, the distance information and the images; and an error-state predictive filter, coupled to the preprocessor, for processing the error states to produce a three-dimensional pose of the node.
Abstract:
An apparatus for providing three-dimensional pose, comprising: monocular visual sensors for providing images of an environment surrounding the apparatus; an inertial measurement unit (IMU) for providing gyroscope, acceleration and velocity information (collectively, IMU information); a feature tracking module for generating feature tracking information for the images; and an error-state filter, coupled to the feature tracking module, the IMU and the monocular visual sensors, for correcting the IMU information and producing a pose estimate based on at least one error-state model chosen according to the sensed images, the IMU information and the feature tracking information.
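The error-state idea in the abstract above can be illustrated with a deliberately small sketch. It is not the patented filter: it assumes a one-dimensional state (position and velocity), a constant accelerometer bias, and a position fix standing in for the vision-derived measurement. The class name `ErrorStateFilter1D`, the helper `run_demo`, and all noise values are hypothetical choices made for this illustration.

```python
class ErrorStateFilter1D:
    """Toy 1D error-state Kalman filter (illustrative assumption, not the patent's model).

    The nominal state (p, v) is propagated by integrating IMU acceleration.
    The filter tracks the covariance of the error state [dp, dv]; each
    position measurement yields an error estimate that is folded back into
    the nominal state, after which the error state is implicitly reset to zero.
    """

    def __init__(self, accel_sigma=1.0, meas_sigma=0.1):
        self.p = 0.0                       # nominal position
        self.v = 0.0                       # nominal velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # error-state covariance
        self.accel_var = accel_sigma ** 2
        self.meas_var = meas_sigma ** 2

    def propagate(self, accel, dt):
        # Integrate the (possibly biased) IMU acceleration.
        self.p += self.v * dt + 0.5 * accel * dt * dt
        self.v += accel * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = diag(0, q).
        (a, b), (c, d) = self.P
        q = self.accel_var * dt * dt
        self.P = [[a + dt * (b + c) + dt * dt * d, b + dt * d],
                  [c + dt * d, d + q]]

    def correct(self, measured_position):
        # Measurement model H = [1, 0]: only position is observed.
        residual = measured_position - self.p
        (a, b), (c, d) = self.P
        s = a + self.meas_var              # innovation variance
        k0, k1 = a / s, c / s              # Kalman gain
        # Inject the estimated error into the nominal state.
        self.p += k0 * residual
        self.v += k1 * residual
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


def run_demo(steps=100, dt=0.1, accel_bias=0.1):
    """Stationary platform with a biased accelerometer; true position is 0."""
    fused = ErrorStateFilter1D()
    dead_reckoned = ErrorStateFilter1D()
    for _ in range(steps):
        fused.propagate(accel_bias, dt)
        dead_reckoned.propagate(accel_bias, dt)
        fused.correct(0.0)                 # stand-in for a vision-based fix
    return dead_reckoned.p, fused.p
```

Calling `run_demo()` returns the dead-reckoned and the corrected position after 10 simulated seconds. Under these assumptions the uncorrected estimate drifts by about 5 m, while the corrected one stays close to the true position of 0.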
Abstract:
The present invention provides a system and method for rapid, real-time capture, annotation and creation of an annotated hyper-video map of an environment. The method includes processing video, audio and GPS data to create the hyper-video map, which is further enhanced with textual, audio and hyperlink annotations that enable the user to see, hear, and operate in the environment with cognitive awareness. The annotated hyper-video map thus provides a seamlessly navigable, indexable, high-fidelity immersive visualization of the environment that supports situational awareness.
Abstract:
A method for estimating pose from a sequence of images, which includes the steps of detecting at least three feature points in both the left image and right image of a first pair of stereo images at a first point in time; matching the at least three feature points in the left image to the at least three feature points in the right image to obtain at least three two-dimensional feature correspondences; calculating the three-dimensional coordinates of the at least three two-dimensional feature correspondences to obtain at least three three-dimensional reference feature points; tracking the at least three feature points in one of the left image and right image of a second pair of stereo images at a second point in time different from the first point in time to obtain at least three two-dimensional reference feature points; and calculating a pose based on the at least three three-dimensional reference feature points and their corresponding two-dimensional reference feature points in the stereo images. The pose is found by minimizing projection residuals of a set of three-dimensional reference feature points in an image plane.
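The two computational steps in the abstract above, triangulating matched stereo features and scoring a candidate pose by its projection residuals, can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method: the stereo pair is assumed rectified with a pinhole model, the pose is restricted to a yaw rotation plus translation, and only the cost function is shown, not the optimizer that minimizes it. The function names and the intrinsics `f`, `cx`, `cy`, `baseline` are hypothetical.

```python
import math


def triangulate_rectified(ul, vl, ur, f, cx, cy, baseline):
    """3D point in the left-camera frame from one rectified stereo match.

    (ul, vl) is the feature in the left image and ur its column in the
    right image; depth follows from the disparity ul - ur.
    """
    disparity = ul - ur
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = f * baseline / disparity
    return ((ul - cx) * z / f, (vl - cy) * z / f, z)


def project(point, yaw, t, f, cx, cy):
    """Project a 3D reference point into a camera at pose (yaw, t).

    The pose maps reference-frame points into the new camera frame as
    p' = R_y(yaw) p + t (rotation about the vertical axis only).
    """
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    xc = c * x + s * z + t[0]
    yc = y + t[1]
    zc = -s * x + c * z + t[2]
    return (f * xc / zc + cx, f * yc / zc + cy)


def reprojection_cost(points3d, observations, yaw, t, f, cx, cy):
    """Sum of squared projection residuals in the image plane."""
    total = 0.0
    for p, (u, v) in zip(points3d, observations):
        pu, pv = project(p, yaw, t, f, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

A pose estimator would search over `yaw` and `t` (for example with Gauss-Newton inside a robust sampling loop) for the values that minimize `reprojection_cost`. With noise-free observations the cost is zero at the true pose and positive elsewhere.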
Abstract:
A method for detecting a moving target is disclosed that receives a plurality of images from at least one camera; receives a measurement of scale from one of a measurement device and a second camera; calculates the pose of the at least one camera over time based on the plurality of images and the measurement of scale; selects a reference image and an inspection image from the plurality of images of the at least one camera; detects a moving target from the reference image and the inspection image based on the orientation of corresponding portions in the reference image and the inspection image relative to a location of an epipolar direction common to the reference image and the inspection image; and displays any detected moving target on a display. The measurement of scale can be derived from a second camera or, for example, a wheel odometer. The method can also detect moving targets by combining the above epipolar method with a method based on changes in depth between the inspection image and the reference image and based on changes in flow between the inspection image and the reference image.
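The epipolar test described above can be illustrated with a minimal sketch. It assumes that camera rotation between the reference and inspection images has already been compensated, so that static scene points move along lines through the epipole, and that the epipole location is known from the estimated camera pose. The function names and the thresholds `max_angle_deg` and `min_flow_px` are hypothetical values chosen for this illustration.

```python
import math


def epipolar_angle_deg(ref_pt, insp_pt, epipole):
    """Angle in degrees, within [0, 90], between a point's image motion and
    the epipolar line through its reference position."""
    fx, fy = insp_pt[0] - ref_pt[0], insp_pt[1] - ref_pt[1]
    ex, ey = ref_pt[0] - epipole[0], ref_pt[1] - epipole[1]
    flow_norm = math.hypot(fx, fy)
    epi_norm = math.hypot(ex, ey)
    if flow_norm == 0.0 or epi_norm == 0.0:
        return 0.0  # no motion, or point at the epipole: no evidence
    cos_angle = abs(fx * ex + fy * ey) / (flow_norm * epi_norm)
    return math.degrees(math.acos(min(1.0, cos_angle)))


def is_moving(ref_pt, insp_pt, epipole, max_angle_deg=10.0, min_flow_px=1.0):
    """Flag a correspondence whose motion deviates from the epipolar direction."""
    flow = math.hypot(insp_pt[0] - ref_pt[0], insp_pt[1] - ref_pt[1])
    if flow < min_flow_px:
        return False  # too little motion to judge its direction
    return epipolar_angle_deg(ref_pt, insp_pt, epipole) > max_angle_deg
```

A target that moves along its own epipolar line passes this test undetected, which is consistent with the abstract's note that the epipolar method can be combined with cues based on changes in depth and flow.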
Abstract:
The present invention relates to a system and method for detecting one or more targets belonging to a first class (e.g., moving and/or stationary people) from a moving platform in a 3D-rich environment. The framework described here is implemented using a number of monocular or stereo cameras distributed around the vehicle to provide 360-degree coverage. Furthermore, the framework utilizes numerous filters to reduce the number of false-positive identifications of the targets.