Abstract:
A method for representing images for pattern classification extends the conventional Isomap method with Fisher Linear Discriminant (FLD) or Kernel Fisher Linear Discriminant (KFLD) for classification. The extended Isomap method estimates the geodesic distances between data points corresponding to images for pattern classification, and uses the pairwise geodesic distances as feature vectors. The method applies FLD to the feature vectors to find an optimal projection direction that maximizes the distances between cluster centers of the feature vectors. The method may apply KFLD to the feature vectors instead of FLD.
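The geodesic-distance feature step described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes a symmetrized k-nearest-neighbor graph and Floyd-Warshall shortest paths, and the function name `geodesic_features` and the parameter `k` are hypothetical. The FLD or KFLD projection that would follow is not shown.

```python
import math


def geodesic_features(points, k=2):
    """Estimate pairwise geodesic distances between points.

    Row i of the returned matrix is the feature vector of point i.
    Assumption: neighborhoods come from a k-nearest-neighbor graph and
    geodesics are approximated by shortest paths in that graph.
    """
    n = len(points)
    inf = float("inf")
    euclid = [[math.dist(p, q) for q in points] for p in points]

    # Start with no edges except the zero-length self loops.
    geo = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]

    # Connect each point to its k nearest neighbors (symmetrized).
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: euclid[i][j])
        for j in others[:k]:
            geo[i][j] = geo[j][i] = euclid[i][j]

    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = geo[i][m] + geo[m][j]
                if via < geo[i][j]:
                    geo[i][j] = via
    return geo
```

For points sampled along a curved manifold, the estimated geodesic distance between the two ends exceeds their straight-line distance, which is the property the feature vectors are meant to capture. If the neighbor graph is disconnected, some entries remain infinite and `k` would need to be increased.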
Abstract:
The advantage of the present invention is that it detects an object appropriately. The object detection apparatus of the present invention has: a plurality of cameras for determining the distance to objects; a distance determination unit that determines the distance; a histogram generation unit that records the frequency of pixels against the distances to the pixels; an object distance determination unit that determines the most likely distance; a probability mapping unit that provides the probabilities of the pixels based on the difference in distance; a kernel detection unit that determines a kernel region as a group of pixels; a periphery detection unit that determines a peripheral region as a group of pixels selected from the pixels close to the kernel region; and an object specifying unit that specifies, with a predetermined probability, the object region where the object is present.
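The histogram and probability mapping steps described above might look something like the sketch below. The abstract does not specify the bin width, how the "most likely distance" is chosen, or the form of the probability function, so the fullest-bin rule, the Gaussian falloff, and the names `most_likely_distance`, `probability_map`, `bin_width`, and `sigma` are all assumptions made for illustration.

```python
import math
from collections import Counter


def most_likely_distance(distances, bin_width=0.5):
    """Histogram per-pixel distances and return the center of the fullest bin.

    Assumption: the most likely object distance is the histogram mode.
    """
    counts = Counter(int(d // bin_width) for d in distances)
    best_bin, _ = max(counts.items(), key=lambda item: item[1])
    return (best_bin + 0.5) * bin_width


def probability_map(distances, object_distance, sigma=0.5):
    """Map each pixel distance to a value in (0, 1].

    Assumption: the probability falls off as a Gaussian of the difference
    between the pixel distance and the most likely object distance.
    """
    return [
        math.exp(-((d - object_distance) ** 2) / (2 * sigma ** 2))
        for d in distances
    ]
```

A kernel region could then be taken as the pixels whose probability exceeds a high threshold, with nearby pixels above a lower threshold forming the peripheral region. That thresholding rule is also an assumption and is not stated in the abstract.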
Abstract:
A face recognition system and method project an input face image and a set of reference face images from an input space to a high dimensional feature space in order to obtain more representative features of the face images. The Kernel Fisherfaces of the input face image and the reference face images are calculated, and are used to project the input face image and the reference face images to a face image space lower in dimension than both the input space and the high dimensional feature space. The input face image and the reference face images are represented as points in the face image space, and the distances between the input face point and each of the reference image points are used to determine whether or not the input face image resembles a particular face image of the reference face images.
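The final matching step, comparing the input face point against each reference point, can be illustrated with a small nearest-neighbor sketch. It assumes the Kernel Fisherface projection has already been applied, so the inputs are low dimensional points. The Euclidean metric, the acceptance `threshold`, and the function name `closest_reference` are assumptions; the abstract does not state which distance or decision rule is used.

```python
import math


def closest_reference(input_point, reference_points, threshold):
    """Find the reference face point nearest to the input face point.

    reference_points maps a label to a point in the face image space.
    Returns (label, distance); label is None when even the nearest
    reference is farther away than the threshold.
    """
    label, point = min(
        reference_points.items(),
        key=lambda item: math.dist(input_point, item[1]),
    )
    distance = math.dist(input_point, point)
    return (label if distance <= threshold else None), distance
```

With hypothetical references such as `{"alice": (0.0, 0.0), "bob": (3.0, 4.0)}`, an input near the origin matches "alice", while an input far from both points is rejected.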
Abstract:
A system and a method are disclosed for adaptive probabilistic tracking of an object within a motion video. The method utilizes a time-varying Eigenbasis together with dynamic, observation, and inference models. The Eigenbasis serves as a model of the target object. The dynamic model represents the motion of the object and defines possible locations of the target based upon previous locations. The observation model provides a measure of the distance of an observation of the object relative to the current Eigenbasis. The inference model predicts the most likely location of the object based upon past and present observations. The method is effective with or without training samples. A computer-based system provides a means for implementing the method. The effectiveness of the system and method is demonstrated through simulation.
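One common way to measure the distance of an observation from an eigenbasis is the norm of the residual left after projecting onto the basis. The sketch below illustrates that idea under stated assumptions: the basis vectors are orthonormal, the observation is a flattened image patch, and the Gaussian likelihood with parameter `sigma` is an illustrative choice. The names `distance_to_eigenbasis` and `observation_likelihood` are hypothetical, and the abstract does not confirm that this exact measure is used.

```python
import math


def distance_to_eigenbasis(observation, mean, basis):
    """Norm of the part of (observation - mean) outside the basis span.

    Assumption: the basis vectors are orthonormal.
    """
    centered = [x - m for x, m in zip(observation, mean)]
    residual = list(centered)
    for vec in basis:
        # Coefficient of the centered observation along this basis vector.
        coeff = sum(c * v for c, v in zip(centered, vec))
        residual = [r - coeff * v for r, v in zip(residual, vec)]
    return math.sqrt(sum(r * r for r in residual))


def observation_likelihood(observation, mean, basis, sigma=1.0):
    """Turn the distance into a score in (0, 1] (Gaussian form assumed)."""
    d = distance_to_eigenbasis(observation, mean, basis)
    return math.exp(-(d * d) / (2 * sigma ** 2))
```

An observation lying inside the span of the basis has distance zero and likelihood one, and the score decays as the observation departs from the current appearance model.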
Abstract:
An online sparse matrix Gaussian process (OSMGP) uses online updates to provide an accurate and efficient regression for applications such as pose estimation and object tracking. A regression calculation module calculates a regression on a sequence of input images to generate output predictions based on a learned regression model. The regression model is efficiently updated by representing a covariance matrix of the regression model using a sparse matrix factor (e.g., a Cholesky factor). The sparse matrix factor is maintained and updated in real-time based on the output predictions. Hyperparameter optimization, variable reordering, and matrix downdating techniques can also be applied to further improve the accuracy and/or efficiency of the regression process.
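To illustrate why maintaining a Cholesky factor makes online updates cheap, the sketch below extends an existing factor when one new data point arrives, without refactorizing the whole covariance matrix. It is a dense, pure-Python illustration only: the abstract describes a sparse factor and does not say which update scheme is used, so the forward-substitution approach and the names `extend_cholesky`, `k_new`, and `kappa` are assumptions.

```python
import math


def extend_cholesky(L, k_new, kappa):
    """Extend a lower-triangular Cholesky factor by one row.

    Given L with L * L^T = K, return the factor of the matrix
    [[K, k_new], [k_new^T, kappa]], where k_new holds the covariances
    between the new point and the old points and kappa is the new
    point's own variance.
    """
    n = len(L)
    row = []
    # Forward substitution: solve L * row = k_new.
    for i in range(n):
        s = k_new[i] - sum(L[i][j] * row[j] for j in range(i))
        row.append(s / L[i][i])
    diag_sq = kappa - sum(v * v for v in row)
    if diag_sq <= 0:
        raise ValueError("extended matrix is not positive definite")
    # Pad the old rows with a zero and append the new row.
    return [r + [0.0] for r in L] + [row + [math.sqrt(diag_sq)]]
```

The update costs O(n^2) for a dense factor rather than the O(n^3) of a full refactorization. A sparse representation, as described in the abstract, would aim to reduce this further.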
Abstract:
Methods and systems are described for three-dimensional pose estimation. A training module determines a mapping function between a training image sequence and pose representations of a subject in the training image sequence. The training image sequence is represented by a set of appearance and motion patches. A set of filters is applied to the appearance and motion patches to extract features of the training images. Based on the extracted features, the training module learns a multidimensional mapping function that maps the motion and appearance patches to the pose representations of the subject. A testing module outputs a fast human pose estimation by applying the learned mapping function to a test image sequence.
Abstract:
A system and method recognize and track human motion from different motion classes. In a learning stage, a discriminative model is learned to project motion data from a high dimensional space to a low dimensional space while enforcing discriminance between motions of different motion classes in the low dimensional space. Additionally, the low dimensional data may be clustered into motion segments, and motion dynamics may be learned for each motion segment. In a tracking stage, a representation of human motion comprising at least one class of motion is received. The tracker recognizes and tracks the motion based on the learned discriminative model and the learned dynamics.
Abstract:
A system and a method model the motion of a non-rigid object using a thin plate spline (TPS) transform. A first image of a video sequence is received, and a region of interest, referred to as a template, is chosen manually or automatically. A set of arbitrarily-chosen fixed reference points is positioned on the template. A target image of the video sequence is chosen for motion estimation relative to the template. A set of pixels in the target image corresponding to the pixels of the template is determined, and this set of pixels is back-warped to match the template using a thin-plate-spline-based technique. The error between the template and the back-warped image is determined and iteratively minimized using a gradient descent technique. The TPS parameters can then be used to estimate the relative motion between the template and the corresponding region of the target image. According to one embodiment, a stiff-to-flexible approach mitigates instability that can arise when reference points lie in textureless regions, or when the initial TPS parameters are not close to the desired ones. The value of a regularization parameter is varied from a larger to a smaller value, varying the nature of the warp from stiff to flexible, so as to progressively emphasize local non-rigid deformations.
Abstract:
A visual tracker tracks an object in a sequence of input images. A tracking module detects a location of the object based on a set of weighted blocks representing the object's shape. The tracking module then refines a segmentation of the object from the background image at the detected location. Based on the refined segmentation, the set of weighted blocks is updated. By adaptively encoding appearance and shape into the block configuration, the present invention is able to efficiently and accurately track an object even in the presence of rapid motion that causes large variations in the appearance and shape of the object.