Abstract:
Improvements are provided for effectively assessing a user's face and head pose so that a computer or similar device can track the user's attention toward one or more display devices. The region of the display or graphical user interface that the user is turned toward can then be selected automatically, without requiring further input from the user. A frontal face detector is applied to detect the user's frontal face, and key facial points such as the left/right eye centers, left/right mouth corners, and nose tip are then detected by component detectors. The system then tracks the user's head with an image tracker and determines the yaw, tilt, and roll angles and other pose information of the user's head through a coarse-to-fine process based on the key facial points and/or confidence outputs from a pose estimator.
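As a rough illustration of how key facial points can feed a pose estimate, the sketch below derives only a coarse in-plane roll angle from the two eye centers. It is not the coarse-to-fine process of the abstract; the function name and the image coordinate convention (x to the right, y downward) are assumptions made for this example.

```python
import math

def coarse_roll_degrees(left_eye, right_eye):
    """Estimate in-plane head roll from two eye centers (hypothetical helper).

    Each eye is an (x, y) pixel coordinate, with x growing to the right and
    y growing downward. The result is the angle of the line joining the eyes
    relative to the image's horizontal axis, in degrees.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    if dx == 0 and dy == 0:
        raise ValueError("eye centers must be distinct points")
    return math.degrees(math.atan2(dy, dx))
```

A level pair of eyes gives a roll of 0 degrees. Yaw and tilt cannot be recovered from two points alone and would need additional facial points or a learned estimator.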
Abstract:
Systems and methods for shape registration are described. In one aspect, training shape vectors are generated from images in an image database. The training shape vectors identify landmark points associated with one or more object types. A distribution of shape in the training shape vectors is represented as a prior of tangent shape in tangent shape space. The prior of tangent shape is then incorporated into a unified Bayesian framework for shape registration.
Abstract:
Systems and methods for annotating a face in a digital image are described. In one aspect, a probability model is trained by mapping one or more sets of sample facial features to corresponding names of individuals. A face is then detected in an input data set of at least one digital image. Facial features are then automatically extracted from the detected face. A similarity measure is then modeled as a posterior probability that the facial features match a particular set of features identified in the probability model. The similarity measure is statistically learned. A name is then inferred as a function of the similarity measure. The face is then annotated with the name.
Abstract:
Text features corresponding to pieces of media content (e.g., images, audio, multimedia content, etc.) are extracted from media content sources. One or more text features (e.g., one or more words) for a piece of media content are extracted from text associated with the piece of media content and text feature vectors generated therefrom and used during subsequent searching. Additional low-level feature vectors may also be extracted from the piece of media content and used during the subsequent searching. Relevance feedback can also be received from a user(s) identifying the relevance of pieces of media content rendered to the user in response to his or her search request. The relevance feedback is logged and can be used in determining how to respond to subsequent search requests, such as by modifying feature vectors (e.g., text feature vectors) corresponding to the pieces of media content for which relevance feedback is received.
Abstract:
A process for comparing two digital images is described. The process includes comparing texture moment data for the two images to provide a similarity index, combining the similarity index with other data to provide a similarity value and determining that the two images match when the similarity value exceeds a first threshold value.
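The comparison steps can be sketched as follows. The abstract does not say how the similarity index is computed or how it is combined with other data, so the inverse-distance index, the weighted average, and every name and default value below are assumptions chosen only to make the flow concrete.

```python
import math

def images_match(texture_a, texture_b, other_similarity,
                 weight=0.5, threshold=0.8):
    """Decide whether two images match (illustrative sketch only).

    texture_a, texture_b: equal-length sequences of texture moment values.
    other_similarity: a score in [0, 1] from some other cue (assumed input).
    weight, threshold: hypothetical tuning parameters.
    """
    if len(texture_a) != len(texture_b):
        raise ValueError("texture moment vectors must have equal length")
    # Assumed similarity index: 1 for identical moments, approaching 0
    # as the Euclidean distance between the moment vectors grows.
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(texture_a, texture_b)))
    similarity_index = 1.0 / (1.0 + distance)
    # Assumed combination rule: weighted average with the other score.
    similarity_value = weight * similarity_index + (1.0 - weight) * other_similarity
    # The images match when the combined value exceeds the first threshold.
    return similarity_value > threshold
```

For example, `images_match([0.2, 0.4], [0.2, 0.4], 1.0)` reports a match, because both the texture index and the other score are at their maximum.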
Abstract:
Face detection techniques are provided that use a multiple-stage face detection algorithm. An exemplary three-stage algorithm includes a first stage that applies linear-filtering to enhance detection performance by removing many non-face-like portions within an image, a second stage that uses a boosting chain that is adopted to combine boosting classifiers within a hierarchy “chain” structure, and a third stage that performs post-filtering using image pre-processing, SVM-filtering and color-filtering to refine the final face detection prediction. In certain further implementations, the face detection techniques include a two-level hierarchy in-plane pose estimator to provide a rapid multi-view face detector that further improves the accuracy and robustness of face detection.
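The three-stage structure can be sketched as a pipeline with early rejection. This is a minimal illustration, not the patented detector: the linear pre-filter is reduced to a single dot product, the boosting chain is modeled as nodes whose accumulated score is carried forward, and the post-filters are arbitrary predicates. All names and signatures are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Features = Sequence[float]
WeakClassifier = Callable[[Features], float]
# A chain node: (its weak classifiers, its rejection threshold).
ChainNode = Tuple[Sequence[WeakClassifier], float]

def linear_prefilter(x: Features, weights: Features, bias: float) -> bool:
    """Stage 1: cheap linear test that discards clearly non-face windows."""
    return sum(w * v for w, v in zip(weights, x)) + bias >= 0.0

def boosting_chain(x: Features, nodes: Sequence[ChainNode]) -> bool:
    """Stage 2: each node adds to the score inherited from earlier nodes."""
    score = 0.0
    for weak_classifiers, threshold in nodes:
        score += sum(clf(x) for clf in weak_classifiers)
        if score < threshold:
            return False  # rejected early; later nodes are never evaluated
    return True

def detect_face(x: Features, weights: Features, bias: float,
                nodes: Sequence[ChainNode],
                post_filters: Sequence[Callable[[Features], bool]]) -> bool:
    """Run the three stages in order, stopping at the first rejection."""
    if not linear_prefilter(x, weights, bias):
        return False
    if not boosting_chain(x, nodes):
        return False
    # Stage 3: every post-filter (e.g. an SVM or color check) must accept.
    return all(f(x) for f in post_filters)
```

Ordering the stages from cheapest to most expensive means most candidate windows are rejected before the costlier classifiers run.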
Abstract:
The disclosed subject matter improves iterative results of content-based image retrieval (CBIR) using a bigram model to correlate relevance feedback. Specifically, multiple images are received responsive to multiple image search sessions. Relevance feedback is used to determine whether the received images are semantically relevant. A respective semantic correlation between each of at least one pair of the images is then estimated using respective bigram frequencies. The bigram frequencies are based on multiple search sessions in which each image of a pair of images is semantically relevant.
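A minimal sketch of the bigram idea follows, assuming each search session is logged as the set of image identifiers the user marked relevant. The bigram frequency of a pair is the number of sessions in which both images were relevant. The normalization used here (dividing by the number of sessions in which either image was relevant) is an assumption for illustration and is not stated in the abstract.

```python
from collections import Counter
from itertools import combinations

def bigram_frequencies(sessions):
    """Count, for every image pair, the sessions where both were relevant."""
    counts = Counter()
    for relevant in sessions:
        # Sort so each unordered pair maps to a single key.
        for a, b in combinations(sorted(set(relevant)), 2):
            counts[(a, b)] += 1
    return counts

def semantic_correlation(sessions, img_a, img_b):
    """Estimate a correlation in [0, 1] between two images (assumed formula)."""
    if img_a == img_b:
        return 1.0
    key = tuple(sorted((img_a, img_b)))
    both = bigram_frequencies(sessions)[key]
    either = sum(1 for s in sessions if img_a in s or img_b in s)
    return both / either if either else 0.0
```

With sessions `[{"a", "b"}, {"a", "b", "c"}, {"a"}, {"c"}]`, images "a" and "b" are jointly relevant in two of the three sessions that mention either one, so their estimated correlation is 2/3.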