摘要:
Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
摘要:
Systems and methods for bipartite graph reinforcement modeling to annotate web images are described. In one aspect the systems and methods implement bipartite graph reinforcement modeling operations to identify a set of annotations that are relevant to a Web image. The systems and methods annotate the Web image with the identified annotations. The systems and methods then index the annotated Web image. Responsive to receiving an image search query from a user, wherein the image search query comprises information relevant to at least a subset of the identified annotations, the image search engine service presents the annotated Web image to the user.
摘要:
Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
摘要:
Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
摘要:
Statistical approaches to large-scale image annotation are described. Generally, the annotation technique includes compiling visual features and textual information from a number of images, hashing the images visual features, and clustering the images based on their hash values. An example system builds statistical language models from the clustered images and annotates the image by applying one of the statistical language models.
摘要:
Systems and methods for bipartite graph reinforcement modeling to annotate web images are described. In one aspect the systems and methods implement bipartite graph reinforcement modeling operations to identify a set of annotations that are relevant to a Web image. The systems and methods annotate the Web image with the identified annotations. The systems and methods then index the annotated Web image. Responsive to receiving an image search query from a user, wherein the image search query comprises information relevant to at least a subset of the identified annotations, the image search engine service presents the annotated Web image to the user.
摘要:
Word correlations are estimated using a content-based method, which uses visual features of image representations of the words. The image representations of the subject words may be generated by retrieving images from data sources (such as the Internet) using image search with the subject words as query words. One aspect of the techniques is based on calculating the visual distance or visual similarity between the sets of retrieved images corresponding to each query word. The other is based on calculating the visual consistence among the set of the retrieved images corresponding to a conjunctive query word. The combination of the content-based method and a text-based method may produce even better result.
摘要:
An advertisement image classification system trains a binary classifier to classify images as advertisement images or non-advertisement images and then uses the binary classifier to classify images of web pages as advertisement images or non-advertisement images. During a training phase, the classification system generates training data of feature vectors representing the images and labels indicating whether an image is an advertisement image or a non-advertisement Image. The classification system trains a binary classifier to classify Images using training data. During a classification phase, the classification system inputs a web page with an image and generates a feature vector for the image. The classification system then applies the trained binary classifier to the feature vector to generate a score indicating whether the image is an advertisement image or a non-advertisement image.
摘要:
Improvements are provided to effectively assess a user's face and head pose such that a computer or like device can track the user's attention towards a display device(s). Then the region of the display or graphical user interface that the user is turned towards can be automatically selected without requiring the user to provide further inputs. A frontal face detector is applied to detect the user's frontal face and then key facial points such as left/right eye center, left/right mouth corner, nose tip, etc., are detected by component detectors. The system then tracks the user's head by an image tracker and determines yaw, tilt and roll angle and other pose information of the user's head through a coarse to fine process according to key facial points and/or confidence outputs by pose estimator.
摘要:
Face detection techniques are provided that use a multiple-stage face detection algorithm. An exemplary three-stage algorithm includes a first stage that applies linear-filtering to enhance detection performance by removing many non-face-like portions within an image, a second stage that uses a boosting chain that is adopted to combine boosting classifiers within a hierarchy “chain” structure, and a third stage that performs post-filtering using image pre-processing, SVM-filtering and color-filtering to refine the final face detection prediction. In certain further implementations, the face detection techniques include a two-level hierarchy in-plane pose estimator to provide a rapid multi-view face detector that further improves the accuracy and robustness of face detection.