摘要:
Improvements are provided to effectively assess a user's face and head pose such that a computer or like device can track the user's attention towards a display device(s). Then the region of the display or graphical user interface that the user is turned towards can be automatically selected without requiring the user to provide further inputs. A frontal face detector is applied to detect the user's frontal face and then key facial points such as left/right eye center, left/right mouth corner, nose tip, etc., are detected by component detectors. The system then tracks the user's head by an image tracker and determines yaw, tilt and roll angle and other pose information of the user's head through a coarse to fine process according to key facial points and/or confidence outputs by pose estimator.
摘要:
A method and system for generating clusters of images for a search result of an image query is provided. When an original image query is received, the search system identifies text associated with the original image query by submitting the original image query to a search engine. The search system identifies phrases from the text of the web page containing the search result. The search system uses each of the identified phrases as an image query and submits the image queries to an image search engine. The search system considers the image search result for each image query to represent a cluster of related images. The search system then presents the clusters of images as the images of the image search result of the original image query.
摘要:
Systems and methods for shape registration are described. In one aspect, training shape vectors are generated from images in an image database. The training shape vectors identify landmark points associated with one or more object types. A distribution of shape in the training shape vectors is represented as a prior of tangent shape in tangent shape space. The prior of tangent shape is then incorporated into a unified Bayesian framework for shape registration.
摘要:
A method for the joint optimization of language model performance and size is presented comprising developing a language model from a tuning set of information, segmenting at least a subset of a received textual corpus and calculating a perplexity value for each segment and refining the language model with one or more segments of the received corpus based, at least in part, on the calculated perplexity value for the one or more segments.
摘要:
Systems and methods for annotating a face in a digital image are described. In one aspect, a probability model is trained by mapping one or more sets of sample facial features to corresponding names of individuals. A face from an input data set of at least one the digital image is then detected. Facial features are then automatically extracted from the detected face. A similarity measure is them modeled as a posterior probability that the facial features match a particular set of features identified in the probability model. The similarity measure is statistically learned. A name is then inferred as a function of the similarity measure. The face is then annotated with the name.
摘要:
A method and system for propagating the relevance of labeled documents to a query to unlabeled documents is provided. The propagation system provides training data that includes queries, documents labeled with their relevance to the queries, and unlabeled documents. The propagation system then calculates the similarity between pairs of documents in the training data. The propagation system then propagates the relevance of the labeled documents to similar, but unlabeled, documents. The propagation system may iteratively propagate labels of the documents until the labels converge on a solution. The training data with the propagated relevances can then be used to train a ranking function.
摘要:
A method and apparatus are provided for adapting a language model to a task-specific domain. Under the method and apparatus, the relative frequency of n-grams in a small training set (i.e. task-specific training data set) and the relative frequency of n-grams in a large training set (i.e. out-of-domain training data set) are used to weight a distribution count of n-grams in the large training set. The weighted distributions are then used to form a modified language model by identifying probabilities for n-grams from the weighted distributions.
摘要:
Text features corresponding to pieces of media content (e.g., images, audio, multimedia content, etc.) are extracted from media content sources. One or more text features (e.g., one or more words) for a piece of media content are extracted from text associated with the piece of media content and text feature vectors generated therefrom and used during subsequent searching. Additional low-level feature vectors may also be extracted from the piece of media content and used during the subsequent searching. Relevance feedback can also be received from a user(s) identifying the relevance of pieces of media content rendered to the user in response to his or her search request. The relevance feedback is logged and can be used in determining how to respond to subsequent search requests, such as by modifying feature vectors (e.g., text feature vectors) corresponding to the pieces of media content for which relevance feedback is received.
摘要:
Systems and methods for learning-based automatic commercial content detection are described. In one aspect, program data is divided into multiple segments. The segments are analyzed to determine visual, audio, and context-based feature sets that differentiate commercial content from non-commercial content. The context-based features are a function of single-side left and/or right neighborhoods of segments of the multiple segments.
摘要:
A process for comparing two digital images is described. The process includes comparing texture moment data for the two images to provide a similarity index, combining the similarity index with other data to provide a similarity value and determining that the two images match when the similarity value exceeds a first threshold value.