Abstract:
Techniques are disclosed that involve face detection. For instance, face detection tasks may be decomposed into sets of one or more sub-tasks. In turn, the sub-tasks of the sets may be allocated across multiple image frames. This allocation may be based on a multiple-layer, quad-tree approach. In addition, face tracking tasks may be performed.
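The abstract does not say how the quad-tree allocation works, so the following is only a minimal sketch of one possible reading. It assumes that layer k splits the frame into 2^k by 2^k cells, that each cell is one detection sub-task, and that sub-tasks are spread over frames round-robin. The function names `quadtree_cells` and `allocate_across_frames` are hypothetical.

```python
def quadtree_cells(width, height, layers):
    """Return (layer, x, y, w, h) cells; layer k is assumed to split the
    frame into 2**k by 2**k cells (an illustrative assumption)."""
    cells = []
    for layer in range(layers):
        n = 2 ** layer
        for row in range(n):
            for col in range(n):
                x0, x1 = col * width // n, (col + 1) * width // n
                y0, y1 = row * height // n, (row + 1) * height // n
                cells.append((layer, x0, y0, x1 - x0, y1 - y0))
    return cells


def allocate_across_frames(cells, num_frames):
    """Round-robin allocation (assumed policy): frame i handles
    cells i, i + num_frames, i + 2 * num_frames, and so on."""
    schedule = [[] for _ in range(num_frames)]
    for i, cell in enumerate(cells):
        schedule[i % num_frames].append(cell)
    return schedule


cells = quadtree_cells(640, 480, layers=3)
schedule = allocate_across_frames(cells, num_frames=4)
```

Spreading the cells this way bounds the detection work done on any single frame, at the cost of a face taking several frames to be found.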
Abstract:
Detecting facial landmarks in a face detected in an image may be performed by first cropping a face rectangle region of the detected face in the image and generating an integral image based at least in part on the face rectangle region. Next, a cascade classifier may be executed for each facial landmark of the face rectangle region to produce one response image for each facial landmark based at least in part on the integral image. A plurality of Active Shape Model (ASM) initializations may be set up. ASM searching may be performed for each of the ASM initializations based at least in part on the response images, each ASM search resulting in a search result having a cost. Finally, the search result having the lowest cost may be selected from the ASM searches, the selected search result indicating locations of the facial landmarks in the image.
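Two steps of this pipeline are easy to illustrate in isolation: building the integral image and picking the lowest-cost search result. The sketch below uses the standard summed-area-table construction. The cascade classifier and the ASM search themselves are not shown, and the dictionary layout of a search result (`"cost"`, `"landmarks"`) is an assumption made for illustration.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros, so that
    ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), using
    four table lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def select_best(results):
    """Pick the search result with the lowest cost (hypothetical layout:
    each result is a dict with a 'cost' key)."""
    return min(results, key=lambda r: r["cost"])


face = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(face)
best = select_best([
    {"cost": 3.2, "landmarks": "init A"},
    {"cost": 1.7, "landmarks": "init B"},
])
```

The integral image is what makes the cascade classifier cheap: any rectangular feature sum costs four lookups regardless of its size.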
Abstract:
Methods, apparatuses, and articles associated with facial tracking and recognition are disclosed. In embodiments, facial images may be detected in video or still images and tracked. After normalization of the facial images, feature data may be extracted from selected regions of the faces to compare to associated feature data in known faces. The selected regions may be determined using a boosting machine learning process over a set of known images. After extraction, individual two-class comparisons may be performed between corresponding feature data from regions on the tested facial image and from the known facial image. The individual two-class classifications may then be combined to determine a similarity score for the tested face and the known face. If the similarity score exceeds a threshold, an identification of the known face may be output or otherwise used. Additionally, tracking with voting may be performed on faces detected in video. After a threshold of votes is reached, a given tracked face may be associated with a known face.
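The combination and voting steps can be sketched as follows. The abstract does not specify how the two-class results are combined, so a weighted fraction of "same person" decisions is assumed here. The function names and the per-region weights are hypothetical.

```python
from collections import Counter


def similarity_score(region_decisions, weights):
    """Combine per-region two-class decisions (True = 'same person') into
    one score in [0, 1]. A weighted vote is an assumed combination rule."""
    total = sum(weights)
    agreeing = sum(w for same, w in zip(region_decisions, weights) if same)
    return agreeing / total


def identify(score, threshold, known_name):
    """Output the known identity only if the score exceeds the threshold."""
    return known_name if score > threshold else None


def vote_identity(per_frame_ids, min_votes):
    """Tracking with voting: associate a tracked face with a known face
    once that identity has collected at least min_votes frame votes."""
    counts = Counter(i for i in per_frame_ids if i is not None)
    if not counts:
        return None
    name, votes = counts.most_common(1)[0]
    return name if votes >= min_votes else None


score = similarity_score([True, False, True], weights=[2, 1, 1])
frame_ids = [identify(score, 0.6, "alice"), None, "alice", "bob", "alice"]
tracked = vote_identity(frame_ids, min_votes=3)
```

Voting over frames smooths out single-frame misclassifications, since one bad frame cannot by itself change the identity attached to a track.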
Abstract:
Apparatuses, methods, and storage media associated with processing an image are disclosed herein. In embodiments, a method for processing one or more images may include generating a plurality of pairs of keypoint features for a pair of images. Each pair of keypoint features may include a keypoint feature from each image. Further, for each pair of keypoint features, corresponding adjoin features may be generated. Additionally, for each pair of keypoint features, whether the adjoin features are similar may be determined. Whether the pair of images have at least one similar object may also be determined, based at least in part on a result of the determination of similarity between the corresponding adjoin features. Other embodiments may be described and claimed.
Abstract:
Video editing methods and systems, including methods and systems to identify video clips having similar visual characteristics. Video clips may correspond to first and second videos, which may include a professional music video and a personal video, respectively. Identified video clips of the personal video may be combined into a new video clip, and music corresponding to visually similar video clips of the music video may be associated with the corresponding video clips of the new video. Video frames of the video clips may be characterized with respect to one or more visual features, which may include one or more of facial and/or body features, salient objects, camera motion, and image quality. Characterizations may be compared between video clips on an incremental basis. Characterization of a music video may implicitly model an underlying correlation between music rhythm and changes in visual appearance.
Abstract:
Machine-readable media, methods, apparatus and system for obtaining and processing image features are described. In some embodiments, groups of training features derived from regions of training images may be trained to obtain a plurality of classifiers, each classifier corresponding to one group of training features. The plurality of classifiers may be used to classify groups of validation features derived from regions of validation images to obtain a plurality of weights, wherein each weight corresponds to one region of the validation images and indicates how important that region is. Then, a weight may be discarded from the plurality of weights based upon a certain criterion.
Abstract:
A method for generating a Markov stationary color (MSC) descriptor is disclosed. The MSC descriptor may be used for image/video content representation, which characterizes both intra-color and inter-color spatial relationships in images. The MSC descriptor has a low storage requirement, relative to some other color descriptors.
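The abstract does not describe how the descriptor is built. The sketch below shows one plausible construction, assumed here for illustration only: count co-occurrences of quantized colors at adjacent pixels, normalize the rows into a Markov transition matrix, and concatenate an initial distribution (from same-color co-occurrences) with the chain's stationary distribution. All function names are hypothetical, and the input is assumed to be an image already quantized to color indices.

```python
def cooccurrence(labels, num_colors):
    """Symmetric counts of color pairs at horizontally or vertically
    adjacent pixels (the neighborhood choice is an assumption)."""
    c = [[0] * num_colors for _ in range(num_colors)]
    h, w = len(labels), len(labels[0])
    for y in range(h):
        for x in range(w):
            a = labels[y][x]
            for dy, dx in ((0, 1), (1, 0)):
                yy, xx = y + dy, x + dx
                if yy < h and xx < w:
                    b = labels[yy][xx]
                    c[a][b] += 1
                    c[b][a] += 1
    return c


def transition_matrix(c):
    """Row-normalize counts; colors that never occur get a uniform row."""
    n = len(c)
    rows = []
    for row in c:
        s = sum(row)
        rows.append([v / s for v in row] if s else [1.0 / n] * n)
    return rows


def stationary(p, iters=200):
    """Approximate the stationary distribution by repeated pi <- pi P."""
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


def msc_descriptor(labels, num_colors):
    """Initial distribution (intra-color) followed by the stationary
    distribution (inter-color): 2 * num_colors values in total."""
    c = cooccurrence(labels, num_colors)
    diag = [c[i][i] for i in range(num_colors)]
    total = sum(diag)
    initial = [d / total for d in diag] if total else [1.0 / num_colors] * num_colors
    return initial + stationary(transition_matrix(c))


descriptor = msc_descriptor([[0, 0, 1], [0, 1, 1]], num_colors=2)
```

Under this assumed construction the descriptor holds 2K values for K colors, versus K squared for the full co-occurrence matrix, which would be consistent with the low storage requirement the abstract mentions.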
Abstract:
Machine-readable media, methods, apparatus and system for obtaining and processing image features are described. In some embodiments, a Gabor representation of an image may be obtained by using a Gabor filter. A region may be determined from the Gabor representation, wherein the region comprises a plurality of Gabor pixels of the Gabor representation; and, a sub-region may be determined from the region, wherein the sub-region comprises more than one of the plurality of Gabor pixels. Then, a Gabor feature may be calculated based upon a magnitude calculation related to the sub-region and the region.
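As a rough illustration, the sketch below filters an image with a complex Gabor kernel (the standard Gaussian-envelope-times-complex-sinusoid form), takes the magnitude at each pixel, and computes a feature from a sub-region and its enclosing region. The abstract only says the feature is based on "a magnitude calculation related to the sub-region and the region", so the specific ratio used here is an assumption, as are all function names and the `(x, y, w, h)` rectangle layout.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Complex Gabor kernel of odd side length `size`."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2 * sigma * sigma))
            row.append(envelope * cmath.exp(1j * 2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_magnitudes(img, kernel):
    """Magnitude of the filter response at each pixel (sliding-window
    correlation with zero padding at the borders)."""
    h, w = len(img), len(img[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0j
            for ky, krow in enumerate(kernel):
                for kx, k in enumerate(krow):
                    yy, xx = y + ky - half, x + kx - half
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += img[yy][xx] * k
            out[y][x] = abs(acc)
    return out


def region_sum(mag, x, y, w, h):
    """Sum of magnitudes over a rectangle given as (x, y, w, h)."""
    return sum(mag[r][c] for r in range(y, y + h) for c in range(x, x + w))


def gabor_feature(mag, region, sub_region):
    """Assumed feature: sub-region magnitude as a fraction of the
    enclosing region's magnitude."""
    total = region_sum(mag, *region)
    return region_sum(mag, *sub_region) / total if total else 0.0


img = [[float((x * y) % 4 + 1) for x in range(5)] for y in range(5)]
mag = gabor_magnitudes(img, gabor_kernel(3, wavelength=4.0, theta=0.0, sigma=1.0))
feature = gabor_feature(mag, region=(0, 0, 4, 4), sub_region=(1, 1, 2, 2))
```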
Abstract:
Apparatuses, systems, and computer program products that detect and/or index characters of videos are disclosed. One or more embodiments comprise an apparatus having a feature extraction module and a cast indexing module. The feature extraction module may extract features of a scale invariant feature transform (SIFT) for face sets of a video and the cast indexing module may detect one or more characters of the video via one or more associations of clusters of the features. Some alternative embodiments may include a cast ranking module to sort characters of the video, considering such factors as appearance times of the characters, appearance frequencies of the characters, and page rankings of the characters. The apparatus may associate or partition the clusters based on a normalized cut process, as well as detect the characters based on measures of distances of nodes associated with the features. Numerous embodiments may detect the characters based upon partitioning the clusters via solutions for eigenvalue systems for matrices of nodes of the clusters.
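The link between normalized cut and an eigenvalue system can be sketched with the usual spectral relaxation: the partition is read from the signs of the second eigenvector of the normalized affinity matrix D^(-1/2) W D^(-1/2). The code below is a generic illustration of that relaxation, not the method claimed in the patent. It assumes a symmetric affinity matrix with positive node degrees, and it uses plain power iteration instead of a library eigensolver. The function name is hypothetical.

```python
import math


def normalized_cut_bipartition(w, iters=200):
    """Split nodes into two groups from a symmetric affinity matrix w.

    Power iteration on (I + D^-1/2 W D^-1/2) / 2, whose eigenvalues lie
    in [0, 1], while projecting out the known top eigenvector D^1/2 * 1.
    Assumes every node has positive degree.
    """
    n = len(w)
    d = [sum(row) for row in w]
    inv_sqrt = [1.0 / math.sqrt(di) for di in d]
    m = [[0.5 * ((1.0 if i == j else 0.0) + inv_sqrt[i] * w[i][j] * inv_sqrt[j])
          for j in range(n)] for i in range(n)]
    top = [math.sqrt(di) for di in d]
    norm = math.sqrt(sum(t * t for t in top))
    top = [t / norm for t in top]
    v = [float(i) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        v = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # Map back to the generalized eigenvector and split by sign.
    return [0 if inv_sqrt[i] * v[i] < 0 else 1 for i in range(n)]


# Two tight groups of three face-feature nodes with weak links between them.
strong, weak = 1.0, 0.01
affinity = [[0.0 if i == j else (strong if (i < 3) == (j < 3) else weak)
             for j in range(6)] for i in range(6)]
labels = normalized_cut_bipartition(affinity)
```

Larger partitions would typically be obtained by applying the bipartition recursively to each side, stopping when the cut cost becomes too high.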
Abstract:
Methods and apparatus are disclosed to index digital frames. An example method includes identifying channel types associated with a plurality of image frames, splitting each one of the plurality of image frames into a respective color channel based on the identified channel types, applying a local binary pattern to each of the respective color channels to generate a respective pattern number, generating a spatial representation of each respective pattern number to determine transition probabilities for each channel type, and identifying a degree of similarity between the plurality of image frames based on the transition probabilities.
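The local binary pattern step and a simple reading of the transition probabilities can be sketched as follows. The 8-neighbor, radius-1 LBP is the common basic form. Estimating transitions from horizontally adjacent pattern numbers is an assumption, since the abstract does not define its "spatial representation". The function names are hypothetical, and each channel is assumed to be a 2-D list of intensities.

```python
def lbp_code(channel, y, x):
    """Basic 8-neighbor LBP: set a bit for each neighbor whose value is
    greater than or equal to the center pixel."""
    center = channel[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if channel[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_image(channel):
    """Pattern numbers for all interior pixels of one color channel."""
    h, w = len(channel), len(channel[0])
    return [[lbp_code(channel, y, x) for x in range(1, w - 1)]
            for y in range(1, h - 1)]


def transition_probabilities(codes):
    """Estimate P(next pattern | current pattern) from horizontally
    adjacent pattern numbers (assumed neighborhood)."""
    counts = {}
    for row in codes:
        for a, b in zip(row, row[1:]):
            counts.setdefault(a, {}).setdefault(b, 0)
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}


channel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
codes = lbp_image(channel)
flat = [[5] * 4 for _ in range(4)]
probs = transition_probabilities(lbp_image(flat))
```

Two frames could then be compared by a distance between their per-channel transition tables, although the abstract leaves the similarity measure unspecified.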