Abstract:
Content creation collection and navigation techniques and systems are described. In one example, a representative image is used by a content sharing service to interact with a collection of images provided as part of a search result. In another example, a user interface image navigation control is configured to support user navigation through images based on one or more metrics. In a further example, a user interface image navigation control is configured to support user navigation through images based on one or more metrics identified for an object selected from the image. In yet another example, collections of images are leveraged as part of content creation. In another example, data obtained from a content sharing service is leveraged to indicate suitability of images of a user for licensing as part of the service.
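As a minimal illustration of the metric-driven navigation control, the Python sketch below orders a collection by a chosen metric and steps through it. The ImageRecord and MetricNavigator names and the "downloads" metric are hypothetical; the abstract does not specify the metrics or data model used.

```python
# Hypothetical sketch: a navigation control that orders images by a metric.
from dataclasses import dataclass

@dataclass
class ImageRecord:
    path: str
    metrics: dict  # e.g. {"downloads": 120} -- assumed metric fields

class MetricNavigator:
    def __init__(self, images, metric):
        # Order the collection by the selected metric, best first.
        self.images = sorted(images,
                             key=lambda r: r.metrics.get(metric, 0),
                             reverse=True)
        self._index = 0

    def current(self):
        return self.images[self._index]

    def next(self):
        self._index = min(self._index + 1, len(self.images) - 1)
        return self.current()

    def previous(self):
        self._index = max(self._index - 1, 0)
        return self.current()

records = [ImageRecord("a.jpg", {"downloads": 10}),
           ImageRecord("b.jpg", {"downloads": 42})]
nav = MetricNavigator(records, metric="downloads")
print(nav.current().path)  # "b.jpg": navigation starts at the top-metric image
```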
Abstract:
Techniques for accelerating object detection are described. In one or more implementations, adaptive sampling techniques are used to extract features from an image. Coarse features are extracted from the image and used to generate an object probability map. Then, dense features are extracted from high-probability object regions of the image identified in the object probability map to enable detection of an object in the image. In one or more implementations, cascade object detection techniques are used to detect an object in an image. In a first stage, exemplars in a first subset of exemplars are applied to features extracted from multiple regions of the image to detect object candidate regions. Then, in one or more validation stages, the object candidate regions are validated by applying exemplars from the first subset of exemplars and one or more additional subsets of exemplars.
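A minimal numpy sketch of the adaptive-sampling stage is given below: the image is scored on a coarse grid, the scores form an object probability map, and dense features are extracted only inside high-probability cells. The gradient-energy scorer and the flattened-patch "dense feature" are stand-ins; the abstract does not specify the actual features.

```python
import numpy as np

def coarse_probability_map(image, cell=32):
    # Coarse stage: one cheap score per grid cell, normalized to [0, 1].
    h, w = image.shape
    gy, gx = np.gradient(image.astype(float))
    energy = np.hypot(gx, gy)
    rows, cols = h // cell, w // cell
    pmap = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            pmap[r, c] = energy[r*cell:(r+1)*cell, c*cell:(c+1)*cell].mean()
    return pmap / (pmap.max() + 1e-8)

def dense_features_in_likely_regions(image, pmap, cell=32, threshold=0.5):
    # Dense stage: extract per-pixel features only in high-probability cells.
    features = {}
    for (r, c), p in np.ndenumerate(pmap):
        if p >= threshold:
            patch = image[r*cell:(r+1)*cell, c*cell:(c+1)*cell]
            features[(r, c)] = patch.flatten()  # stand-in dense descriptor
    return features

image = np.random.rand(256, 256)
pmap = coarse_probability_map(image)
dense = dense_features_in_likely_regions(image, pmap)
print(f"{len(dense)} of {pmap.size} cells sampled densely")
```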
Abstract:
Techniques for facial expression capture for character animation are described. In one or more implementations, facial key points are identified in a series of images. Each image in the series is normalized based on the identified facial key points. Facial features are determined from each of the normalized images. Then a facial expression is classified, based on the determined facial features, for each of the normalized images. In additional implementations, a series of images is captured that includes performances of one or more facial expressions. The facial expressions in each image of the series are classified by a facial expression classifier. The facial expression classifications are then used by a character animator system to produce a series of animated images of an animated character, in which the animated facial expressions are associated with the facial expression classification of the corresponding image in the series.
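The four described stages can be sketched as a simple pipeline, as below. Every stage implementation here is a placeholder (the abstract does not specify the key point detector, the normalization, the features, or the classifier); only the flow of data between stages follows the description.

```python
import numpy as np

def detect_key_points(frame):
    # Placeholder: real systems use a trained facial landmark detector.
    h, w = frame.shape
    return np.array([[w*0.3, h*0.4], [w*0.7, h*0.4], [w*0.5, h*0.7]])

def normalize(frame, key_points):
    # Placeholder alignment: crop the key-point bounding box. A real system
    # would warp the face to a canonical pose using the key points.
    x0, y0 = key_points.min(axis=0).astype(int)
    x1, y1 = key_points.max(axis=0).astype(int)
    return frame[y0:y1, x0:x1]

def extract_features(normalized_face):
    # Placeholder descriptor: intensity histogram of the aligned face.
    hist, _ = np.histogram(normalized_face, bins=16, range=(0.0, 1.0))
    return hist / (hist.sum() + 1e-8)

def classify_expression(features):
    # Placeholder classifier: any trained model could stand here.
    labels = ["neutral", "happy", "surprised"]
    return labels[int(features.argmax()) % len(labels)]

series = [np.random.rand(120, 120) for _ in range(3)]
classifications = []
for frame in series:
    key_points = detect_key_points(frame)
    face = normalize(frame, key_points)
    classifications.append(classify_expression(extract_features(face)))
print(classifications)  # one expression label per frame, fed to the animator
```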
Abstract:
Semantic class localization techniques and systems are described. In one or more implementations, a technique is employed to communicate relevancies of aggregations back through layers of a neural network. Through use of these relevancies, activation relevancy maps are created that describe the relevancy of portions of the image to the classification of the image as corresponding to a semantic class. In this way, the semantic class is localized to portions of the image. This may be performed through communication of positive, but not negative, relevancies; through use of contrastive attention maps to differentiate between semantic classes; and even within a same semantic class through use of a self-contrastive technique.
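In the spirit of the positive-relevancy propagation described, the sketch below pushes the relevance of one class backward through a tiny two-layer network, keeping only positive weight contributions at each layer. The network and the z+-style rule are illustrative assumptions, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(8, 3))

def forward(x):
    h = np.maximum(0.0, x @ W1)   # ReLU hidden layer
    return h, h @ W2              # class scores

def backpropagate_relevance(x, class_index):
    h, scores = forward(x)
    # Start from the relevance of the chosen semantic class only.
    r_out = np.zeros_like(scores)
    r_out[class_index] = scores[class_index]

    def layer_relevance(activations, weights, relevance):
        # z+-style rule: distribute relevance over positive contributions.
        wp = np.maximum(0.0, weights)
        z = activations @ wp + 1e-8
        return activations * (wp @ (relevance / z))

    r_hidden = layer_relevance(h, W2, r_out)
    return layer_relevance(x, W1, r_hidden)  # per-input relevancy "map"

x = rng.random(16)  # stand-in for flattened image features
relevance = backpropagate_relevance(x, class_index=1)
print(relevance.round(3))  # larger values mark inputs relevant to the class
```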
Abstract:
Different candidate windows in an image are identified, such as by sliding a rectangular or other geometric shape of different sizes over the image to identify portions of the image (groups of pixels in the image). The candidate windows are analyzed by a set of convolutional neural networks, which are cascaded so that the input of one convolutional neural network layer is based on the output of another convolutional neural network layer. Each convolutional neural network layer drops or rejects one or more candidate windows that the layer determines do not include an object (e.g., a face). The candidate windows that are identified as including an object (e.g., a face) are analyzed by the next of the convolutional neural network layers. The candidate windows identified by the last of the convolutional neural network layers are the indications of the objects (e.g., faces) in the image.
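A compact sketch of the cascade structure follows: candidate windows are generated by sliding windows of several sizes, and each stage rejects windows before the next, more expensive stage runs. The stage scorers here are random stand-ins for trained convolutional networks, which the abstract does not detail.

```python
import numpy as np

def sliding_windows(image, size, stride):
    h, w = image.shape
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield (x, y, size, size)

def make_stage(threshold):
    # Deterministic random stand-in for a trained CNN stage scorer.
    rng = np.random.default_rng(int(threshold * 100))
    def stage(image, window):
        x, y, w, h = window
        score = rng.random()  # a real stage would score image[y:y+h, x:x+w]
        return score >= threshold
    return stage

image = np.random.rand(128, 128)
candidates = [w for s in (24, 48) for w in sliding_windows(image, s, 12)]
stages = [make_stage(t) for t in (0.3, 0.6, 0.9)]  # cheap -> expensive
for stage in stages:
    # Each stage keeps only the windows it believes may contain a face.
    candidates = [w for w in candidates if stage(image, w)]
print(f"{len(candidates)} windows survive the cascade as face detections")
```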
Abstract:
Feature interpolation techniques are described. In a training stage, features are extracted from a collection of training images and quantized into visual words. Spatial configurations of the visual words in the training images are determined and stored in a spatial configuration database. In an object detection stage, a portion of the features of an image is extracted from the image and quantized into visual words. Then, the remaining portion of the features of the image is interpolated using the visual words and the spatial configurations of visual words stored in the spatial configuration database.
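The toy sketch below illustrates the interpolation step: sparse features are quantized against a small visual-word vocabulary, and words at unobserved grid positions are predicted from a stored table of spatial co-occurrences. The vocabulary, the right-neighbor table, and the random descriptors are all made-up stand-ins for the trained database the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(1)
vocabulary = rng.random((4, 8))  # 4 visual words, 8-dim descriptors (assumed)

def quantize(feature):
    # Assign the feature to its nearest visual word (Euclidean distance).
    return int(np.argmin(np.linalg.norm(vocabulary - feature, axis=1)))

# Stand-in "spatial configuration database": for each word, the word most
# often observed one grid cell to its right in the training images.
right_neighbor = {0: 2, 1: 3, 2: 0, 3: 1}

grid_words = {}
for col in range(0, 8, 2):          # extract features at even columns only
    grid_words[col] = quantize(rng.random(8))
for col in range(1, 8, 2):          # interpolate the odd columns
    left = grid_words.get(col - 1)
    if left is not None:
        grid_words[col] = right_neighbor[left]  # predicted visual word
        # The word's centroid, vocabulary[grid_words[col]], stands in for
        # the interpolated feature itself.
print({c: grid_words[c] for c in sorted(grid_words)})
```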
Abstract:
Image classification techniques using images with separate grayscale and color channels are described. In one or more implementations, an image classification network includes grayscale filters and color filters that are separate from the grayscale filters. The grayscale filters are configured to extract grayscale features from a grayscale channel of an image, and the color filters are configured to extract color features from a color channel of the image. The extracted grayscale and color features are used to identify an object in the image, and the image is classified based on the identified object.
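The split design can be sketched as two separate filter banks whose outputs are concatenated, as below. The filter shapes, the Lab-like channel layout, and the max-pooled feature vector are illustrative assumptions rather than the described network.

```python
import numpy as np

rng = np.random.default_rng(2)
gray_filters = rng.normal(size=(4, 3, 3))      # 4 filters on the gray channel
color_filters = rng.normal(size=(4, 2, 3, 3))  # 4 filters on 2 color channels

def conv2d_valid(channel, kernel):
    # Plain "valid" 2-D convolution, written out for self-containment.
    kh, kw = kernel.shape
    h, w = channel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = (channel[y:y+kh, x:x+kw] * kernel).sum()
    return out

def extract(image_lab):  # assumed Lab-like layout: [gray, color-a, color-b]
    gray = image_lab[0]
    gray_maps = [conv2d_valid(gray, f) for f in gray_filters]
    color_maps = [sum(conv2d_valid(image_lab[1 + c], f[c]) for c in range(2))
                  for f in color_filters]
    # Pool each map to one value and concatenate grayscale + color features.
    return np.array([m.max() for m in gray_maps + color_maps])

image = rng.random((3, 32, 32))
features = extract(image)
print(features.shape)  # (8,) -> fed to a classifier to identify the object
```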
Abstract:
Cascaded object detection techniques are described. In one or more implementations, cascaded coarse-to-dense object detection techniques are utilized to detect objects in images. In a first stage, coarse features are extracted from an image, and non-object regions are rejected. Then, in one or more subsequent stages, dense features are extracted from the remaining regions of the image to detect one or more objects in the image.
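A compact sketch of the coarse-to-dense cascade follows: a cheap per-region statistic rejects most regions in the first stage, and a denser per-pixel score is computed only for the survivors. Both scorers (variance and gradient energy) are stand-ins, since the abstract does not specify the features.

```python
import numpy as np

def regions(image, size=32):
    h, w = image.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield (y, x, image[y:y+size, x:x+size])

def coarse_score(patch):
    return patch.var()               # cheap: one statistic per region

def dense_score(patch):
    gy, gx = np.gradient(patch)      # denser: per-pixel gradient energy
    return np.hypot(gx, gy).mean()

image = np.random.rand(128, 128)
stage1 = [(y, x, p) for y, x, p in regions(image) if coarse_score(p) > 0.07]
detections = [(y, x) for y, x, p in stage1 if dense_score(p) > 0.3]
print(f"{len(stage1)} regions pass the coarse stage; {len(detections)} detected")
```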