Abstract:
A convolutional neural network (CNN) is trained for font recognition and font similarity learning. In a training phase, text images with font labels are synthesized by introducing variances to minimize the gap between the training images and real-world text images. Training images are generated and input into the CNN. The output is fed into an N-way softmax function dependent on the number of fonts the CNN is being trained on, producing a distribution of classified text images over N class labels. In a testing phase, each test image is normalized in height and squeezed in aspect ratio resulting in a plurality of test patches. The CNN averages the probabilities of each test patch belonging to a set of fonts to obtain a classification. Feature representations may be extracted and utilized to define font similarity between fonts, which may be utilized in font suggestion, font browsing, or font recognition applications.
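The testing-phase step, in which per-patch probabilities are averaged to obtain a classification, can be sketched as follows. This is a minimal illustration, not the patented implementation: the logit values are invented, the function names `softmax` and `classify_by_patch_average` are hypothetical, and the CNN that would produce the logits is assumed to exist elsewhere.

```python
import math


def softmax(logits):
    """Numerically stable N-way softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_by_patch_average(patch_logits):
    """Average the per-patch softmax distributions.

    Returns (index of the most probable font, averaged distribution).
    """
    if not patch_logits:
        raise ValueError("need at least one test patch")
    dists = [softmax(logits) for logits in patch_logits]
    n_classes = len(dists[0])
    avg = [sum(d[k] for d in dists) / len(dists) for k in range(n_classes)]
    best = max(range(n_classes), key=lambda k: avg[k])
    return best, avg


# Hypothetical CNN outputs for three test patches over N = 3 fonts.
patch_logits = [[2.0, 0.5, 0.1], [1.5, 1.6, 0.0], [2.2, 0.3, 0.4]]
font_index, probs = classify_by_patch_average(patch_logits)
```

Averaging lets a patch that weakly favors another font (the second patch here) be outvoted by the remaining patches.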
Abstract:
Methods, apparatus, and computer-readable storage media for patch-based image synthesis using color and color gradient voting. A patch matching technique provides an extended patch search space that encompasses geometric and photometric transformations, as well as color and color gradient domain features. The photometric transformations may include gain and bias. The patch-based image synthesis techniques may also integrate image color and color gradients into the patch representation and replace conventional color averaging with a technique that performs voting for colors and color gradients and then solves a screened Poisson equation based on values for colors and color gradients when blending patch(es) with a target image.
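One way to picture a gain-and-bias photometric transformation is as a per-patch linear fit, target ≈ gain × source + bias. The sketch below estimates the two parameters by ordinary least squares. That choice is an assumption made for illustration, since the abstract does not say how gain and bias are obtained, and the function names are hypothetical.

```python
def fit_gain_bias(source, target):
    """Least-squares gain and bias such that gain * source + bias ~ target.

    `source` and `target` are equal-length lists of patch intensities.
    """
    n = len(source)
    if n == 0 or n != len(target):
        raise ValueError("patches must be non-empty and the same size")
    mean_s = sum(source) / n
    mean_t = sum(target) / n
    var_s = sum((s - mean_s) ** 2 for s in source)
    if var_s == 0.0:
        gain = 1.0  # flat source patch: only a bias can be estimated
    else:
        cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(source, target))
        gain = cov / var_s
    bias = mean_t - gain * mean_s
    return gain, bias


def matching_error(source, target, gain, bias):
    """Sum of squared differences after applying gain and bias to the source."""
    return sum((gain * s + bias - t) ** 2 for s, t in zip(source, target))


# Hypothetical patches: the target is a brighter, higher-contrast copy.
source_patch = [0.1, 0.2, 0.4, 0.8]
target_patch = [1.5 * s + 0.05 for s in source_patch]
gain, bias = fit_gain_bias(source_patch, target_patch)
raw_error = matching_error(source_patch, target_patch, 1.0, 0.0)
fit_error = matching_error(source_patch, target_patch, gain, bias)
```

In this example the fitted error drops to essentially zero while the untransformed error does not, which is the sense in which photometric transformations extend the patch search space.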
Abstract:
A first image at a first resolution is received, the first image having a first hole therein. Based on the first image, a second image is generated at a second resolution lower than the first resolution, the second image having a second hole therein corresponding to the first hole. In the second image, one or more second-image source patches for the second hole are identified. At least one first-image source patch in the first image is identified based on a location of the identified second-image source patch. The identified at least one first-image source patch is stored in memory. Fill content is identified in the at least one first-image source patch stored in the memory. The identified fill content is placed in the first hole.
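The step that locates first-image source patches from the location of a second-image source patch can be sketched as a coordinate mapping. Everything specific here is an assumption for illustration: an integer downsampling factor `scale`, square patches, a small search `radius` around the mapped position, and the hypothetical function name `full_res_candidates`.

```python
def full_res_candidates(low_xy, scale, patch_size, full_size, radius=1):
    """Map a low-resolution patch position to candidate full-resolution positions.

    low_xy     -- (x, y) top-left corner of the patch in the low-resolution image
    scale      -- integer ratio of full resolution to low resolution
    patch_size -- side length of a square patch in the full-resolution image
    full_size  -- (width, height) of the full-resolution image
    radius     -- how far around the mapped position to look, in pixels
    """
    low_x, low_y = low_xy
    width, height = full_size
    center_x, center_y = low_x * scale, low_y * scale
    max_x, max_y = width - patch_size, height - patch_size
    candidates = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp so every candidate patch lies fully inside the image.
            x = min(max(center_x + dx, 0), max_x)
            y = min(max(center_y + dy, 0), max_y)
            if (x, y) not in candidates:
                candidates.append((x, y))
    return candidates


# Hypothetical 640x480 image downsampled by a factor of 4.
exact = full_res_candidates((10, 7), 4, 8, (640, 480), radius=0)
nearby = full_res_candidates((10, 7), 4, 8, (640, 480), radius=1)
```

Only the patches at the returned positions would then need to be held in memory while fill content is selected, rather than the whole full-resolution image.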
Abstract:
A system may be configured as an image recognition machine that utilizes an image feature representation called local feature embedding (LFE). LFE enables generation of a feature vector that captures salient visual properties of an image to address both the fine-grained aspects and the coarse-grained aspects of recognizing a visual pattern depicted in the image. Configured to utilize image feature vectors with LFE, the system may implement a nearest class mean (NCM) classifier, as well as a scalable recognition algorithm with metric learning and max margin template selection. Accordingly, the system may be updated to accommodate new classes with very little added computational cost. This may have the effect of enabling the system to readily handle open-ended image classification problems.
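The low cost of adding classes to a nearest class mean classifier can be seen in a small sketch. This version uses plain Euclidean distance on toy two-dimensional vectors. The LFE feature extraction, the metric learning, and the max margin template selection named in the abstract are not modeled, and the class name and labels are hypothetical.

```python
class NearestClassMean:
    """Toy NCM classifier: each class is represented by the mean of its vectors."""

    def __init__(self):
        self._sums = {}
        self._counts = {}

    def add_example(self, label, vector):
        """Fold one feature vector into the running mean for `label`."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(vector)
            self._counts[label] = 0
        sums = self._sums[label]
        if len(vector) != len(sums):
            raise ValueError("feature vectors must share one dimension")
        for i, value in enumerate(vector):
            sums[i] += value
        self._counts[label] += 1

    def class_mean(self, label):
        count = self._counts[label]
        return [s / count for s in self._sums[label]]

    def predict(self, vector):
        """Return the label whose class mean is closest to `vector`."""
        if not self._sums:
            raise ValueError("no classes have been added")

        def squared_distance(label):
            mean = self.class_mean(label)
            return sum((a - b) ** 2 for a, b in zip(vector, mean))

        return min(self._sums, key=squared_distance)


ncm = NearestClassMean()
ncm.add_example("class_a", [0.0, 0.0])
ncm.add_example("class_a", [0.2, 0.0])
ncm.add_example("class_b", [1.0, 1.0])
before = ncm.predict([0.9, 0.1])

# Adding a new class touches only its own mean; existing classes are unchanged.
ncm.add_example("class_c", [1.0, 0.0])
after = ncm.predict([0.9, 0.1])
```

Because a new class contributes only one additional mean vector, existing classes need no retraining, which fits the open-ended classification setting the abstract describes.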
Abstract:
Image classification techniques are described for adjustment of an image. In one or more implementations, an image is classified by one or more computing devices based on suitability of the image for adjustment to correct perspective distortion of the image. Responsive to a classification of the image as not suitable for the adjustment, suitability of the image for processing by a different image adjustment technique is detected by the one or more computing devices.
Abstract:
Image distractor detection and processing techniques are described. In one or more implementations, a digital medium environment is configured for image distractor detection, in which one or more computing devices detect, automatically and without user intervention, one or more locations within an image that include one or more distractors likely to be considered by a user as distracting from content within the image. The detection includes forming a plurality of segments from the image by the one or more computing devices and calculating a score for each of the plurality of segments that is indicative of a relative likelihood that a respective said segment is considered a distractor within the image. The calculation is performed using a distractor model trained using machine learning as applied to a plurality of images having ground truth distractor locations.
Abstract:
Techniques for controlling patch-usage in image synthesis are described. In implementations, a curve is fitted to a set of sorted matching errors that correspond to potential source-to-target patch assignments between a source image and a target image. Then, an error budget is determined using the curve. In an example, the error budget is usable to identify feasible patch assignments from the potential source-to-target patch assignments. Using the error budget along with uniform patch-usage enforcement, source patches from the source image are assigned to target patches in the target image. Then, at least one of the assigned source patches is assigned to an additional target patch based on the error budget. Subsequently, an image is synthesized based on the source patches assigned to the target patches.
Abstract:
Techniques for focal length warping are described. Focal length warping, for instance, may provide an automated approach for correcting distortion in an input image to improve its perceptual quality. In at least some implementations, a focal length of a camera lens used to capture an image and an estimated camera distance are utilized to three-dimensionally reproject and warp the image to generate an adjusted image simulating a new focal length and a new camera distance. Implementations of focal length warping may estimate a camera distance based on facial features in an image.
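The reprojection idea can be sketched with a pinhole camera model, in which a point at lateral position X and depth Z projects to u = f·X/Z. The sketch assumes image coordinates measured from the principal point and a known per-point depth offset from the subject plane. In the abstract, camera distance is estimated from facial features, which is not modeled here, and the function names and numeric values are hypothetical.

```python
def matched_distance(focal, distance, new_focal):
    """Camera distance that keeps the subject plane the same size in the image."""
    return distance * new_focal / focal


def reproject(u, v, depth_offset, focal, distance, new_focal, new_distance):
    """Move an image point to where it would appear under the new camera.

    u, v         -- image coordinates relative to the principal point
    depth_offset -- point depth minus subject-plane depth (negative = closer)
    """
    z_old = distance + depth_offset
    z_new = new_distance + depth_offset
    if z_old <= 0 or z_new <= 0:
        raise ValueError("point must be in front of both cameras")
    # From u = f * X / Z: recover X under the old camera, project with the new.
    scale = (new_focal / focal) * (z_old / z_new)
    return u * scale, v * scale


# Hypothetical portrait: 24 mm lens at 0.5 m, re-rendered as if shot at 85 mm.
new_dist = matched_distance(24.0, 0.5, 85.0)
cheek = reproject(100.0, 50.0, 0.0, 24.0, 0.5, 85.0, new_dist)
nose = reproject(100.0, 50.0, -0.1, 24.0, 0.5, 85.0, new_dist)
```

Points on the subject plane stay where they were, while a point 0.1 m closer to the camera moves toward the image center, reducing the exaggerated foreground typical of close wide-angle shots.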
Abstract:
A healing component that heals foreground pixels with background pixels is provided. In some embodiments, the healing component is programmed or otherwise configured to respond to a single healing request by identifying a plurality of regions within a selected area and healing each region of the plurality of regions independently of other regions.
Abstract:
Techniques and apparatus for automatic upright adjustment of digital images. An automatic upright adjustment technique is described that may provide an automated approach for straightening up slanted features in an input image to improve its perceptual quality. This correction may be referred to as upright adjustment. A set of criteria based on human perception may be used in the upright adjustment. A reprojection technique that implements an optimization framework is described that yields an optimal homography for adjustment based on the criteria and adjusts the image according to new camera parameters generated by the optimization. An optimization-based camera calibration technique is described that simultaneously estimates vanishing lines and points as well as camera parameters for an image; the calibration technique may, for example, be used to generate estimates of camera parameters and vanishing points and lines that are input to the reprojection technique.
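For a camera that only rotates, the homography that re-renders an image has the standard form H = K·R·K⁻¹, where K holds the camera intrinsics and R is the rotation. The sketch below builds such a homography for a pitch rotation and applies it to a point. It assumes square pixels and a single pitch angle supplied by the caller. The optimization that selects the homography in the described technique is not modeled, and the function names are hypothetical.

```python
import math


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_homography(focal, cx, cy, pitch_rad):
    """H = K * R * K^-1 for a rotation about the camera's x-axis."""
    k = [[focal, 0.0, cx], [0.0, focal, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / focal, 0.0, -cx / focal],
             [0.0, 1.0 / focal, -cy / focal],
             [0.0, 0.0, 1.0]]
    c, s = math.cos(pitch_rad), math.sin(pitch_rad)
    r = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    return matmul3(matmul3(k, r), k_inv)


def apply_homography(h, x, y):
    """Map pixel (x, y) through h using homogeneous coordinates."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0.0:
        raise ValueError("point maps to infinity")
    return xh / w, yh / w


# Hypothetical camera: 1000 px focal length, principal point at (320, 240).
identity = rotation_homography(1000.0, 320.0, 240.0, 0.0)
tilted = rotation_homography(1000.0, 320.0, 240.0, math.radians(5.0))
unchanged = apply_homography(identity, 100.0, 200.0)
moved = apply_homography(tilted, 320.0, 240.0)
```

With zero pitch the mapping is the identity. With a 5 degree pitch the principal point shifts vertically by f·tan(5°) pixels, the kind of shift that brings converging vertical lines back toward parallel.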