Abstract:
Methods and apparatus for disparity map correction through statistical analysis on local neighborhoods. A disparity map correction technique may be used to correct mistakes in a disparity or depth map. The disparity map correction technique may detect and mark invalid pixel pairs in a disparity map, segment the image, and perform a statistical analysis of the disparities in each segment to identify outliers. The invalid and outlier pixels may then be corrected using other disparity values in the local neighborhood. Multiple iterations of the disparity map correction technique may be performed to further improve the output disparity map.
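The per-segment statistical correction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the segmentation is already given as a per-pixel label, uses the median absolute deviation (MAD) as the outlier statistic, and fills invalid and outlier pixels with the segment median; the threshold `k` and the sentinel value for invalid pixels are illustrative choices.

```python
from statistics import median

def correct_disparities(disparity, segments, invalid=-1, k=3.0):
    """Correct a disparity map (flat list of values) per segment.
    Pixels marked `invalid`, or lying more than k * MAD from their
    segment's median disparity, are replaced by that median.
    A sketch of the described technique; segmentation is assumed given."""
    # Group valid disparities by segment label.
    by_seg = {}
    for d, s in zip(disparity, segments):
        if d != invalid:
            by_seg.setdefault(s, []).append(d)
    out = list(disparity)
    for i, (d, s) in enumerate(zip(disparity, segments)):
        vals = by_seg.get(s, [])
        if not vals:
            continue
        med = median(vals)
        # Median absolute deviation; fall back to 1.0 when degenerate.
        mad = median(abs(v - med) for v in vals) or 1.0
        # Invalid pixels and statistical outliers take the segment median.
        if d == invalid or abs(d - med) > k * mad:
            out[i] = med
    return out
```

Running multiple iterations, as the abstract suggests, would amount to feeding the corrected map back in until no pixel changes.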
Abstract:
User input-based object selection using multiple visual cues is described. User selection input is received for selecting a portion of an image. Once the user selection input is received, one of a plurality of visual cues that convey different information about content depicted in the image is selected for each pixel. The one visual cue is selected as a basis for identifying the pixel as part of the selected portion of the image or part of an unselected remainder of the image. The visual cues are selected by determining confidences, based in part on the user selection input, that the plurality of visual cues can be used to discriminate whether the pixel is part of the selected portion or part of the remainder. The information conveyed by the selected visual cues is used to identify the pixels as part of the selected portion or part of the remainder.
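The per-pixel cue-selection step can be sketched as below. The cue functions are hypothetical stand-ins for the visual cues the abstract mentions (e.g., color, depth, focus): each cue supplies a confidence that it can discriminate a given pixel and a verdict for that pixel, and the highest-confidence cue decides.

```python
def classify_pixels(pixels, cues):
    """Label each pixel as selected (True) or background (False).
    `cues` is a list of (confidence_fn, verdict_fn) pairs; for each
    pixel, the cue with the highest discrimination confidence is
    chosen and its verdict is used. A sketch under the assumption
    that cue confidences and verdicts are available as callables."""
    labels = []
    for p in pixels:
        # Pick the cue most confident it can discriminate this pixel.
        best = max(cues, key=lambda cue: cue[0](p))
        labels.append(best[1](p))
    return labels
```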
Abstract:
Techniques and systems are described to support digital image processing through use of an image repository, e.g., a stock image database or other storage. In one example, a plurality of candidate digital images are obtained from an image repository based on a target digital image. A plurality of transformations are generated to be applied to the target digital image, each transformation based on a respective candidate digital image. Semantic information is employed as part of the transformations, e.g., blending, filtering, or alignment. A plurality of transformed target digital images are generated based at least in part through application of the plurality of transformations to the target image.
Abstract:
Probabilistic determination of selected image portions is described. In one or more implementations, a selection input is received for selecting a portion of an image. For pixels of the image that correspond to the selection input, probabilities are determined that the pixels are intended to be included as part of a selected portion of the image. In particular, the probability that a given pixel is intended to be included as part of the selected portion of the image is determined as a function of position relative to center pixels of the selection input as well as a difference in one or more visual characteristics with the center pixels. The determined probabilities can then be used to segment the selected portion of the image from a remainder of the image. Based on the segmentation of the selected portion from the remainder of the image, selected portion data can be generated that defines the selected portion of the image.
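The probability described above, a function of both position relative to the selection center and visual difference from it, can be sketched with a product of two falloff terms. The Gaussian form, the grayscale-intensity stand-in for visual characteristics, and the sigma values are illustrative assumptions, not details from the abstract.

```python
from math import exp, hypot

def selection_probability(pixel, center, sigma_d=10.0, sigma_c=30.0):
    """Probability that `pixel` (x, y, intensity) is intended to be
    part of the selection, given the selection-input `center`.
    Combines a spatial falloff from the center with a falloff in
    visual difference; both falloffs are assumed Gaussian here."""
    (x, y, v), (cx, cy, cv) = pixel, center
    spatial = exp(-(hypot(x - cx, y - cy) ** 2) / (2 * sigma_d ** 2))
    visual = exp(-((v - cv) ** 2) / (2 * sigma_c ** 2))
    return spatial * visual
```

Thresholding these probabilities (or feeding them to a graph-based segmenter) would then separate the selected portion from the remainder.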
Abstract:
Joint depth estimation and semantic labeling techniques usable for processing of a single image are described. In one or more implementations, global semantic and depth layouts of a scene of the image are estimated through machine learning by the one or more computing devices. Local semantic and depth layouts are also estimated for respective ones of a plurality of segments of the scene of the image through machine learning by the one or more computing devices. The estimated global semantic and depth layouts are merged with the local semantic and depth layouts by the one or more computing devices to semantically label and assign a depth value to individual pixels in the image.
Abstract:
Techniques and systems are described to model and extract knowledge from images. A digital medium environment is configured to learn and use a model to compute a descriptive summarization of an input image automatically and without user intervention. Training data is obtained to train a model using machine learning in order to generate a structured image representation that serves as the descriptive summarization of an input image. The images and associated text are processed to extract structured semantic knowledge from the text, which is then associated with the images. The structured semantic knowledge is processed along with corresponding images to train a model using machine learning such that the model describes a relationship between text features within the structured semantic knowledge. Once the model is learned, the model is usable to process input images to generate a structured image representation of the image.
Abstract:
Image depth inference techniques and systems from semantic labels are described. In one or more implementations, a digital medium environment includes one or more computing devices to control a determination of depth within an image. Regions of the image are semantically labeled by the one or more computing devices. At least one of the semantically labeled regions is decomposed into a plurality of segments formed as planes generally perpendicular to a ground plane of the image. Depth of one or more of the plurality of segments is then inferred based on relationships of respective segments with respective locations of the ground plane of the image. A depth map is formed that describes depth for the at least one semantically labeled region based at least in part on the inferred depths for the one or more of the plurality of segments.
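The relationship between a vertical segment and its ground-contact location can be sketched with a simple pinhole-camera model: rows nearer the horizon correspond to greater depth. The camera height, focal length, and horizon-row parameters below are illustrative assumptions; the abstract states only that depth is inferred from each segment's relation to the ground plane.

```python
def depth_from_ground_row(contact_row, horizon_row, cam_height=1.6, focal=500.0):
    """Infer the depth of a vertical (ground-perpendicular) segment
    from the image row where it meets the ground plane, assuming a
    pinhole camera at a known height above a flat ground plane.
    Rows closer to the horizon map to greater depth."""
    if contact_row <= horizon_row:
        raise ValueError("contact row must lie below the horizon")
    return focal * cam_height / (contact_row - horizon_row)
```

Applying this to every segment of a labeled region yields the per-region depth map the abstract describes.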
Abstract:
Stereoscopic target region filling techniques are described. Techniques are described in which stereo consistency is promoted between target regions, such as by sharing information during computation. Techniques are also described in which target regions of respective disparity maps are completed to promote consistency between the disparity maps. This estimated disparity may then be used as a guide to completion of a missing texture in the target region. Techniques are further described in which cross-image searching and matching is employed by leveraging a plurality of images. This may include giving preference to matches with cross-image consistency, thereby enforcing stereo consistency between stereo images when applicable.
Abstract:
Image cropping suggestion using multiple saliency maps is described. In one or more implementations, component scores, indicative of visual characteristics established for visually-pleasing croppings, are computed for candidate image croppings using multiple different saliency maps. The visual characteristics on which a candidate image cropping is scored may be indicative of its composition quality, an extent to which it preserves content appearing in the scene, and a simplicity of its boundary. Based on the component scores, the croppings may be ranked with regard to each of the visual characteristics. The rankings may be used to cluster the candidate croppings into groups of similar croppings, such that croppings in a group are different by less than a threshold amount and croppings in different groups are different by at least the threshold amount. Based on the clustering, croppings may then be chosen, e.g., to present them to a user for selection.
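The threshold-based grouping of candidate croppings can be sketched with a simple greedy scheme: a cropping joins the first group whose representative differs from it by less than the threshold, and otherwise starts a new group. The greedy strategy and the pairwise-difference callable are illustrative stand-ins for the clustering the abstract describes.

```python
def cluster_croppings(croppings, diff, threshold):
    """Group candidate croppings so that members of a group differ
    from the group's representative (its first member) by less than
    `threshold`, while representatives of different groups differ by
    at least `threshold`. `diff` is any pairwise-difference function,
    e.g., one based on crop-rectangle overlap."""
    groups = []
    for c in croppings:
        for g in groups:
            # Join the first group whose representative is close enough.
            if diff(c, g[0]) < threshold:
                g.append(c)
                break
        else:
            # No similar group found: start a new one.
            groups.append([c])
    return groups
```

One cropping per group, e.g., the highest ranked, could then be presented to the user for selection.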