Abstract:
The invention is directed towards segmenting images based on natural language phrases. An image and an n-gram, including a sequence of tokens, are received. An encoding of image features and a sequence of token vectors are generated. A fully convolutional neural network identifies and encodes the image features. A word embedding model generates the token vectors. A recurrent neural network (RNN) iteratively updates a segmentation map based on combinations of the image feature encoding and the token vectors. The segmentation map identifies which pixels are included in an image region referenced by the n-gram. A segmented image is generated based on the segmentation map. The RNN may be a convolutional multimodal RNN. A separate RNN, such as a long short-term memory network, may iteratively update an encoding of semantic features based on the order of tokens. The first RNN may update the segmentation map based on the semantic feature encoding.
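The iterative update of the segmentation map can be illustrated with a toy sketch. It is not the convolutional multimodal RNN of the abstract: the per-pixel feature vectors, the token vectors, the fixed `recurrent_weight`, and the function name `segment_with_phrase` are all assumptions made for illustration, and no weights are learned.

```python
import math


def segment_with_phrase(pixel_features, token_vectors, recurrent_weight=0.5):
    """Toy recurrent update of a per-pixel segmentation state.

    pixel_features: rows of per-pixel feature vectors (stand-in for the
        encoding produced by a fully convolutional network).
    token_vectors: one vector per token (stand-in for word embeddings).
    Returns a boolean map: True where the pixel is assigned to the region.
    """
    height, width = len(pixel_features), len(pixel_features[0])
    state = [[0.0] * width for _ in range(height)]
    for token in token_vectors:  # one recurrent step per token, in order
        for r in range(height):
            for c in range(width):
                # Combine the image feature encoding with the token vector.
                match = sum(f * t for f, t in zip(pixel_features[r][c], token))
                state[r][c] = math.tanh(recurrent_weight * state[r][c] + match)
    return [[state[r][c] > 0.0 for c in range(width)] for r in range(height)]
```

Each token refines the state left by the previous tokens, which is the only property of the described RNN that this sketch tries to capture.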
Abstract:
Imaging process initialization techniques are described. In an implementation, a color estimate is generated for a plurality of pixels within a region of an image. A plurality of pixels outside of the region are first identified for each pixel of the plurality of pixels within the region. This may include identification of pixels disposed at opposing directions from the pixel being estimated. A color estimate is determined for each of the plurality of pixels based on the identified pixels. As part of this, a weighting may be employed, such as based on a respective distance of each of the pixels outside of the region to the pixel within the region, a distance along the opposing direction for corresponding pixels outside of the region (e.g., at horizontal or vertical directions), and so forth. The color estimate is then used to initialize an imaging process technique.
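One possible reading of the distance-weighted estimate is sketched below for a single-channel image. The inverse-distance weighting, the four search directions, and the name `estimate_region_colors` are assumptions; the abstract only states that a weighting based on distance may be employed.

```python
def estimate_region_colors(image, mask):
    """Estimate values for masked pixels from the nearest unmasked pixels.

    image: rows of floats (single channel, for brevity).
    mask: rows of bools, True where the pixel lies inside the region.
    Returns a copy of image with region pixels replaced by the estimate.
    """
    height, width = len(image), len(image[0])
    # Opposing horizontal and vertical directions: left/right, up/down.
    directions = [(0, -1), (0, 1), (-1, 0), (1, 0)]
    estimate = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            if not mask[r][c]:
                continue
            weighted_sum, weight_total = 0.0, 0.0
            for dr, dc in directions:
                rr, cc, dist = r + dr, c + dc, 1
                # Walk until leaving the region or the image.
                while 0 <= rr < height and 0 <= cc < width and mask[rr][cc]:
                    rr, cc, dist = rr + dr, cc + dc, dist + 1
                if 0 <= rr < height and 0 <= cc < width:
                    weight = 1.0 / dist  # assumed inverse-distance weighting
                    weighted_sum += weight * image[rr][cc]
                    weight_total += weight
            if weight_total > 0:
                estimate[r][c] = weighted_sum / weight_total
    return estimate
```

The returned estimate would then serve as the starting point for a subsequent imaging process such as hole filling, which is outside the scope of this sketch.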
Abstract:
Systems and methods are provided for learned, piece-wise patch regression for image enhancement. In one embodiment, an image manipulation application generates training patch pairs that include training input patches and training output patches. Each training patch pair includes a respective training input patch from a training input image and a respective training output patch from a training output image. The training input image and the training output image include at least some of the same image content. The image manipulation application determines patch-pair functions from at least some of the training patch pairs. Each patch-pair function corresponds to a modification to a respective training input patch to generate a respective training output patch. The image manipulation application receives an input image and generates an output image from the input image by applying at least some of the patch-pair functions based on at least some input patches of the input image.
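A minimal sketch of piece-wise patch regression follows. It assumes each patch-pair function is a scalar gain and offset fitted by least squares, and that an input patch uses the function of its nearest training input patch. Both choices, and all function names, are illustrative assumptions rather than the method of the abstract.

```python
def fit_gain_offset(input_patch, output_patch):
    """Least-squares fit of output = gain * input + offset over one patch pair."""
    n = len(input_patch)
    mean_x = sum(input_patch) / n
    mean_y = sum(output_patch) / n
    var_x = sum((x - mean_x) ** 2 for x in input_patch)
    if var_x == 0:
        return 1.0, mean_y - mean_x  # flat patch: fall back to a pure offset
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(input_patch, output_patch))
    gain = cov / var_x
    return gain, mean_y - gain * mean_x


def learn_patch_pair_functions(training_pairs):
    """Return (training input patch, (gain, offset)) for each training pair."""
    return [(inp, fit_gain_offset(inp, out)) for inp, out in training_pairs]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enhance_patch(patch, learned):
    """Apply the function of the closest training input patch (assumed rule)."""
    _, (gain, offset) = min(
        learned, key=lambda item: squared_distance(patch, item[0]))
    return [gain * value + offset for value in patch]
```

Patches are flattened lists of intensities here; splitting an image into patches and blending overlapping outputs are left out.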
Abstract:
In some embodiments, techniques for synthesizing an image style based on a plurality of neural networks are described. A computer system selects a style image based on user input that identifies the style image. The computer system generates a synthesized image based on a generator neural network and a loss neural network. The generator neural network outputs the synthesized image based on a noise vector and the style image and is trained based on style features generated from the loss neural network. The loss neural network outputs the style features based on a training image. The training image and the style image have the same resolution. The style features are generated at different resolutions of the training image. The computer system provides the synthesized image to a user device in response to the user input.
Abstract:
Patch partition and image processing techniques are described. In one or more implementations, a system includes one or more modules implemented at least partially in hardware. The one or more modules are configured to perform operations including grouping a plurality of patches taken from a plurality of training samples of images into respective ones of a plurality of partitions, calculating an image processing operator for each of the partitions, determining distances between the plurality of partitions that describe image similarity of patches of the plurality of partitions, one to another, and configuring a database to provide the determined distance and the image processing operator to process an image in response to identification of a respective partition that corresponds to a patch taken from the image.
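The database described above might be organized as in the following sketch. The grouping rule (nearest seed center), the Euclidean distance between partition centers, and the placeholder operator (the partition mean patch) are assumptions; the abstract does not specify how patches are grouped or what the operator computes.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_index(patch, centers):
    return min(range(len(centers)), key=lambda i: euclidean(patch, centers[i]))


def build_partition_database(patches, seed_centers):
    """Group patches, then store per-partition operators and distances."""
    groups = [[] for _ in seed_centers]
    for patch in patches:
        groups[nearest_index(patch, seed_centers)].append(patch)
    centers = []
    for seed, members in zip(seed_centers, groups):
        if members:
            centers.append([sum(v) / len(members) for v in zip(*members)])
        else:
            centers.append(list(seed))  # empty partition keeps its seed
    # Placeholder operator: the partition mean patch (hypothetical stand-in).
    operators = [list(center) for center in centers]
    distances = [[euclidean(a, b) for b in centers] for a in centers]
    return {"centers": centers, "operators": operators, "distances": distances}


def lookup(database, patch):
    """Return the partition index, its operator, and its distance row."""
    index = nearest_index(patch, database["centers"])
    return index, database["operators"][index], database["distances"][index]
```

Storing the distance row alongside the operator lets a caller find neighboring partitions without recomputing patch similarity at processing time.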
Abstract:
Neural network image curation techniques are described. In one or more implementations, curation of images that represent a repository of images is controlled. A plurality of images of the repository are curated by one or more computing devices to select representative images of the repository. The curation includes calculating a score based on image and face aesthetics, jointly, for each of the plurality of images through processing by a neural network, ranking the plurality of images based on respective said scores, and selecting one or more of the plurality of images as one of the representative images of the repository based on the ranking and a determination that the one or more said images are not visually similar to images that have already been selected as one of the representative images of the repository.
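The ranking and similarity check can be sketched as a greedy selection. The scores are taken as given (in the abstract they come from a neural network), and the cosine-similarity test, the threshold value, and the dictionary keys `name`, `score`, and `features` are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def curate(images, max_selected, similarity_threshold=0.9):
    """Pick top-scoring images, skipping near-duplicates of earlier picks.

    images: dicts with 'name', 'score' (precomputed aesthetics score),
        and 'features' (vector used for the visual-similarity check).
    """
    ranked = sorted(images, key=lambda image: image["score"], reverse=True)
    selected = []
    for candidate in ranked:
        if len(selected) >= max_selected:
            break
        is_distinct = all(
            cosine_similarity(candidate["features"], chosen["features"])
            < similarity_threshold
            for chosen in selected
        )
        if is_distinct:
            selected.append(candidate)
    return [image["name"] for image in selected]
```

A high-scoring image is still rejected when it closely resembles an earlier selection, so the representative set stays visually diverse.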
Abstract:
Neural network patch aggregation and statistical techniques are described. In one or more implementations, patches are generated from an image, e.g., randomly, and used to train a neural network. An aggregation of outputs of patches processed by the neural network may be used to label an image using an image descriptor, such as to label aesthetics of the image, classify the image, and so on. In another example, the patches may be used by the neural network to calculate statistics describing the patches, such as the minimum, maximum, median, and average of activations of image characteristics of the individual patches. These statistics may also be used to support a variety of functionality, such as to label the image as described above.
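The statistics step might look like the sketch below. The neural network is replaced by a stand-in activation (the mean intensity of a patch), which is purely an assumption so that the example runs without a trained model; the function names are likewise hypothetical.

```python
import random
import statistics


def extract_random_patches(image, patch_size, count, seed=0):
    """Sample square patches at random positions from a 2D image."""
    rng = random.Random(seed)
    height, width = len(image), len(image[0])
    patches = []
    for _ in range(count):
        top = rng.randint(0, height - patch_size)
        left = rng.randint(0, width - patch_size)
        patches.append(
            [row[left:left + patch_size] for row in image[top:top + patch_size]])
    return patches


def stand_in_activation(patch):
    """Placeholder for a network activation: the mean patch intensity."""
    values = [value for row in patch for value in row]
    return sum(values) / len(values)


def aggregate_statistics(activations):
    """Summarize per-patch activations with the statistics named above."""
    return {
        "min": min(activations),
        "max": max(activations),
        "median": statistics.median(activations),
        "mean": statistics.mean(activations),
    }


def describe_image(image, patch_size, count, seed=0):
    patches = extract_random_patches(image, patch_size, count, seed)
    return aggregate_statistics([stand_in_activation(p) for p in patches])
```

The resulting dictionary is a fixed-length descriptor regardless of how many patches were sampled, which is what makes it usable as input to a later labeling step.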
Abstract:
In techniques for adaptive denoising with internal and external patches, example image patches taken from example images are grouped into partitions of similar patches, and a partition center patch is determined for each of the partitions. An image denoising technique is applied to image patches of a noisy image to generate modified image patches, and a closest partition center patch to each of the modified image patches is determined. The image patches of the noisy image are then classified as either a common patch or a complex patch of the noisy image, where an image patch is classified based on a distance between the corresponding modified image patch and the closest partition center patch. A denoising operator can be applied to an image patch based on the classification, such as applying respective denoising operators to denoise the image patches that are classified as the common patches of the noisy image.
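The classification step can be sketched as a threshold on the distance to the closest partition center. The 3-tap moving average used as the initial denoising technique, the Euclidean distance, and the threshold value are assumptions; patches are flattened to one dimension for brevity.

```python
import math


def pre_denoise(patch):
    """Stand-in denoiser: 3-tap moving average over a flattened patch."""
    smoothed = []
    for i in range(len(patch)):
        window = patch[max(0, i - 1):i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_patches(noisy_patches, partition_centers, threshold):
    """Label each noisy patch 'common' or 'complex'.

    A patch is 'common' when its pre-denoised version lies within
    `threshold` of the closest partition center, otherwise 'complex'.
    """
    labels = []
    for patch in noisy_patches:
        modified = pre_denoise(patch)
        closest = min(euclidean(modified, center) for center in partition_centers)
        labels.append("common" if closest <= threshold else "complex")
    return labels
```

A caller could then route patches labeled common to operators learned from the external example partitions and handle complex patches separately, for example with an internal method.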