Abstract:
Constraining memory use for overlapping virtual memory operations is described. The memory use is constrained to prevent memory from exceeding an operational threshold, e.g., in relation to operations for modifying content. These operations are implemented according to algorithms having a plurality of instructions. Before the instructions are performed in relation to the content, virtual memory is allocated to the content data, which is then loaded into the virtual memory and is also partitioned into data portions. In the context of the described techniques, at least one of the instructions affects multiple portions of the content data loaded in virtual memory. When this occurs, the instruction is carried out, in part, by transferring the multiple portions of content data between the virtual memory and a memory such that a number of portions of the content data in the memory is constrained to the memory that is reserved for the operation.
Abstract:
Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.
Abstract:
Techniques for image captioning with word vector representations are described. In implementations, instead of outputting results of caption analysis directly, the framework is adapted to output points in a semantic word vector space. These word vector representations reflect distance values in the context of the semantic word vector space. In this approach, words are mapped into a vector space and the results of caption analysis are expressed as points in the vector space that capture semantics between words. In the vector space, similar concepts with have small distance values. The word vectors are not tied to particular words or a single dictionary. A post-processing step is employed to map the points to words and convert the word vector representations to captions. Accordingly, conversion is delayed to a later stage in the process.
Abstract:
Techniques for image captioning with weak supervision are described herein. In implementations, weak supervision data regarding a target image is obtained and utilized to provide detail information that supplements global image concepts derived for image captioning. Weak supervision data refers to noisy data that is not closely curated and may include errors. Given a target image, weak supervision data for visually similar images may be collected from sources of weakly annotated images, such as online social networks. Generally, images posted online include “weak” annotations in the form of tags, titles, labels, and short descriptions added by users. Weak supervision data for the target image is generated by extracting keywords for visually similar images discovered in the different sources. The keywords included in the weak supervision data are then employed to modulate weights applied for probabilistic classifications during image captioning analysis.
Abstract:
Font recognition and similarity determination techniques and systems are described. In a first example, localization techniques are described to train a model using machine learning (e.g., a convolutional neural network) using training images. The model is then used to localize text in a subsequently received image, and may do so automatically and without user intervention, e.g., without specifying any of the edges of a bounding box. In a second example, a deep neural network is directly learned as an embedding function of a model that is usable to determine font similarity. In a third example, techniques are described that leverage attributes described in metadata associated with fonts as part of font recognition and similarity determinations.
Abstract:
Systems and techniques for identification of fonts include receiving a selection of an area of an image including text, where the selection is received from within an application. The selected area of the image is input to a font matching module within the application. The font matching module identifies one or more fonts similar to the text in the selected area using a convolutional neural network. The one or more fonts similar to the text are displayed within the application and the selection and use of the one or more fonts is enabled within the application.
Abstract:
Font recognition and similarity determination techniques and systems are described. For example, a computing device receives an image including a font and extracts font features corresponding to the font. The computing device computes font feature distances between the font and fonts from a set of training fonts. The computing device calculates, based on the font feature distances, similarity scores for the font and the training fonts used for calculating features distances. The computing device determines, based on the similarity scores, final similarity scores for the font relative to the training fonts.
Abstract:
Smoothing images using machine learning is described. In one or more embodiments, a machine learning system is trained using multiple training items. Each training item includes a boundary shape representation and a positional indicator. To generate the training item, a smooth image is downscaled to produce a corresponding blocky image that includes multiple blocks. For a given block, the boundary shape representation encodes a blocky boundary in a neighborhood around the given block. The positional indicator reflects a distance between the given block and a smooth boundary of the smooth image. In one or more embodiments to smooth a blocky image, a boundary shape representation around a selected block is determined. The representation is encoded as a feature vector and applied to the machine learning system to obtain a positional indicator. The positional indicator is used to compute a location of a smooth boundary of a smooth image.
Abstract:
A system and method for distributed similarity learning for high-dimensional image features are described. A set of data features is accessed. Subspaces from a space formed by the set of data features are determined using a set of projection matrices. Each subspace has a dimension lower than a dimension of the set of data features. Similarity functions are computed for the subspaces. Each similarity function is based on the dimension of the corresponding subspace. A linear combination of the similarity functions is performed to determine a similarity function for the set of data features.
Abstract:
Font replacement based on visual similarity is described. In one or more embodiments, a font descriptor includes multiple font features derived from a visual appearance of a font by a font visual similarity model. The font visual similarity model can be trained using a machine learning system that recognizes similarity between visual appearances of two different fonts. A source computing device embeds a font descriptor in a document, which is transmitted to a destination computing device. The destination compares the embedded font descriptor to font descriptors corresponding to local fonts. Based on distances between the embedded and the local font descriptors, at least one matching font descriptor is determined. The local font corresponding to the matching font descriptor is deemed similar to the original font. The destination computing device controls presentations of the document using the similar local font. Computation of font descriptors can be outsourced to a remote location.