Training text recognition systems

    Publication No.: US11810374B2

    Publication Date: 2023-11-07

    Application No.: US17240097

    Filing Date: 2021-04-26

    Applicant: Adobe Inc.

    Abstract: In implementations of recognizing text in images, text recognition systems are trained using noisy images that have nuisance factors applied, and corresponding clean images (e.g., without nuisance factors). Clean images serve as supervision at both feature and pixel levels, so that text recognition systems are trained to be feature invariant (e.g., by requiring features extracted from a noisy image to match features extracted from a clean image), and feature complete (e.g., by requiring that features extracted from a noisy image be sufficient to generate a clean image). Accordingly, text recognition systems generalize to text not included in training images, and are robust to nuisance factors. Furthermore, since clean images are provided as supervision at feature and pixel levels, training requires fewer training images than text recognition systems that are not trained with a supervisory clean image, thus saving time and resources.
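The two supervision levels described in this abstract can be illustrated with a toy training loss. This is a minimal sketch, not the patented system: the single-matrix encoder/decoder and the additive-noise "nuisance factor" are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy encoder/decoder: single linear maps standing in for a
# text-recognition feature extractor and a clean-image generator.
W_enc = rng.standard_normal((16, 64)) * 0.1  # 64-pixel image -> 16-dim features
W_dec = rng.standard_normal((64, 16)) * 0.1  # features -> reconstructed image

def encode(img):
    return W_enc @ img

def decode(feat):
    return W_dec @ feat

def supervision_loss(noisy_img, clean_img):
    """Clean-image supervision at both levels from the abstract:
    feature invariance (features of the noisy image must match features of
    the clean image) plus feature completeness (features of the noisy image
    must suffice to regenerate the clean image)."""
    f_noisy = encode(noisy_img)
    f_clean = encode(clean_img)
    invariance = np.mean((f_noisy - f_clean) ** 2)               # feature level
    completeness = np.mean((decode(f_noisy) - clean_img) ** 2)   # pixel level
    return invariance + completeness

clean = rng.random(64)
noisy = clean + 0.05 * rng.standard_normal(64)  # nuisance factors as additive noise
loss = supervision_loss(noisy, clean)
```

Minimizing both terms jointly is what pushes the extracted features to be both invariant to the noise and complete enough to reconstruct the clean image.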

    Digital content interaction prediction and training that addresses imbalanced classes

    公开(公告)号:US11676060B2

    公开(公告)日:2023-06-13

    申请号:US15002206

    申请日:2016-01-20

    Applicant: Adobe Inc.

    CPC classification number: G06N20/00 G06Q30/02

    Abstract: Digital content interaction prediction and training techniques that address imbalanced classes are described. In one or more implementations, a digital medium environment is described to predict user interaction with digital content while addressing an imbalance between the numbers of samples in first and second classes of the training data used to train a model using machine learning. The training data describing the first class and the second class is received. A model is trained using machine learning. The training includes sampling the training data to include at least one subset of the training data from the first class and at least one subset from the second class. Batches are iteratively selected from the sampled training data and iteratively processed by a classifier, implemented using machine learning, to train the model.
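The sampling step described above can be sketched as a class-balanced batch sampler. This is an illustrative reading, not the patented method: the specific 50/50 split and minority oversampling are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical imbalanced training data: class 0 (no interaction) vastly
# outnumbers class 1 (interaction with the digital content).
labels = np.array([0] * 990 + [1] * 10)
idx_class0 = np.flatnonzero(labels == 0)
idx_class1 = np.flatnonzero(labels == 1)

def sample_balanced_batch(batch_size=32):
    """Draw a batch containing a subset from each class, so the minority
    class is represented in every iteratively selected batch (a sketch of
    the abstract's sampling step under assumed proportions)."""
    half = batch_size // 2
    picked0 = rng.choice(idx_class0, size=half, replace=False)
    picked1 = rng.choice(idx_class1, size=half, replace=True)  # oversample minority
    batch = np.concatenate([picked0, picked1])
    rng.shuffle(batch)
    return batch

batch = sample_balanced_batch()
minority_fraction = np.mean(labels[batch] == 1)
```

A classifier trained on such batches sees both classes every iteration, instead of almost exclusively the majority class.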

    Generating responses to queries about videos utilizing a multi-modal neural network with attention

    公开(公告)号:US11615308B2

    公开(公告)日:2023-03-28

    申请号:US17563901

    申请日:2021-12-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for generating a response to a question received from a user during display or playback of a video segment by utilizing a query-response-neural network. The disclosed systems can extract a query vector from a question corresponding to the video segment using the query-response-neural network. The disclosed systems further generate context vectors representing both visual cues and transcript cues corresponding to the video segment using context encoders or other layers from the query-response-neural network. By utilizing additional layers from the query-response-neural network, the disclosed systems generate (i) a query-context vector based on the query vector and the context vectors, and (ii) candidate-response vectors representing candidate responses to the question from a domain-knowledge base or other source. To respond to a user's question, the disclosed systems further select a response from the candidate responses based on a comparison of the query-context vector and the candidate-response vectors.
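The final selection step, comparing the query-context vector against candidate-response vectors, can be sketched as follows. The vectors here are random stand-ins for the network's outputs, and cosine similarity is an assumed comparison (the abstract only says the vectors are compared).

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8

# Hypothetical outputs of the query-response-neural network: one fused
# query-context vector and one candidate-response vector per candidate.
query_context = rng.standard_normal(dim)
candidates = ["rewind 10 seconds", "open the chapter menu", "enable captions"]
candidate_vecs = rng.standard_normal((len(candidates), dim))

def select_response(qc_vec, cand_vecs, cand_texts):
    """Pick the candidate whose vector is most similar to the
    query-context vector (cosine similarity assumed)."""
    qc = qc_vec / np.linalg.norm(qc_vec)
    cv = cand_vecs / np.linalg.norm(cand_vecs, axis=1, keepdims=True)
    scores = cv @ qc                 # cosine similarity per candidate
    best = int(np.argmax(scores))
    return cand_texts[best], scores

response, scores = select_response(query_context, candidate_vecs, candidates)
```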

    DETERMINING FINE-GRAIN VISUAL STYLE SIMILARITIES FOR DIGITAL IMAGES BY EXTRACTING STYLE EMBEDDINGS DISENTANGLED FROM IMAGE CONTENT

    公开(公告)号:US20220092108A1

    公开(公告)日:2022-03-24

    申请号:US17025041

    申请日:2020-09-18

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media for accurately and flexibly identifying digital images with similar style to a query digital image using fine-grain style determination via weakly supervised style extraction neural networks. For example, the disclosed systems can extract a style embedding from a query digital image using a style extraction neural network such as a novel two-branch autoencoder architecture or a weakly supervised discriminative neural network. The disclosed systems can generate a combined style embedding by combining complementary style embeddings from different style extraction neural networks. Moreover, the disclosed systems can search a repository of digital images to identify digital images with similar style to the query digital image. The disclosed systems can also learn parameters for one or more style extraction neural networks through weakly supervised training without a specifically labeled style ontology for sample digital images.
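The combination and search steps above can be sketched as follows. Normalize-and-concatenate is one plausible reading of "combined style embedding", and the random vectors stand in for the two extraction networks' outputs; none of this is the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def combine_style_embeddings(emb_a, emb_b):
    """Combine complementary embeddings from two style extractors (e.g. the
    two-branch autoencoder and the discriminative network) by L2-normalizing
    each branch and concatenating -- an assumed combination scheme."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return np.concatenate([a, b])

# Hypothetical 16-dim embeddings for a query image and a 100-image repository.
query = combine_style_embeddings(rng.standard_normal(16), rng.standard_normal(16))
repository = np.stack([
    combine_style_embeddings(rng.standard_normal(16), rng.standard_normal(16))
    for _ in range(100)
])

# Style search: rank repository images by cosine similarity to the query.
sims = (repository / np.linalg.norm(repository, axis=1, keepdims=True)) @ (
    query / np.linalg.norm(query))
top5 = np.argsort(sims)[::-1][:5]  # indices of the 5 most style-similar images
```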
