Initializing a learned latent vector for neural-network projections of diverse images

    Publication number: US11893717B2

    Publication date: 2024-02-06

    Application number: US17187080

    Application date: 2021-02-26

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that can learn or identify a learned-initialization-latent vector for an initialization digital image and reconstruct a target digital image using an image-generating-neural network based on a modified version of the learned-initialization-latent vector. For example, the disclosed systems learn a learned-initialization-latent vector from an initialization image utilizing a high number (e.g., thousands) of learning iterations on an image-generating-neural network (e.g., a GAN). Then, the disclosed systems can modify the learned-initialization-latent vector (of the initialization image) to generate modified or reconstructed versions of target images using the image-generating-neural network. For instance, the disclosed systems utilize the learned-initialization-latent vector as a starting point to learn a learned-latent vector for a target image that an image-generating-neural network converts into a high-fidelity reconstruction of the target image (with a reduced number of learning iterations).
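The warm-start idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: a random linear map `G` stands in for the image-generating neural network, squared error is the reconstruction loss, plain gradient descent is the learning procedure, and the helper name `project` is hypothetical. The sketch learns a latent for an initialization image with many iterations, then reuses it as the starting point for a nearby target image with only a few iterations.

```python
import numpy as np

def project(generator, target, z_start, steps, lr):
    """Gradient descent on ||G z - target||^2 for a linear toy generator G (assumption)."""
    z = z_start.copy()
    for _ in range(steps):
        residual = generator @ z - target
        z -= lr * 2.0 * generator.T @ residual  # gradient of the squared error
    return z, float(np.sum((generator @ z - target) ** 2))

rng = np.random.default_rng(0)
G = rng.normal(size=(64, 8))          # toy stand-in for an image-generating network
lr = 0.5 / np.linalg.norm(G, 2) ** 2  # step size small enough for stable descent

# Hypothetical data: an initialization "image" and a target that lies close to it.
init_image = G @ rng.normal(size=8)
target_image = init_image + 0.02 * (G @ rng.normal(size=8))

# Many iterations once, to learn the initialization latent.
z_init, _ = project(G, init_image, np.zeros(8), steps=2000, lr=lr)

# Few iterations per target: warm start from z_init versus cold start from zeros.
_, warm_loss = project(G, target_image, z_init, steps=5, lr=lr)
_, cold_loss = project(G, target_image, np.zeros(8), steps=5, lr=lr)
print(f"warm-start loss: {warm_loss:.3e}, cold-start loss: {cold_loss:.3e}")
```

In this toy setting the warm start begins much closer to the target's latent, so the same small iteration budget typically ends at a lower reconstruction loss. A real image-generating network is nonlinear, so the sketch shows only the shape of the procedure, not its actual behavior.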

    DOMAIN ALIGNMENT FOR OBJECT DETECTION DOMAIN ADAPTATION TASKS

    Publication number: US20210312232A1

    Publication date: 2021-10-07

    Application number: US16885168

    Application date: 2020-05-27

    Applicant: Adobe Inc.

    Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.

    Utilizing a generative neural network to interactively create and modify digital images based on natural language feedback

    Publication number: US12148119B2

    Publication date: 2024-11-19

    Application number: US17576091

    Application date: 2022-01-14

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.

    Fact correction of natural language sentences using data tables

    Publication number: US11880655B2

    Publication date: 2024-01-23

    Application number: US17724349

    Application date: 2022-04-19

    Applicant: Adobe Inc.

    CPC classification number: G06F40/284 G06F16/24535 G06F40/226 G06N20/20

    Abstract: Embodiments are disclosed for performing fact correction of natural language sentences using data tables. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input sentence, tokenizing elements of the input sentence, and identifying, by a first machine learning model, a data table associated with the input sentence. The systems and methods further comprise a second machine learning model identifying a tokenized element of the input sentence that renders the input sentence false based on the data table and masking the tokenized element of the tokenized input sentence that renders the input sentence false. The systems and methods further include a third machine learning model predicting a new value for the masked tokenized element based on the input sentence with the masked tokenized element and the identified data table and providing an output including a modified input sentence with the new value.

    INITIALIZING A LEARNED LATENT VECTOR FOR NEURAL-NETWORK PROJECTIONS OF DIVERSE IMAGES

    Publication number: US20220277431A1

    Publication date: 2022-09-01

    Application number: US17187080

    Application date: 2021-02-26

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that can learn or identify a learned-initialization-latent vector for an initialization digital image and reconstruct a target digital image using an image-generating-neural network based on a modified version of the learned-initialization-latent vector. For example, the disclosed systems learn a learned-initialization-latent vector from an initialization image utilizing a high number (e.g., thousands) of learning iterations on an image-generating-neural network (e.g., a GAN). Then, the disclosed systems can modify the learned-initialization-latent vector (of the initialization image) to generate modified or reconstructed versions of target images using the image-generating-neural network. For instance, the disclosed systems utilize the learned-initialization-latent vector as a starting point to learn a learned-latent vector for a target image that an image-generating-neural network converts into a high-fidelity reconstruction of the target image (with a reduced number of learning iterations).

    UTILIZING A GENERATIVE NEURAL NETWORK TO INTERACTIVELY CREATE AND MODIFY DIGITAL IMAGES BASED ON NATURAL LANGUAGE FEEDBACK

    Publication number: US20230230198A1

    Publication date: 2023-07-20

    Application number: US17576091

    Application date: 2022-01-14

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.

    ENHANCED DOCUMENT VISUAL QUESTION ANSWERING SYSTEM VIA HIERARCHICAL ATTENTION

    Publication number: US20230153531A1

    Publication date: 2023-05-18

    Application number: US17528972

    Application date: 2021-11-17

    Applicant: ADOBE INC.

    CPC classification number: G06F40/284 G06F16/24526 G06N3/04

    Abstract: Systems and methods for performing Document Visual Question Answering tasks are described. A document and query are received. The document encodes document tokens and the query encodes query tokens. The document is segmented into nested document sections, lines, and tokens. A nested structure of tokens is generated based on the segmented document. A feature vector for each token is generated. A graph structure is generated based on the nested structure of tokens. Each graph node corresponds to the query, a document section, a line, or a token. The node connections correspond to the nested structure. Each node is associated with the feature vector for the corresponding object. A graph attention network is employed to generate another embedding for each node. These embeddings are employed to identify a portion of the document that includes a response to the query. An indication of the identified portion of the document is provided.
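The graph construction described in this abstract can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `build_graph`, the list-of-lists input format, and the choice to link the query node to every section node are all assumptions, since the abstract does not say how the query node is connected. Feature vectors and the graph attention network are omitted.

```python
def build_graph(query, sections):
    """Build nodes and edges from a nested document.

    sections: list of sections; each section is a list of lines;
    each line is a list of token strings (hypothetical input format).
    """
    nodes = [("query", query)]  # node 0 is the query
    edges = []
    for section in sections:
        s_id = len(nodes)
        nodes.append(("section", None))
        edges.append((0, s_id))  # assumption: the query links to each section
        for line in section:
            l_id = len(nodes)
            nodes.append(("line", None))
            edges.append((s_id, l_id))  # section contains line
            for token in line:
                t_id = len(nodes)
                nodes.append(("token", token))
                edges.append((l_id, t_id))  # line contains token
    return nodes, edges

# Hypothetical two-section document.
doc = [[["Invoice", "total"], ["42", "USD"]], [["Thank", "you"]]]
nodes, edges = build_graph("What is the total?", doc)
print(len(nodes), "nodes,", len(edges), "edges")
```

In a fuller version each node would also carry a feature vector, and the edge list would be handed to a graph attention layer to produce the per-node embeddings the abstract mentions.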
