-
Publication No.: US11893717B2
Publication date: 2024-02-06
Application No.: US17187080
Filing date: 2021-02-26
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Vlad Morariu , Michael Brodie
IPC: G06T5/50 , G06N3/08 , G06N3/04 , G06F18/214
CPC classification number: G06T5/50 , G06F18/214 , G06N3/04 , G06N3/08 , G06T2207/20081 , G06T2207/20084
Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that can learn or identify a learned-initialization-latent vector for an initialization digital image and reconstruct a target digital image using an image-generating-neural network based on a modified version of the learned-initialization-latent vector. For example, the disclosed systems learn a learned-initialization-latent vector from an initialization image utilizing a high number (e.g., thousands) of learning iterations on an image-generating-neural network (e.g., a GAN). Then, the disclosed systems can modify the learned-initialization-latent vector (of the initialization image) to generate modified or reconstructed versions of target images using the image-generating-neural network. For instance, the disclosed systems utilize the learned-initialization-latent vector as a starting point to learn a learned-latent vector for a target image that an image-generating-neural network converts into a high-fidelity reconstruction of the target image (with a reduced number of learning iterations).
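The warm-start idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the "image-generating network" is replaced by a fixed linear map, and the matrix `W`, the toy images, the learning rate, and the tolerance are all assumptions chosen only so the example runs. It shows the general pattern of learning a latent vector once for an initialization image and then reusing it as the starting point for a target image.

```python
# Toy stand-in for an image-generating network: a fixed linear map G(z) = W z.
# W, the images, and every hyperparameter below are illustrative assumptions.
W = [[1.0, 0.5],
     [0.2, 1.0],
     [0.3, 0.1]]

def generate(z):
    """Map a 2-d latent vector to a 3-pixel 'image'."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def loss(z, target):
    """Squared reconstruction error between G(z) and the target image."""
    return sum((g - t) ** 2 for g, t in zip(generate(z), target))

def fit_latent(z0, target, lr=0.1, tol=1e-8, max_iters=10_000):
    """Gradient descent on the latent vector; returns (latent, iterations used)."""
    z = list(z0)
    for step in range(max_iters):
        if loss(z, target) < tol:
            return z, step
        residual = [g - t for g, t in zip(generate(z), target)]
        # Gradient of the squared error: 2 * W^T (W z - target)
        grad = [2 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z, max_iters

# Hypothetical initialization image and a nearby target image.
init_image = generate([2.0, -1.0])
target_image = generate([2.1, -0.9])

# Expensive step, done once: learn the initialization latent from scratch.
init_latent, init_iters = fit_latent([0.0, 0.0], init_image)

# Cheap step, per target: start from the learned initialization latent.
warm_latent, warm_iters = fit_latent(init_latent, target_image)

# Baseline for comparison: fit the same target from scratch.
cold_latent, cold_iters = fit_latent([0.0, 0.0], target_image)

print(f"warm start: {warm_iters} iterations, cold start: {cold_iters} iterations")
```

In this toy setting the warm start needs fewer iterations because the target's latent lies close to the initialization latent. Whether that holds for a real generator depends on how similar the target is to the initialization image.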
-
Publication No.: US20240028972A1
Publication date: 2024-01-25
Application No.: US17815448
Filing date: 2022-07-27
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Nikolaos Barmpalios , Sruthi Madapoosi Ravi , Ruchi Deshpande , Varun Manjunatha , Smitha Bangalore Naresh , Priyank Mathur , Oghenetegiri Sido
CPC classification number: G06N20/20 , G06K9/6262 , G06K9/6256
Abstract: Techniques for training for and determining a confidence of an output of a machine learning model are disclosed. Such techniques include, in some embodiments, receiving, from the machine learning model configured to receive information associated with a data object, information associated with a predicted structure for the data object; encoding, using a second machine learning model, the information associated with the predicted structure for the data object to produce encoded input channels; evaluating, using the second machine learning model, the information associated with the data object with the encoded input channels; and based on the evaluating, determining, using the second machine learning model, a probability of correctness of the predicted structure for the data object.
-
Publication No.: US20210312232A1
Publication date: 2021-10-07
Application No.: US16885168
Filing date: 2020-05-27
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Vlad Ion Morariu , Varun Manjunatha , Tong Sun , Nikolaos Barmpalios , Kai Li , Handong Zhao , Curtis Wigington
Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.
-
Publication No.: US12148119B2
Publication date: 2024-11-19
Application No.: US17576091
Filing date: 2022-01-14
Applicant: Adobe Inc.
Inventor: Ruiyi Zhang , Yufan Zhou , Christopher Tensmeyer , Jiuxiang Gu , Tong Yu , Tong Sun
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.
-
Publication No.: US11880655B2
Publication date: 2024-01-23
Application No.: US17724349
Filing date: 2022-04-19
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Danilo Neves Ribeiro , Varun Manjunatha , Nedim Lipka , Ani Nenkova
IPC: G06F40/226 , G06F40/284 , G06F40/30 , G06N20/00 , G06F16/2453 , G06N20/20
CPC classification number: G06F40/284 , G06F16/24535 , G06F40/226 , G06N20/20
Abstract: Embodiments are disclosed for performing fact correction of natural language sentences using data tables. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input sentence, tokenizing elements of the input sentence, and identifying, by a first machine learning model, a data table associated with the input sentence. The systems and methods further comprise a second machine learning model identifying a tokenized element of the input sentence that renders the input sentence false based on the data table and masking the tokenized element of the tokenized input sentence that renders the input sentence false. The systems and methods further include a third machine learning model predicting a new value for the masked tokenized element based on the input sentence with the masked tokenized element and the identified data table and providing an output including a modified input sentence with the new value.
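The pipeline in this abstract (tokenize, select a table, find the false token, mask it, predict a replacement) can be sketched as follows. The three machine learning models are replaced here by simple rule-based stand-ins, and the tables, the `[MASK]` marker, and all function names are hypothetical. The sketch only illustrates how data flows between the stages.

```python
MASK = "[MASK]"

# Hypothetical data tables; names and contents are illustrative only.
TABLES = {
    "cities": {"columns": ["city", "country", "river"],
               "rows": [["Paris", "France", "Seine"],
                        ["Rome", "Italy", "Tiber"]]},
    "planets": {"columns": ["planet", "moons"],
                "rows": [["Mars", "2"], ["Earth", "1"]]},
}

def tokenize(sentence):
    """Whitespace tokenization; a real system would use a learned tokenizer."""
    return sentence.rstrip(".").split()

def select_table(tokens):
    """Stand-in for the first model: pick the table sharing the most cell values."""
    def overlap(table):
        cells = {cell for row in table["rows"] for cell in row}
        return len(cells & set(tokens))
    return max(TABLES.values(), key=overlap)

def find_false_token(tokens, table):
    """Stand-in for the second model: return (token index, supporting row, column)."""
    # The row that matches the most tokens is treated as the supporting evidence.
    row = max(table["rows"], key=lambda r: len(set(r) & set(tokens)))
    for col, expected in enumerate(row):
        if expected in tokens:
            continue
        # A token holding a different value from the same column contradicts the row.
        column_values = {r[col] for r in table["rows"]}
        for i, tok in enumerate(tokens):
            if tok in column_values:
                return i, row, col
    return None

def correct(sentence):
    tokens = tokenize(sentence)
    table = select_table(tokens)
    found = find_false_token(tokens, table)
    if found is None:
        return sentence  # nothing in the sentence contradicts the table
    index, row, col = found
    masked = tokens[:index] + [MASK] + tokens[index + 1:]
    # Stand-in for the third model: fill the mask from the supporting row.
    filled = [row[col] if tok == MASK else tok for tok in masked]
    return " ".join(filled)

print(correct("Paris in France lies on the Tiber"))
```

With the toy tables above, the false token `Tiber` is masked and replaced by `Seine`, while a sentence that agrees with the table is returned unchanged.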
-
Publication No.: US11544503B2
Publication date: 2023-01-03
Application No.: US16885168
Filing date: 2020-05-27
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Vlad Ion Morariu , Varun Manjunatha , Tong Sun , Nikolaos Barmpalios , Kai Li , Handong Zhao , Curtis Wigington
Abstract: A domain alignment technique for cross-domain object detection tasks is introduced. During a preliminary pretraining phase, an object detection model is pretrained to detect objects in images associated with a source domain using a source dataset of images associated with the source domain. After completing the pretraining phase, a domain adaptation phase is performed using the source dataset and a target dataset to adapt the pretrained object detection model to detect objects in images associated with the target domain. The domain adaptation phase may involve the use of various domain alignment modules that, for example, perform multi-scale pixel/path alignment based on input feature maps or perform instance-level alignment based on input region proposals.
-
Publication No.: US20220277431A1
Publication date: 2022-09-01
Application No.: US17187080
Filing date: 2021-02-26
Applicant: Adobe Inc.
Inventor: Christopher Tensmeyer , Vlad Morariu , Michael Brodie
Abstract: This disclosure describes one or more embodiments of systems, non-transitory computer-readable media, and methods that can learn or identify a learned-initialization-latent vector for an initialization digital image and reconstruct a target digital image using an image-generating-neural network based on a modified version of the learned-initialization-latent vector. For example, the disclosed systems learn a learned-initialization-latent vector from an initialization image utilizing a high number (e.g., thousands) of learning iterations on an image-generating-neural network (e.g., a GAN). Then, the disclosed systems can modify the learned-initialization-latent vector (of the initialization image) to generate modified or reconstructed versions of target images using the image-generating-neural network. For instance, the disclosed systems utilize the learned-initialization-latent vector as a starting point to learn a learned-latent vector for a target image that an image-generating-neural network converts into a high-fidelity reconstruction of the target image (with a reduced number of learning iterations).
-
Publication No.: US20230376687A1
Publication date: 2023-11-23
Application No.: US17746779
Filing date: 2022-05-17
Applicant: ADOBE INC.
Inventor: Vlad Ion Morariu , Tong Sun , Nikolaos Barmpalios , Zilong Wang , Jiuxiang Gu , Ani Nenkova Nenkova , Christopher Tensmeyer
IPC: G06F40/279 , G06N5/02
CPC classification number: G06F40/279 , G06N5/022
Abstract: Embodiments are provided for facilitating multimodal extraction across multiple granularities. In one implementation, a set of features of a document for a plurality of granularities of the document is obtained. Via a machine learning model, the set of features of the document are modified to generate a set of modified features using a set of self-attention values to determine relationships within a first type of feature and a set of cross-attention values to determine relationships between the first type of feature and a second type of feature. Thereafter, the set of modified features are provided to a second machine learning model to perform a classification task.
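The distinction between self-attention and cross-attention mentioned in this abstract can be sketched with plain scaled dot-product attention. This is a simplification under stated assumptions: there are no learned projections, the two feature sets (`text_features`, `layout_features`) are made-up values, and summing the two attention outputs is only one plausible way to form the "modified features". The abstract does not specify any of these details.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys_values):
    """Scaled dot-product attention without learned projections (a simplification)."""
    dim = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of the attended vectors.
        outputs.append([sum(wi * kv[d] for wi, kv in zip(w, keys_values))
                        for d in range(dim)])
    return outputs, weights

# Hypothetical features: a "first type" (e.g. text) and a "second type" (e.g. layout).
text_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
layout_features = [[0.5, 0.5], [1.0, -1.0]]

# Self-attention: relationships within the first feature type.
self_out, self_w = attend(text_features, text_features)

# Cross-attention: relationships between the first and second feature types.
cross_out, cross_w = attend(text_features, layout_features)

# Assumption: combine both views by summation to obtain the modified features.
modified = [[s + c for s, c in zip(s_row, c_row)]
            for s_row, c_row in zip(self_out, cross_out)]
```

The `modified` vectors would then be passed to a downstream classifier, which is omitted here.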
-
Publication No.: US20230230198A1
Publication date: 2023-07-20
Application No.: US17576091
Filing date: 2022-01-14
Applicant: Adobe Inc.
Inventor: Ruiyi Zhang , Yufan Zhou , Christopher Tensmeyer , Jiuxiang Gu , Tong Yu , Tong Sun
CPC classification number: G06T3/0056 , G06T11/00 , G10L15/22 , G10L15/26 , G06N3/04 , G10L2015/223
Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a neural network framework for interactive multi-round image generation from natural language inputs. Specifically, the disclosed systems provide an intelligent framework (i.e., a text-based interactive image generation model) that facilitates a multi-round image generation and editing workflow that comports with arbitrary input text and synchronous interaction. In particular embodiments, the disclosed systems utilize natural language feedback for conditioning a generative neural network that performs text-to-image generation and text-guided image modification. For example, the disclosed systems utilize a trained model to inject textual features from natural language feedback into a unified joint embedding space for generating text-informed style vectors. In turn, the disclosed systems can generate an image with semantically meaningful features that map to the natural language feedback. Moreover, the disclosed systems can persist these semantically meaningful features throughout a refinement process and across generated images.
-
Publication No.: US20230153531A1
Publication date: 2023-05-18
Application No.: US17528972
Filing date: 2021-11-17
Applicant: ADOBE INC.
Inventor: Shijie Geng , Christopher Tensmeyer , Curtis Michael Wigington , Jiuxiang Gu
IPC: G06F40/284 , G06N3/04 , G06F16/2452
CPC classification number: G06F40/284 , G06F16/24526 , G06N3/04
Abstract: Systems and methods for performing Document Visual Question Answering tasks are described. A document and query are received. The document encodes document tokens and the query encodes query tokens. The document is segmented into nested document sections, lines, and tokens. A nested structure of tokens is generated based on the segmented document. A feature vector for each token is generated. A graph structure is generated based on the nested structure of tokens. Each graph node corresponds to the query, a document section, a line, or a token. The node connections correspond to the nested structure. Each node is associated with the feature vector for the corresponding object. A graph attention network is employed to generate another embedding for each node. These embeddings are employed to identify a portion of the document that includes a response to the query. An indication of the identified portion of the document is provided.
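The graph construction described in this abstract can be sketched as below. The sample document, the query, and the choice to link the query node to every section node are assumptions made for illustration. The abstract states only that connections follow the nested structure. Feature vectors and the graph attention network are omitted.

```python
# Hypothetical segmented document: sections -> lines -> tokens.
document = [
    [["Invoice", "No.", "42"], ["Date:", "2021-01-05"]],
    [["Total", "due:", "$10"]],
]

def build_graph(query, sections):
    """Return (nodes, edges) mirroring the section/line/token nesting."""
    nodes = [("query", query)]  # node 0 is the query
    edges = []
    for s_idx, section in enumerate(sections):
        s_id = len(nodes)
        nodes.append(("section", s_idx))
        edges.append((0, s_id))  # assumption: the query connects to each section
        for line in section:
            l_id = len(nodes)
            nodes.append(("line", " ".join(line)))
            edges.append((s_id, l_id))  # section contains line
            for token in line:
                t_id = len(nodes)
                nodes.append(("token", token))
                edges.append((l_id, t_id))  # line contains token
    return nodes, edges

nodes, edges = build_graph("What is the total due?", document)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each node would additionally carry a feature vector, and a graph attention network would pass information along `edges` to score which section, line, or token answers the query.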