-
Publication number: US20210232850A1
Publication date: 2021-07-29
Application number: US16750478
Filing date: 2020-01-23
Applicant: Adobe Inc.
Inventor: Trung Huu Bui , Zhe Lin , Hao Tan , Franck Dernoncourt , Mohit Bansal
Abstract: In implementations of generating descriptions of image relationships, a computing device implements a description system which receives a source digital image and a target digital image. The description system generates a source feature sequence from the source digital image and a target feature sequence from the target digital image. A visual relationship between the source digital image and the target digital image is determined by using cross-attention between the source feature sequence and the target feature sequence. The system generates a description of a visual transformation between the source digital image and the target digital image based on the visual relationship.
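The cross-attention step the abstract describes can be illustrated with a minimal NumPy sketch: each source feature attends over all target features via scaled dot-product similarity. The function name `cross_attention` and the dimensions are illustrative assumptions, not details from the patent.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(source_seq, target_seq):
    """Attend from each source feature to all target features.

    source_seq: (m, d) feature sequence from the source image
    target_seq: (n, d) feature sequence from the target image
    Returns an (m, d) aggregation of target features per source feature,
    which downstream layers could use to describe the visual transformation.
    """
    d = source_seq.shape[-1]
    scores = source_seq @ target_seq.T / np.sqrt(d)  # (m, n) similarities
    weights = softmax(scores, axis=-1)               # each row sums to 1
    return weights @ target_seq                      # (m, d)

rng = np.random.default_rng(0)
src = rng.standard_normal((4, 8))  # 4 source-image features
tgt = rng.standard_normal((5, 8))  # 5 target-image features
attended = cross_attention(src, tgt)
```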
-
12.
Publication number: US20200372025A1
Publication date: 2020-11-26
Application number: US16420764
Filing date: 2019-05-23
Applicant: ADOBE INC.
Inventor: Seung-hyun Yoon , Franck Dernoncourt , Trung Huu Bui , Doo Soon Kim , Carl Iwan Dockhorn , Yu Gong
IPC: G06F16/2452 , G06F16/2457 , G06F16/28 , G06F16/248 , G06N20/00
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for techniques for identifying textual similarity and performing answer selection. A textual-similarity computing model can use a pre-trained language model to generate vector representations of a question and a candidate answer from a target corpus. The target corpus can be clustered into latent topics (or other latent groupings), and probabilities of a question or candidate answer being in each of the latent topics can be calculated and condensed (e.g., downsampled) to improve performance and focus on the most relevant topics. The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) so the model can use focused topical and other categorical information as auxiliary information in a similarity computation. In training, transfer learning may be applied from a large-scale corpus, and the conventional list-wise approach can be replaced with point-wise learning.
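The condensation (downsampling) of topic probabilities described above can be sketched as keeping only the k most probable latent topics and renormalizing; the helper name `condense_topic_probs` and the choice of k are illustrative assumptions, not the patented procedure.

```python
import numpy as np

def condense_topic_probs(probs, k=3):
    """Keep only the k most probable latent topics and renormalize.

    probs: 1-D array of a question's (or answer's) topic probabilities
    Returns (top_indices, condensed_probs) so downstream layers can
    focus on the most relevant topics as auxiliary information.
    """
    probs = np.asarray(probs, dtype=float)
    top = np.argsort(probs)[::-1][:k]          # indices of the k largest
    condensed = probs[top] / probs[top].sum()  # renormalize to sum to 1
    return top, condensed

topic_probs = np.array([0.05, 0.40, 0.10, 0.30, 0.15])
idx, condensed = condense_topic_probs(topic_probs, k=3)
```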
-
Publication number: US20190278835A1
Publication date: 2019-09-12
Application number: US15915775
Filing date: 2018-03-08
Applicant: Adobe Inc.
Inventor: Arman Cohan , Walter W. Chang , Trung Huu Bui , Franck Dernoncourt , Doo Soon Kim
Abstract: Techniques are disclosed for an abstractive summarization process for summarizing documents, including long documents. A document is encoded using an encoder-decoder architecture with attentive decoding. In particular, an encoder for modeling documents generates both word-level and section-level representations of a document. A discourse-aware decoder then captures the information flow from all discourse sections of a document. In order to extend the robustness of the generated summarization, a neural attention mechanism considers both word-level and section-level representations of a document. The neural attention mechanism may utilize a set of weights that are applied to the word-level representations and section-level representations.
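One simple way to realize an attention mechanism that weighs both section-level and word-level representations is to scale each word's within-section attention by its section's attention, so a single distribution over all words results. This is a hedged sketch of that idea, not the architecture claimed in the patent.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def hierarchical_attention(section_scores, word_scores_per_section):
    """Combine section-level and word-level attention into one distribution.

    section_scores: (S,) relevance score per discourse section
    word_scores_per_section: list of S arrays of per-word scores
    Each word's final weight is its within-section softmax weight
    scaled by its section's weight, so the result sums to 1 overall.
    """
    sec_w = softmax(np.asarray(section_scores, dtype=float))
    return np.concatenate([
        w * softmax(np.asarray(scores, dtype=float))
        for w, scores in zip(sec_w, word_scores_per_section)
    ])

weights = hierarchical_attention(
    [1.0, 0.0],                       # section 1 is more relevant
    [[0.5, 0.5, 0.1], [2.0, 1.0]],    # word scores inside each section
)
```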
-
14.
Publication number: US12061995B2
Publication date: 2024-08-13
Application number: US16813098
Filing date: 2020-03-09
Applicant: ADOBE INC.
Inventor: Trung Huu Bui , Tong Sun , Natwar Modani , Lidan Wang , Franck Dernoncourt
IPC: G06N7/01 , G06F40/205 , G06F40/279 , G06F40/30 , G06N20/00
CPC classification number: G06N7/01 , G06F40/205 , G06F40/279 , G06F40/30 , G06N20/00
Abstract: Methods for natural language semantic matching performed by training and using a Markov Network model are provided. The trained Markov Network model can be used to identify answers to questions. Training may be performed using question-answer pairs that include labels indicating a correct or incorrect answer to a question. The trained Markov Network model can be used to identify answers to questions from sources stored on a database. The Markov Network model provides superior performance over other semantic matching models, in particular, where the training data set includes a different information domain type relative to the input question or the output answer of the trained Markov Network model.
-
Publication number: US11967128B2
Publication date: 2024-04-23
Application number: US17333583
Filing date: 2021-05-28
Applicant: ADOBE INC.
Inventor: Qiuyu Chen , Quan Hung Tran , Kushal Kafle , Trung Huu Bui , Franck Dernoncourt , Walter Chang
IPC: G06V10/56 , G06F16/51 , G06F16/532 , G06F16/56 , G06F16/583 , G06V10/25 , G06V10/774 , G06V10/82
CPC classification number: G06V10/56 , G06F16/51 , G06F16/532 , G06F16/56 , G06F16/5838 , G06V10/25 , G06V10/774 , G06V10/82 , G06T2207/20081
Abstract: The present disclosure describes a model for large scale color prediction of objects identified in images. Embodiments of the present disclosure include an object detection network, an attention network, and a color classification network. The object detection network generates object features for an object in an image and may include a convolutional neural network (CNN), region proposal network, or a ResNet. The attention network generates an attention vector for the object based on the object features, wherein the attention network takes a query vector based on the object features, and a plurality of key vectors and a plurality of value vectors corresponding to a plurality of colors as input. The color classification network generates a color attribute vector based on the attention vector, wherein the color attribute vector indicates a probability of the object including each of the plurality of colors.
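The attention step described here, a query from object features attending over per-color key and value vectors, can be sketched minimally as follows. The function name, the embedding dimension, and the three-color setup are illustrative assumptions, not the patented network.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def color_attention(query, color_keys, color_values):
    """One attention step over a fixed set of color embeddings.

    query:        (d,) vector derived from detected-object features
    color_keys:   (C, d) one key vector per color
    color_values: (C, d) one value vector per color
    Returns (attention_vector, weights): weights[c] reflects how
    strongly the object attends to color c.
    """
    scores = color_keys @ query / np.sqrt(len(query))
    weights = softmax(scores)
    return weights @ color_values, weights

rng = np.random.default_rng(1)
q = rng.standard_normal(8)
keys = rng.standard_normal((3, 8))  # e.g. red, green, blue
vals = rng.standard_normal((3, 8))
attn_vec, color_weights = color_attention(q, keys, vals)
```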
-
Publication number: US20230403175A1
Publication date: 2023-12-14
Application number: US17806751
Filing date: 2022-06-14
Applicant: ADOBE INC.
Inventor: Tuan Manh Lai , Trung Huu Bui , Doo Soon Kim
IPC: H04L12/18 , G06F40/284 , G06N3/04
CPC classification number: H04L12/1831 , G06F40/284 , G06N3/04
Abstract: Systems and methods for coreference resolution are provided. One aspect of the systems and methods includes inserting a speaker tag into a transcript, wherein the speaker tag indicates that a name in the transcript corresponds to a speaker of a portion of the transcript; encoding a plurality of candidate spans from the transcript based at least in part on the speaker tag to obtain a plurality of span vectors; extracting a plurality of entity mentions from the transcript based on the plurality of span vectors, wherein each of the plurality of entity mentions corresponds to one of the plurality of candidate spans; and generating coreference information for the transcript based on the plurality of entity mentions, wherein the coreference information indicates that a pair of candidate spans of the plurality of candidate spans corresponds to a pair of entity mentions that refer to a same entity.
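The speaker-tag insertion step can be illustrated with a short sketch that prefixes each transcript turn with an explicit tag marking which name is speaking. The `<spk>` tag format and the helper name are assumptions for illustration; the patent does not specify this exact markup.

```python
def insert_speaker_tags(turns):
    """Prefix each transcript turn with an explicit speaker tag.

    turns: list of (speaker_name, utterance) pairs
    Returns one transcript string in which a tag like
    "<spk> Alice </spk>" indicates that the name corresponds to
    the speaker of the following portion of the transcript.
    """
    tagged = []
    for speaker, utterance in turns:
        tagged.append(f"<spk> {speaker} </spk> {utterance}")
    return "\n".join(tagged)

transcript = insert_speaker_tags([
    ("Alice", "I sent the report yesterday."),
    ("Bob", "Thanks, I will review it today."),
])
```

A span encoder can then see the speaker names in-line, which helps link pronouns like "I" back to the correct entity mention.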
-
Publication number: US20230267726A1
Publication date: 2023-08-24
Application number: US17651771
Filing date: 2022-02-18
Applicant: Adobe Inc.
Inventor: Seunghyun Yoon , Trung Huu Bui , Franck Dernoncourt , Hyounghun Kim , Doo Soon Kim
CPC classification number: G06V10/86 , G06V10/82 , G06V10/806 , G06V10/7715 , G06N3/088 , G06N3/0445 , G06F40/284
Abstract: Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The machine learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary, determined based on the visual features, the concept features, the current command features, and the utterance features extracted from the utterance or the image.
-
Publication number: US20230153522A1
Publication date: 2023-05-18
Application number: US17455533
Filing date: 2021-11-18
Applicant: ADOBE INC.
Inventor: Jaemin Cho , Seunghyun Yoon , Ajinkya Gorakhnath Kale , Trung Huu Bui , Franck Dernoncourt
IPC: G06F40/253 , G06K9/62 , G06F16/583
CPC classification number: G06F40/253 , G06K9/6256 , G06K9/6262 , G06F16/583
Abstract: Systems and methods for image captioning are described. One or more aspects of the systems and methods include generating a training caption for a training image using an image captioning network; encoding the training caption using a multi-modal encoder to obtain an encoded training caption; encoding the training image using the multi-modal encoder to obtain an encoded training image; computing a reward function based on the encoded training caption and the encoded training image; and updating parameters of the image captioning network based on the reward function.
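The reward computation described, scoring a generated caption by comparing its encoding with the image's encoding from the same multi-modal encoder, can be sketched as a cosine similarity between the two embeddings. The function names and toy vectors are assumptions for illustration, not the patented reward function.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def caption_reward(caption_embedding, image_embedding):
    """Reward a generated caption by embedding similarity to its image.

    Both arguments are vectors produced by a shared multi-modal encoder;
    a higher cosine similarity means the caption matches the image
    better, and the captioning network is updated to increase it.
    """
    return cosine_similarity(caption_embedding, image_embedding)

good = caption_reward(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
bad = caption_reward(np.array([0.0, 1.0]), np.array([1.0, 0.0]))
```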
-
19.
Publication number: US11651211B2
Publication date: 2023-05-16
Application number: US16717698
Filing date: 2019-12-17
Applicant: Adobe Inc.
Inventor: Tuan Manh Lai , Trung Huu Bui , Quan Hung Tran
CPC classification number: G06N3/08 , G06F40/284 , G06N3/045 , G10L15/16 , G10L25/30
Abstract: Techniques for training a first neural network (NN) model using a pre-trained second NN model are disclosed. In an example, training data is input to the first and second models. The training data includes masked tokens and unmasked tokens. In response, the first model generates a first prediction associated with a masked token and a second prediction associated with an unmasked token, and the second model generates a third prediction associated with the masked token and a fourth prediction associated with the unmasked token. The first model is trained, based at least in part on the first, second, third, and fourth predictions. In another example, a prediction associated with a masked token, a prediction associated with an unmasked token, and a prediction associated with whether two sentences of training data are adjacent sentences are received from each of the first and second models. The first model is trained using the predictions.
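One common way to train a student model on a pre-trained teacher's predictions, as in the setup above, is a soft cross-entropy between the two output distributions at each (masked or unmasked) token position. This sketch shows that loss in isolation; the temperature and function names are illustrative assumptions, not the claimed training procedure.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of student predictions against softened teacher ones.

    Both logit vectors are over the same vocabulary for one token
    position; the temperature softens both distributions so the
    student also learns from the teacher's non-argmax probabilities.
    """
    t = softmax(np.asarray(teacher_logits, dtype=float) / temperature)
    s = softmax(np.asarray(student_logits, dtype=float) / temperature)
    return float(-(t * np.log(s + 1e-12)).sum())

# Loss is smallest when the student reproduces the teacher's logits.
matched = distillation_loss([2.0, 0.1, -1.0], [2.0, 0.1, -1.0])
mismatched = distillation_loss([-1.0, 0.1, 2.0], [2.0, 0.1, -1.0])
```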
-
Publication number: US20220383031A1
Publication date: 2022-12-01
Application number: US17333583
Filing date: 2021-05-28
Applicant: Adobe Inc.
Inventor: Qiuyu Chen , Quan Hung Tran , Kushal Kafle , Trung Huu Bui , Franck Dernoncourt , Walter Chang
IPC: G06K9/46 , G06K9/32 , G06F16/51 , G06F16/583 , G06F16/532 , G06F16/56
Abstract: The present disclosure describes a model for large scale color prediction of objects identified in images. Embodiments of the present disclosure include an object detection network, an attention network, and a color classification network. The object detection network generates object features for an object in an image and may include a convolutional neural network (CNN), region proposal network, or a ResNet. The attention network generates an attention vector for the object based on the object features, wherein the attention network takes a query vector based on the object features, and a plurality of key vectors and a plurality of value vectors corresponding to a plurality of colors as input. The color classification network generates a color attribute vector based on the attention vector, wherein the color attribute vector indicates a probability of the object including each of the plurality of colors.