Generating Descriptions of Image Relationships

    Publication Number: US20210232850A1

    Publication Date: 2021-07-29

    Application Number: US16750478

    Filing Date: 2020-01-23

    Applicant: Adobe Inc.

    Abstract: In implementations of generating descriptions of image relationships, a computing device implements a description system which receives a source digital image and a target digital image. The description system generates a source feature sequence from the source digital image and a target feature sequence from the target digital image. A visual relationship between the source digital image and the target digital image is determined by using cross-attention between the source feature sequence and the target feature sequence. The system generates a description of a visual transformation between the source digital image and the target digital image based on the visual relationship.
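
    The abstract does not give implementation details; the following is a minimal sketch, assuming PyTorch and illustrative dimensions, of how cross-attention between a source and a target feature sequence might produce relationship features for a caption decoder. Module and variable names are hypothetical.

```python
# Hypothetical sketch of cross-attention between a source and a target
# feature sequence; dimensions and module names are illustrative only.
import torch
import torch.nn as nn

class CrossAttentionRelator(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        # Queries come from one image's features, keys/values from the other.
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, source_seq, target_seq):
        # source_seq, target_seq: (batch, seq_len, dim) feature sequences
        # Attend from source features to target features to model the
        # visual relationship between the two images.
        related, _ = self.cross_attn(query=source_seq, key=target_seq,
                                     value=target_seq)
        return self.proj(related)  # relationship features fed to a decoder

# Example usage with random features standing in for CNN outputs.
src = torch.randn(2, 49, 512)   # e.g., a 7x7 spatial grid, flattened
tgt = torch.randn(2, 49, 512)
rel = CrossAttentionRelator()(src, tgt)   # (2, 49, 512)
```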

    ANSWER SELECTION USING A COMPARE-AGGREGATE MODEL WITH LANGUAGE MODEL AND CONDENSED SIMILARITY INFORMATION FROM LATENT CLUSTERING

    Publication Number: US20200372025A1

    Publication Date: 2020-11-26

    Application Number: US16420764

    Filing Date: 2019-05-23

    Applicant: Adobe Inc.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for techniques for identifying textual similarity and performing answer selection. A textual-similarity computing model can use a pre-trained language model to generate vector representations of a question and a candidate answer from a target corpus. The target corpus can be clustered into latent topics (or other latent groupings), and probabilities of a question or candidate answer being in each of the latent topics can be calculated and condensed (e.g., downsampled) to improve performance and focus on the most relevant topics. The condensed probabilities can be aggregated and combined with a downstream vector representation of the question (or answer) so the model can use focused topical and other categorical information as auxiliary information in a similarity computation. In training, transfer learning may be applied from a large-scale corpus, and the conventional list-wise approach can be replaced with point-wise learning.
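
    As a rough illustration of the "condensed" latent-topic step described above, the sketch below clusters a corpus into latent topics, computes membership probabilities for a question embedding, downsamples them to the top-k, and appends them to the text representation. The clustering method, top-k size, and tensor shapes are assumptions, not the patented implementation.

```python
# Illustrative sketch of condensing latent-topic probabilities and
# appending them to a text representation as auxiliary information.
import torch
import torch.nn.functional as F

def condensed_topic_features(text_vec, centroids, k=3):
    # text_vec: (dim,) embedding of a question or candidate answer
    # centroids: (num_topics, dim) latent topic centroids from the target corpus
    sims = F.cosine_similarity(text_vec.unsqueeze(0), centroids, dim=1)
    probs = F.softmax(sims, dim=0)           # topic membership probabilities
    top_probs, _ = probs.topk(k)             # condense to the most relevant topics
    return torch.cat([text_vec, top_probs])  # representation + topical features

centroids = torch.randn(16, 768)             # e.g., 16 latent topics
question_vec = torch.randn(768)              # e.g., from a pre-trained language model
features = condensed_topic_features(question_vec, centroids)  # (771,)
```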

    ABSTRACTIVE SUMMARIZATION OF LONG DOCUMENTS USING DEEP LEARNING

    Publication Number: US20190278835A1

    Publication Date: 2019-09-12

    Application Number: US15915775

    Filing Date: 2018-03-08

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for an abstractive summarization process for summarizing documents, including long documents. A document is encoded using an encoder-decoder architecture with attentive decoding. In particular, an encoder for modeling documents generates both word-level and section-level representations of a document. A discourse-aware decoder then captures the information flow from all discourse sections of a document. In order to extend the robustness of the generated summarization, a neural attention mechanism considers both word-level and section-level representations of a document. The neural attention mechanism may utilize a set of weights that are applied to the word-level representations and section-level representations.
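
    One plausible reading of the combined word-level and section-level weighting is hierarchical attention: word weights are rescaled by the weight of their enclosing section before forming a context vector. The sketch below assumes that reading, with illustrative shapes and function names.

```python
# Hedged sketch of discourse-aware attention that rescales word-level weights
# by section-level weights; not the exact mechanism claimed in the patent.
import torch
import torch.nn.functional as F

def discourse_aware_attention(decoder_state, word_reps, section_reps, section_ids):
    # decoder_state: (dim,); word_reps: (num_words, dim)
    # section_reps: (num_sections, dim)
    # section_ids: (num_words,) index of the section containing each word
    word_weights = F.softmax(word_reps @ decoder_state, dim=0)
    section_weights = F.softmax(section_reps @ decoder_state, dim=0)
    # Weight each word by its section's weight, then renormalize.
    combined = word_weights * section_weights[section_ids]
    attn = combined / combined.sum()
    return attn @ word_reps  # context vector used by the decoder

state = torch.randn(256)
words = torch.randn(40, 256)
sections = torch.randn(4, 256)
section_ids = torch.randint(0, 4, (40,))
context = discourse_aware_attention(state, words, sections, section_ids)  # (256,)
```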

    SYSTEMS AND METHODS FOR COREFERENCE RESOLUTION

    Publication Number: US20230403175A1

    Publication Date: 2023-12-14

    Application Number: US17806751

    Filing Date: 2022-06-14

    Applicant: Adobe Inc.

    CPC classification number: H04L12/1831 G06F40/284 G06N3/04

    Abstract: Systems and methods for coreference resolution are provided. One aspect of the systems and methods includes inserting a speaker tag into a transcript, wherein the speaker tag indicates that a name in the transcript corresponds to a speaker of a portion of the transcript; encoding a plurality of candidate spans from the transcript based at least in part on the speaker tag to obtain a plurality of span vectors; extracting a plurality of entity mentions from the transcript based on the plurality of span vectors, wherein each of the plurality of entity mentions corresponds to one of the plurality of candidate spans; and generating coreference information for the transcript based on the plurality of entity mentions, wherein the coreference information indicates that a pair of candidate spans of the plurality of candidate spans corresponds to a pair of entity mentions that refer to a same entity.
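
    The speaker-tag insertion step might look something like the sketch below; the tag format, data structure, and function name are assumptions for illustration only.

```python
# A minimal, assumption-laden sketch of inserting speaker tags into a
# transcript before span encoding and mention extraction.
def insert_speaker_tags(utterances):
    """utterances: list of (speaker_name, text) pairs from a transcript."""
    tagged = []
    for speaker, text in utterances:
        # Prefix each utterance with an explicit speaker tag so the encoder
        # can link the name to spans produced for this portion of the transcript.
        tagged.append(f"<speaker> {speaker} </speaker> {text}")
    return " ".join(tagged)

transcript = insert_speaker_tags([
    ("Alice", "I sent the draft yesterday."),
    ("Bob", "Thanks, I will review it today."),
])
# The tagged transcript would then be tokenized, candidate spans enumerated,
# and span vectors produced by an encoder for entity mention extraction.
```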

    IMAGE CAPTIONING

    Publication Number: US20230153522A1

    Publication Date: 2023-05-18

    Application Number: US17455533

    Filing Date: 2021-11-18

    Applicant: Adobe Inc.

    CPC classification number: G06F40/253 G06K9/6256 G06K9/6262 G06F16/583

    Abstract: Systems and methods for image captioning are described. One or more aspects of the systems and methods include generating a training caption for a training image using an image captioning network; encoding the training caption using a multi-modal encoder to obtain an encoded training caption; encoding the training image using the multi-modal encoder to obtain an encoded training image; computing a reward function based on the encoded training caption and the encoded training image; and updating parameters of the image captioning network based on the reward function.
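
    A simple way to realize a reward from the two multi-modal encodings is a similarity score in the shared embedding space; the sketch below assumes cosine similarity and stubs the encoder outputs with random tensors.

```python
# Sketch of a similarity-based reward between an encoded caption and an
# encoded image; the reward form (cosine similarity) is an assumption.
import torch
import torch.nn.functional as F

def caption_reward(caption_embedding, image_embedding):
    # Higher reward when the generated caption and the image map to nearby
    # points in the multi-modal encoder's shared embedding space.
    return F.cosine_similarity(caption_embedding, image_embedding, dim=-1)

cap_emb = F.normalize(torch.randn(4, 512), dim=-1)   # batch of caption encodings
img_emb = F.normalize(torch.randn(4, 512), dim=-1)   # batch of image encodings
reward = caption_reward(cap_emb, img_emb)            # (4,) per-sample rewards
# The reward could then weight the captioning network's log-likelihoods in a
# policy-gradient style update of its parameters.
```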

    Training of neural network based natural language processing models using dense knowledge distillation

    Publication Number: US11651211B2

    Publication Date: 2023-05-16

    Application Number: US16717698

    Filing Date: 2019-12-17

    Applicant: Adobe Inc.

    CPC classification number: G06N3/08 G06F40/284 G06N3/045 G10L15/16 G10L25/30

    Abstract: Techniques for training a first neural network (NN) model using a pre-trained second NN model are disclosed. In an example, training data is input to the first and second models. The training data includes masked tokens and unmasked tokens. In response, the first model generates a first prediction associated with a masked token and a second prediction associated with an unmasked token, and the second model generates a third prediction associated with the masked token and a fourth prediction associated with the unmasked token. The first model is trained, based at least in part on the first, second, third, and fourth predictions. In another example, a prediction associated with a masked token, a prediction associated with an unmasked token, and a prediction associated with whether two sentences of training data are adjacent sentences are received from each of the first and second models. The first model is trained using the predictions.
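
    A minimal sketch of distilling a student model from a frozen teacher using predictions at both masked and unmasked positions is shown below; the KL-divergence form, temperature, and equal loss weighting are assumptions rather than the claimed method.

```python
# Hedged sketch of knowledge distillation over masked and unmasked tokens.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, mask, temperature=2.0):
    # student_logits, teacher_logits: (batch, seq_len, vocab_size)
    # mask: (batch, seq_len) boolean, True at masked input positions
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1) * (t * t)
    # Train on predictions at both masked and unmasked positions.
    return kl[mask].mean() + kl[~mask].mean()

student_logits = torch.randn(2, 16, 30522, requires_grad=True)
teacher_logits = torch.randn(2, 16, 30522)   # frozen, pre-trained teacher
mask = torch.zeros(2, 16, dtype=torch.bool)
mask[:, ::4] = True                          # every 4th token treated as masked
loss = distillation_loss(student_logits, teacher_logits, mask)
loss.backward()
```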

    DECOMPOSITIONAL LEARNING FOR COLOR ATTRIBUTE PREDICTION

    Publication Number: US20220383031A1

    Publication Date: 2022-12-01

    Application Number: US17333583

    Filing Date: 2021-05-28

    Applicant: Adobe Inc.

    Abstract: The present disclosure describes a model for large scale color prediction of objects identified in images. Embodiments of the present disclosure include an object detection network, an attention network, and a color classification network. The object detection network generates object features for an object in an image and may include a convolutional neural network (CNN), region proposal network, or a ResNet. The attention network generates an attention vector for the object based on the object features, wherein the attention network takes a query vector based on the object features, and a plurality of key vectors and a plurality of value vectors corresponding to a plurality of colors as input. The color classification network generates a color attribute vector based on the attention vector, wherein the color attribute vector indicates a probability of the object including each of the plurality of colors.
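
    The query/key/value structure described above could be sketched as follows, with the feature dimension, number of colors, and sigmoid multi-label head all assumed for illustration.

```python
# Illustrative sketch of an attention step where a query from object features
# attends over learned per-color key/value vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ColorAttention(nn.Module):
    def __init__(self, feat_dim=1024, dim=256, num_colors=12):
        super().__init__()
        self.query = nn.Linear(feat_dim, dim)                      # query from object features
        self.keys = nn.Parameter(torch.randn(num_colors, dim))     # one key per color
        self.values = nn.Parameter(torch.randn(num_colors, dim))   # one value per color
        self.classifier = nn.Linear(dim, num_colors)

    def forward(self, object_features):
        q = self.query(object_features)                  # (batch, dim)
        scores = q @ self.keys.t() / self.keys.size(1) ** 0.5
        attn = F.softmax(scores, dim=-1)                 # (batch, num_colors)
        attended = attn @ self.values                    # attention vector
        return torch.sigmoid(self.classifier(attended))  # per-color probabilities

obj_feats = torch.randn(3, 1024)             # e.g., from an object detection backbone
color_probs = ColorAttention()(obj_feats)    # (3, 12) color attribute vector
```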
