MULTIMODAL VIDEO SUMMARIZATION
    Invention application

    Publication number: US20240404283A1

    Publication date: 2024-12-05

    Application number: US18328597

    Application date: 2023-06-02

    Applicant: Adobe Inc.

    Abstract: A method includes receiving a video input and a text transcription of the video input. The video input includes a plurality of frames and the text transcription includes a plurality of sentences. The method further includes determining, by a multimodal summarization model, a subset of key frames of the plurality of frames and a subset of key sentences of the plurality of sentences. The method further includes providing a summary of the video input and a summary of the text transcription based on the subset of key frames and the subset of key sentences.
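
The selection step in this abstract can be illustrated with a minimal sketch. It assumes the multimodal summarization model has already produced one importance score per frame and per sentence; the scores, the top-k selection rule, and the names `top_k_indices` and `summarize` are hypothetical and are not taken from the patent.

```python
from typing import List, Sequence, Tuple


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, in original (temporal) order."""
    k = max(0, min(k, len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def summarize(
    frames: Sequence[str],
    sentences: Sequence[str],
    frame_scores: Sequence[float],
    sentence_scores: Sequence[float],
    k_frames: int,
    k_sentences: int,
) -> Tuple[List[str], List[str]]:
    """Pick key frames and key sentences from precomputed scores.

    Assumption: the scores stand in for the output of a multimodal
    summarization model; how the model computes them is not shown here.
    """
    if len(frames) != len(frame_scores):
        raise ValueError("need exactly one score per frame")
    if len(sentences) != len(sentence_scores):
        raise ValueError("need exactly one score per sentence")
    key_frames = [frames[i] for i in top_k_indices(frame_scores, k_frames)]
    key_sentences = [sentences[i] for i in top_k_indices(sentence_scores, k_sentences)]
    return key_frames, key_sentences
```

Returning the selected items in their original order keeps both summaries chronologically readable, which is a design choice of this sketch rather than something the abstract states.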

    PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS

    Publication number: US20240304009A1

    Publication date: 2024-09-12

    Application number: US18179177

    Application date: 2023-03-06

    Applicant: Adobe Inc.

    CPC classification number: G06V20/70 G06F40/58 G06T1/0021

    Abstract: Embodiments are disclosed for training an image caption evaluation system to perform evaluations of image captions. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training image, a ground truth image caption for the training image, and a perturbed image caption for the training image, where the perturbed image caption includes modifications to the ground truth image caption. The disclosed systems and methods further comprise generating, by a visual encoder, a visual embedding representation of the training image and generating, by a perturbation-aware text encoder, a first text embedding for the ground truth image caption and a second text embedding for the perturbed image caption. The disclosed systems and methods further comprise computing losses between the visual embedding, the first text embedding, and the second text embedding and training the perturbation-aware text encoder based on the computed losses.
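
The abstract does not say which losses are computed between the three embeddings. As one plausible illustration, the sketch below uses a cosine-similarity margin loss that is zero when the image embedding is closer to the ground truth caption than to the perturbed caption by at least a margin. The loss form, the margin value, and the function names are assumptions, not the patent's formulation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def perturbation_margin_loss(
    visual: Sequence[float],
    ground_truth_text: Sequence[float],
    perturbed_text: Sequence[float],
    margin: float = 0.2,  # hypothetical value chosen for illustration
) -> float:
    """Hinge-style loss over the three embeddings (assumed form).

    The loss is positive whenever the perturbed caption is not at least
    `margin` less similar to the image than the ground truth caption.
    """
    sim_gt = cosine_similarity(visual, ground_truth_text)
    sim_perturbed = cosine_similarity(visual, perturbed_text)
    return max(0.0, margin - sim_gt + sim_perturbed)
```

In a real training loop this scalar would be computed with an automatic-differentiation framework so that its gradient can update the perturbation-aware text encoder; the pure-Python version here only shows the arithmetic.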

    METHODS AND SYSTEMS FOR DETERMINING CHARACTERISTICS OF A DIALOG BETWEEN A COMPUTER AND A USER

    Publication number: US20210375277A1

    Publication date: 2021-12-02

    Application number: US16889669

    Application date: 2020-06-01

    Applicant: Adobe Inc.

    Abstract: A computer-implemented method is disclosed for determining one or more characteristics of a dialog between a computer system and a user. The method may comprise receiving a system utterance comprising one or more tokens defining one or more words generated by the computer system; receiving a user utterance comprising one or more tokens defining one or more words uttered by a user in response to the system utterance, the system utterance and the user utterance forming a dialog context; receiving one or more utterance candidates comprising one or more tokens; for each utterance candidate, generating an input sequence combining the one or more tokens of each of the system utterance, the user utterance, and the utterance candidate; and for each utterance candidate, evaluating the generated input sequence with a model to determine a probability that the utterance candidate is relevant to the dialog context.
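
The per-candidate loop in this abstract can be sketched as follows. The separator token `[SEP]`, the function names, and `toy_overlap_model` are hypothetical; the toy model is a simple token-overlap placeholder standing in for whatever trained model the patent uses.

```python
from typing import Callable, List, Sequence, Tuple

SEP = "[SEP]"  # hypothetical separator token, not specified in the abstract


def build_input_sequence(
    system_tokens: Sequence[str],
    user_tokens: Sequence[str],
    candidate_tokens: Sequence[str],
) -> List[str]:
    """Combine the system utterance, user utterance, and one candidate."""
    return [*system_tokens, SEP, *user_tokens, SEP, *candidate_tokens]


def score_candidates(
    system_tokens: Sequence[str],
    user_tokens: Sequence[str],
    candidates: Sequence[Sequence[str]],
    model: Callable[[List[str]], float],
) -> List[Tuple[List[str], float]]:
    """Evaluate one input sequence per candidate and collect probabilities."""
    results = []
    for candidate in candidates:
        sequence = build_input_sequence(system_tokens, user_tokens, candidate)
        probability = model(sequence)
        if not 0.0 <= probability <= 1.0:
            raise ValueError("model must return a probability in [0, 1]")
        results.append((list(candidate), probability))
    return results


def toy_overlap_model(sequence: List[str]) -> float:
    """Placeholder model: share of candidate tokens that occur in the context."""
    last_sep = len(sequence) - 1 - sequence[::-1].index(SEP)
    context = set(sequence[:last_sep]) - {SEP}
    candidate = sequence[last_sep + 1:]
    if not candidate:
        return 0.0
    return sum(token in context for token in candidate) / len(candidate)
```

Passing the model in as a callable keeps the sequence construction separate from the scoring, so the placeholder can be swapped for a trained classifier without changing the loop.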
