PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS

    Publication number: US20240304009A1

    Publication date: 2024-09-12

    Application number: US18179177

    Application date: 2023-03-06

    Applicant: Adobe Inc.

    CPC classification number: G06V20/70 G06F40/58 G06T1/0021

    Abstract: Embodiments are disclosed for training an image caption evaluation system to perform evaluations of image captions. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training image, a ground truth image caption for the training image, and a perturbed image caption for the training image, where the perturbed image caption includes modifications to the ground truth image caption. The disclosed systems and methods further comprise generating, by a visual encoder, a visual embedding representation of the training image and generating, by a perturbation-aware text encoder, a first text embedding for the ground truth image caption and a second text embedding for the perturbed image caption. The disclosed systems and methods further comprise computing losses between the visual embedding, the first text embedding, and the second text embedding and training the perturbation-aware text encoder based on the computed losses.
