TEXT-AUGMENTED OBJECT CENTRIC RELATIONSHIP DETECTION

    Publication No.: US20250095393A1

    Publication Date: 2025-03-20

    Application No.: US18470778

    Filing Date: 2023-09-20

    Applicant: ADOBE INC.

    Abstract: A method, apparatus, and non-transitory computer readable medium for image processing are described. Embodiments of the present disclosure obtain an image and an input text, where the input text identifies a subject in the image and the location of that subject in the image. An image encoder encodes the image to obtain an image embedding. A text encoder encodes the input text to obtain a text embedding. An image processing apparatus according to the present disclosure generates an output text based on the image embedding and the text embedding. In some examples, the output text includes a relation of the subject to an object in the image and the location of that object in the image.
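
    Illustrative sketch (not part of the patent record): the abstract describes an input text carrying a subject and its location, and an output text carrying a relation plus an object and its location. The Python below shows one way such texts could be serialized and parsed. The text format (`name [x1, y1, x2, y2]` and `relation | name [x1, y1, x2, y2]`), the `Region` type, and both function names are assumptions made for illustration; the abstract does not specify any of them. The image encoder, text encoder, and text generation step are outside the scope of this sketch.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Region:
    """A named entity and its bounding box (x1, y1, x2, y2). Hypothetical type."""
    name: str
    box: Tuple[int, int, int, int]


def format_input_text(subject: Region) -> str:
    """Serialize the subject and its location into an input text.

    The bracketed-coordinates format is an assumption, not the patent's format.
    """
    x1, y1, x2, y2 = subject.box
    return f"{subject.name} [{x1}, {y1}, {x2}, {y2}]"


# Assumed output format: "<relation> | <object name> [x1, y1, x2, y2]"
_OUTPUT_RE = re.compile(
    r"\s*(.+?)\s*\|\s*(.+?)\s*\[(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\]\s*"
)


def parse_output_text(text: str) -> Tuple[str, Region]:
    """Split an output text into the relation and the located object."""
    match = _OUTPUT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized output text: {text!r}")
    relation, name = match.group(1), match.group(2)
    box = tuple(int(match.group(i)) for i in range(3, 7))
    return relation, Region(name, box)


if __name__ == "__main__":
    subject = Region("person", (10, 20, 110, 220))
    print(format_input_text(subject))
    # A string standing in for what a model might generate.
    print(parse_output_text("riding | horse [5, 60, 200, 240]"))
```

    Keeping the relation and the object's coordinates in a single text string mirrors the abstract's framing, in which the result is generated as text rather than as separate classification and box-regression outputs.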
