TEXT CONDITIONED IMAGE SEARCH BASED ON DUAL-DISENTANGLED FEATURE COMPOSITION

    Publication No.: US20220237406A1

    Publication Date: 2022-07-28

    Application No.: US17160862

    Filing Date: 2021-01-28

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for text conditioned image searching. A methodology implementing the techniques according to an embodiment includes receiving a source image and a text query defining a target image attribute. The method also includes decomposing the source image into image content and style feature vectors and decomposing the text query into text content and style feature vectors, wherein image style is descriptive of image content and text style is descriptive of text content. The method further includes composing a global content feature vector based on the text content feature vector and the image content feature vector and composing a global style feature vector based on the text style feature vector and the image style feature vector. The method further includes identifying a target image that relates to the global content feature vector and the global style feature vector so that the target image relates to the target image attribute.
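
The flow described in this abstract (decompose, compose, identify) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the content/style split is a plain halving of a precomputed feature vector, composition is an element-wise sum, and matching uses the sum of cosine similarities. The names `split_content_style`, `compose`, and `search` are hypothetical.

```python
import math

def split_content_style(vec):
    """Hypothetical decomposition: first half is content, second half is style."""
    mid = len(vec) // 2
    return vec[:mid], vec[mid:]

def compose(a, b):
    """Assumed composition rule: element-wise sum of two feature vectors."""
    return [x + y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(source_vec, text_vec, gallery):
    """Return the name of the gallery image that best matches the composed
    global content and global style vectors.

    gallery is a list of (name, feature_vector) pairs (assumed layout).
    """
    img_content, img_style = split_content_style(source_vec)
    txt_content, txt_style = split_content_style(text_vec)
    global_content = compose(img_content, txt_content)
    global_style = compose(img_style, txt_style)

    def score(item):
        cand_content, cand_style = split_content_style(item[1])
        return (cosine(global_content, cand_content)
                + cosine(global_style, cand_style))

    return max(gallery, key=score)[0]
```

In this toy setup, a text query whose style part cancels the source style and adds a new one (for example "same item, different color") steers the search toward candidates that keep the source content but carry the requested style.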

    TEXT-CONDITIONED IMAGE SEARCH BASED ON TRANSFORMATION, AGGREGATION, AND COMPOSITION OF VISIO-LINGUISTIC FEATURES

    Publication No.: US20220245391A1

    Publication Date: 2022-08-04

    Application No.: US17160893

    Filing Date: 2021-01-28

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for text-conditioned image searching. A methodology implementing the techniques includes decomposing a source image into visual feature vectors associated with different levels of granularity. The method also includes decomposing a text query (defining a target image attribute) into feature vectors associated with different levels of granularity including a global text feature vector. The method further includes generating image-text embeddings based on the visual feature vectors and the text feature vectors to encode information from visual and textual features. The method further includes composing a visio-linguistic representation based on a hierarchical aggregation of the image-text embeddings to encode visual and textual information at multiple levels of granularity. The method further includes identifying a target image that includes the visio-linguistic representation and the global text feature vector, so that the target image relates to the target image attribute, and providing the target image as an image search result.
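
As a rough illustration of the multi-granularity pipeline in this abstract, the sketch below fuses per-level visual and text features, aggregates them across levels, and ranks candidates against both the aggregated representation and the global text vector. The specific operations are assumptions chosen for brevity, not the patented method: fusion is an element-wise sum, hierarchical aggregation is a running average over levels, and all vectors are assumed to share one embedding space. The names `fuse`, `aggregate`, and `rank` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def fuse(visual, text):
    """Assumed image-text embedding: element-wise sum of same-level features."""
    return [v + t for v, t in zip(visual, text)]

def aggregate(embeddings):
    """Assumed hierarchical aggregation: fold the levels in order, averaging
    the running state with each next level's embedding."""
    state = embeddings[0]
    for emb in embeddings[1:]:
        state = [(s + e) / 2 for s, e in zip(state, emb)]
    return state

def rank(visual_levels, text_levels, global_text, gallery):
    """Return gallery image names ordered from best to worst match.

    visual_levels / text_levels: one feature vector per granularity level.
    gallery: list of (name, feature_vector) pairs (assumed layout).
    """
    embeddings = [fuse(v, t) for v, t in zip(visual_levels, text_levels)]
    joint = aggregate(embeddings)

    def score(item):
        # Match against the joint representation and the global text vector.
        return cosine(joint, item[1]) + cosine(global_text, item[1])

    return [name for name, _ in sorted(gallery, key=score, reverse=True)]
```

The first element of the returned list plays the role of the target image provided as the search result.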
