PERSONALIZED TEXT-TO-IMAGE GENERATION

    Publication Number: US20240355022A1

    Publication Date: 2024-10-24

    Application Number: US18476504

    Filing Date: 2023-09-28

    Applicant: ADOBE INC.

    CPC classification number: G06T11/60 G06T7/194 G06T9/00 G06T2207/20081

    Abstract: One or more aspects of a method, apparatus, and non-transitory computer readable medium include obtaining an input description and an input image depicting a subject, encoding the input description using a text encoder of an image generation model to obtain a text embedding, and encoding the input image using a subject encoder of the image generation model to obtain a subject embedding. A guidance embedding is generated by combining the subject embedding and the text embedding, and then an output image is generated based on the guidance embedding using a diffusion model of the image generation model. The output image depicts aspects of the subject and the input description.
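    The abstract describes combining a subject embedding and a text embedding into a single guidance embedding that conditions a diffusion model. Below is a minimal PyTorch sketch of that conditioning path only; the encoder modules, embedding sizes, and the token-concatenation choice are illustrative assumptions, not the patented architecture.

```python
import torch
import torch.nn as nn

class SubjectEncoder(nn.Module):
    """Toy stand-in for the subject encoder: maps an image to one embedding token."""
    def __init__(self, embed_dim=768):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, image):
        # (B, 3, H, W) -> (B, 1, embed_dim), shaped like one extra text token
        return self.backbone(image).unsqueeze(1)

class TextEncoder(nn.Module):
    """Toy stand-in for the text encoder: maps token ids to a token embedding sequence."""
    def __init__(self, vocab=1000, embed_dim=768):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed_dim)

    def forward(self, token_ids):
        return self.embed(token_ids)  # (B, T, embed_dim)

def build_guidance_embedding(text_emb, subject_emb):
    # Combine by appending the subject token to the text token sequence.
    return torch.cat([text_emb, subject_emb], dim=1)  # (B, T+1, embed_dim)

# Usage: the guidance embedding would condition each denoising step of a diffusion model.
text_encoder, subject_encoder = TextEncoder(), SubjectEncoder()
token_ids = torch.randint(0, 1000, (1, 8))   # "input description"
image = torch.randn(1, 3, 64, 64)            # "input image depicting a subject"
guidance = build_guidance_embedding(text_encoder(token_ids), subject_encoder(image))
print(guidance.shape)                         # torch.Size([1, 9, 768])
```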

    GENERATING IMAGE DIFFERENCE CAPTIONS VIA AN IMAGE-TEXT CROSS-MODAL NEURAL NETWORK

    Publication Number: US20250131753A1

    Publication Date: 2025-04-24

    Application Number: US18489681

    Filing Date: 2023-10-18

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for generating difference captions indicating detected differences in digital image pairs. The disclosed system generates a first feature map of a first digital image and a second feature map of a second digital image. The disclosed system converts, utilizing a linear projection neural network, the first feature map to a first modified feature map in a feature space corresponding to a large language machine-learning model. The disclosed system also converts, utilizing the linear projection neural network layer, the second feature map to a second modified feature map in the feature space corresponding to the large language machine-learning model. The disclosed system further generates, utilizing the large language machine-learning model, a difference caption indicating a difference between the first digital image and the second digital image from a combination of the first modified feature map and the second modified feature map.
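    As a rough illustration of the projection step described above, the sketch below maps two image feature maps through a shared linear projection into a language-model feature space and concatenates them into one input sequence; the dimensions and module names are assumptions for illustration, and the language model itself is omitted.

```python
import torch
import torch.nn as nn

class LinearProjection(nn.Module):
    """Shared linear layer mapping image features into the language model's feature space."""
    def __init__(self, vision_dim=1024, llm_dim=4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, feature_map):
        # (B, N_patches, vision_dim) -> (B, N_patches, llm_dim)
        return self.proj(feature_map)

# Feature maps for the two images (e.g. from a frozen vision encoder), flattened to patch tokens.
feat_a = torch.randn(1, 256, 1024)
feat_b = torch.randn(1, 256, 1024)

projection = LinearProjection()
mod_a, mod_b = projection(feat_a), projection(feat_b)

# Combine both modified feature maps into one prompt-like sequence; a large language
# model would then decode the difference caption from this combined input.
llm_input = torch.cat([mod_a, mod_b], dim=1)  # (B, 512, llm_dim)
print(llm_input.shape)
```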

    GENERATING COLOR-EDITED DIGITAL IMAGES UTILIZING A CONTENT AWARE DIFFUSION NEURAL NETWORK

    Publication Number: US20250046055A1

    Publication Date: 2025-02-06

    Application Number: US18363980

    Filing Date: 2023-08-02

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that train (and utilize) an image color editing diffusion neural network to generate one or more color-edited digital images for a digital image. In particular, in one or more implementations, the disclosed systems identify a digital image depicting content in a first color style. Moreover, the disclosed systems generate, from the digital image utilizing an image color editing diffusion neural network, a color-edited digital image depicting the content in a second color style different from the first color style. Further, the disclosed systems provide, for display within a graphical user interface, the color-edited digital image.
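    A very small sketch of the overall conditioning idea follows: a single toy denoiser is conditioned on the source image's content and a target color-style code, and a few crude update steps stand in for a sampling loop. The network shape, the style code, and the update rule are assumptions, not the disclosed content-aware diffusion model.

```python
import torch
import torch.nn as nn

class ColorEditDenoiser(nn.Module):
    """Toy denoiser: predicts noise from a noisy image, conditioned on content + target style."""
    def __init__(self, style_dim=16):
        super().__init__()
        self.style_to_map = nn.Linear(style_dim, 64 * 64)  # broadcast the style code spatially
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3 + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, noisy, content_image, style_code):
        b = noisy.shape[0]
        style_map = self.style_to_map(style_code).view(b, 1, 64, 64)
        # The content image keeps structure fixed; the style map steers the color edit.
        return self.net(torch.cat([noisy, content_image, style_map], dim=1))

denoiser = ColorEditDenoiser()
content = torch.randn(1, 3, 64, 64)   # digital image in the first color style
style = torch.randn(1, 16)            # code for the second color style
x = torch.randn(1, 3, 64, 64)         # start from noise
for _ in range(4):                    # a few illustrative denoising-style updates
    x = x - 0.5 * denoiser(x, content, style)
print(x.shape)
```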

    Performing global image editing using editing operations determined from natural language requests

    Publication Number: US11570318B2

    Publication Date: 2023-01-31

    Application Number: US17374103

    Filing Date: 2021-07-13

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.
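    A compact sketch of the sequential decoding idea: an LSTM cell picks one image-modification operation per step, conditioned on an encoding of the request and on the previously chosen operation, while a second head predicts that operation's parameter. The operation names, the greedy loop, and the dimensions are illustrative assumptions rather than the disclosed language-to-operation network.

```python
import torch
import torch.nn as nn

OPERATIONS = ["brightness", "contrast", "saturation", "stop"]  # illustrative operation set

class LanguageToOperationCell(nn.Module):
    """Decoding cell: chooses the next operation and its parameter from the request encoding."""
    def __init__(self, hidden=128):
        super().__init__()
        self.op_embed = nn.Embedding(len(OPERATIONS), hidden)
        self.cell = nn.LSTMCell(hidden, hidden)
        self.op_head = nn.Linear(hidden, len(OPERATIONS))  # which operation to apply next
        self.param_head = nn.Linear(hidden, 1)             # scalar parameter for that operation

    def forward(self, prev_op_id, state):
        h, c = self.cell(self.op_embed(prev_op_id), state)
        return self.op_head(h), self.param_head(h), (h, c)

decoder = LanguageToOperationCell()
request_encoding = torch.randn(1, 128)              # e.g. from an LSTM encoder over the request
state = (request_encoding, torch.zeros(1, 128))     # initialize the decoder with the request
prev = torch.tensor([OPERATIONS.index("stop")])     # dummy "start" token

# Greedily decode operations until the cell emits "stop"; each step sees the previous operation.
for _ in range(5):
    op_logits, param, state = decoder(prev, state)
    op_id = op_logits.argmax(dim=-1)
    if OPERATIONS[op_id.item()] == "stop":
        break
    print(OPERATIONS[op_id.item()], float(param))
    prev = op_id
```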

    Generating semantic scene graphs from ungrounded label graphs and visual graphs for digital images

    Publication Number: US11989923B2

    Publication Date: 2024-05-21

    Application Number: US17483126

    Filing Date: 2021-09-23

    Applicant: Adobe Inc.

    Inventors: Ning Xu; Jing Shi

    CPC classification number: G06V10/426 G06F18/217 G06F18/22 G06F40/205 G06N3/02

    Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that utilize weakly supervised graph matching to align an ungrounded label graph and a visual graph corresponding to a digital image. Specifically, the disclosed system utilizes a label embedding model to generate label graph embeddings from the ungrounded label graph and a visual embedding network to generate visual graph embeddings from the visual graph. Additionally, the disclosed system determines similarity metrics indicating the similarity of pairs of label graph embeddings and visual graph embeddings. The disclosed system then generates a semantic scene graph by utilizing a graph matching algorithm to align the ungrounded label graph and the visual graph based on the similarity metrics. In some embodiments, the disclosed system utilizes contrastive learning to modify the embedding models. Furthermore, in additional embodiments, the disclosed system utilizes the semantic scene graph to train a scene graph generation neural network.
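    To make the alignment step concrete, the following sketch computes pairwise cosine similarities between label-graph node embeddings and visual-graph node embeddings and aligns them greedily. The greedy assignment and the embedding dimensions are simplifying assumptions; the disclosure refers to a graph matching algorithm generally, not necessarily this one.

```python
import torch
import torch.nn.functional as F

def similarity_matrix(label_emb, visual_emb):
    """Cosine similarity between every label-graph node and every visual-graph node."""
    label_emb = F.normalize(label_emb, dim=-1)
    visual_emb = F.normalize(visual_emb, dim=-1)
    return label_emb @ visual_emb.T  # (num_label_nodes, num_visual_nodes)

def greedy_match(sim):
    """Greedy one-to-one alignment from the similarity metrics (a stand-in for graph matching)."""
    pairs, used = [], set()
    for i in sim.max(dim=1).values.argsort(descending=True).tolist():
        j = max((j for j in range(sim.shape[1]) if j not in used),
                key=lambda j: sim[i, j].item())
        used.add(j)
        pairs.append((i, j, sim[i, j].item()))
    return pairs

label_embeddings = torch.randn(4, 256)    # from the label embedding model
visual_embeddings = torch.randn(5, 256)   # from the visual embedding network
sim = similarity_matrix(label_embeddings, visual_embeddings)
for i, j, s in greedy_match(sim):
    print(f"label node {i} -> visual node {j} (similarity {s:.2f})")
```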

    PERFORMING GLOBAL IMAGE EDITING USING EDITING OPERATIONS DETERMINED FROM NATURAL LANGUAGE REQUESTS

    Publication Number: US20220399017A1

    Publication Date: 2022-12-15

    Application Number: US17374103

    Filing Date: 2021-07-13

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.
