Harmonizing composite images utilizing a transformer neural network

    Publication No.: US12165284B2

    Publication date: 2024-12-10

    Application No.: US17655663

    Filing date: 2022-03-21

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a dual-branched neural network architecture to harmonize composite images. For example, in one or more implementations, the transformer-based harmonization system uses a convolutional branch and a transformer branch to generate a harmonized composite image based on an input composite image and a corresponding segmentation mask. More particularly, the convolutional branch comprises a series of convolutional neural network layers followed by a style normalization layer to extract localized information from the input composite image. Further, the transformer branch comprises a series of transformer neural network layers to extract global information based on different resolutions of the input composite image. Utilizing a decoder, the transformer-based harmonization system combines the local information and the global information from the corresponding convolutional branch and transformer branch to generate a harmonized composite image.
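    The data flow described in this abstract (a local branch, a global branch, and a decoder that merges both under a segmentation mask) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names are hypothetical, a 3x3 box filter stands in for the learned convolutional layers, region-wide mean statistics stand in for the transformer layers, and a simple blend-and-shift stands in for the decoder. It is not the patented network and contains no learned parameters.

```python
from statistics import mean

# Grayscale image with values in [0, 1]; mask uses 1 for the pasted
# foreground and 0 for the background. Both are hypothetical toy types.
Image = list[list[float]]
Mask = list[list[int]]


def local_branch(image: Image) -> Image:
    """Stand-in for the convolutional branch: a 3x3 box filter (local context)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(mean(window))
        out.append(row)
    return out


def global_branch(image: Image, mask: Mask) -> tuple[float, float]:
    """Stand-in for the transformer branch: image-wide statistics per region."""
    fg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if m]
    bg = [p for row, mrow in zip(image, mask) for p, m in zip(row, mrow) if not m]
    if not fg or not bg:
        raise ValueError("mask must mark both foreground and background pixels")
    return mean(fg), mean(bg)


def decode(image: Image, mask: Mask, local: Image,
           stats: tuple[float, float], local_weight: float = 0.25) -> Image:
    """Stand-in for the decoder: merge local and global cues inside the mask."""
    fg_mean, bg_mean = stats
    shift = bg_mean - fg_mean  # global cue: move foreground toward background tone
    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, p in enumerate(row):
            if mask[y][x]:
                # local cue: mix in the neighborhood average
                blended = (1 - local_weight) * p + local_weight * local[y][x]
                new_row.append(min(1.0, max(0.0, blended + shift)))
            else:
                new_row.append(p)  # background pixels are left untouched
        out.append(new_row)
    return out


def harmonize(image: Image, mask: Mask) -> Image:
    """Run both branches on the composite and merge their outputs."""
    return decode(image, mask, local_branch(image), global_branch(image, mask))
```

    On a dark 3x3 background with one bright pasted pixel in the center, `harmonize` pulls the center pixel toward the background tone and leaves the remaining pixels unchanged.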

    VOICE INTERACTION FOR IMAGE EDITING
    Invention application

    Publication No.: US20200175975A1

    Publication date: 2020-06-04

    Application No.: US16205126

    Filing date: 2018-11-29

    Applicant: Adobe Inc.

    Abstract: This application relates generally to modifying visual data based on audio commands and, more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit the received multimedia data.
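    The mapping step in this abstract, in which one identified audio command expands into one or more editing operations, can be sketched as a lookup table. This is a minimal sketch under stated assumptions: the audio input is taken to be already transcribed to text, the command phrases, the `adjust` helper, and the dictionary-based image state are all hypothetical, and substring matching stands in for whatever command identification the application actually uses.

```python
from typing import Callable, Optional

# Hypothetical image state: a dict of named adjustment values.
Operation = Callable[[dict], dict]


def adjust(key: str, delta: int) -> Operation:
    """Build an operation that adds `delta` to one adjustment value."""
    def op(state: dict) -> dict:
        return {**state, key: state.get(key, 0) + delta}
    return op


# One command phrase may map to several operations (all names are assumptions).
COMMANDS: dict[str, list[Operation]] = {
    "brighten": [adjust("brightness", 10)],
    "make it pop": [adjust("contrast", 15), adjust("saturation", 10)],
}


def identify_command(transcript: str) -> Optional[str]:
    """Return the first known command phrase found in the transcript, if any."""
    text = transcript.lower()
    for phrase in COMMANDS:
        if phrase in text:
            return phrase
    return None


def apply_voice_command(transcript: str, state: dict) -> dict:
    """Identify the command, then run each mapped operation in order."""
    command = identify_command(transcript)
    if command is None:
        return state  # unrecognized input leaves the image state unchanged
    for op in COMMANDS[command]:
        state = op(state)
    return state
```

    For example, `apply_voice_command("Please make it pop", {"brightness": 0})` runs both the contrast and the saturation operations, while a transcript with no known phrase returns the state unchanged.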

    Voice interaction for image editing

    Publication No.: US11257491B2

    Publication date: 2022-02-22

    Application No.: US16205126

    Filing date: 2018-11-29

    Applicant: Adobe Inc.

    Abstract: This application relates generally to modifying visual data based on audio commands and, more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit the received multimedia data.
