UTILIZING IMPLICIT NEURAL REPRESENTATIONS TO PARSE VISUAL COMPONENTS OF SUBJECTS DEPICTED WITHIN VISUAL CONTENT

    Publication No.: US20240378912A1

    Publication date: 2024-11-14

    Application No.: US18316617

    Filing date: 2023-05-12

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that utilize a local implicit image function neural network to perform image segmentation with a continuous class label probability distribution. For example, the disclosed systems utilize a local-implicit-image-function (LIIF) network to learn a mapping from an image to its semantic label space. In some instances, the disclosed systems utilize an image encoder to generate an image vector representation from an image. Subsequently, in one or more implementations, the disclosed systems utilize the image vector representation with a LIIF network decoder that generates a continuous probability distribution in a label space for the image to create a semantic segmentation mask for the image. Moreover, in some embodiments, the disclosed systems utilize the LIIF-based segmentation network to generate segmentation masks at different resolutions without changes in an input resolution of the segmentation network.

    Self-supervised hierarchical event representation learning

    Publication No.: US11948358B2

    Publication date: 2024-04-02

    Application No.: US17455126

    Filing date: 2021-11-16

    Applicant: ADOBE INC.

    CPC classification number: G06V20/41 G06N3/088 G06V20/47 G06V20/44

    Abstract: Systems and methods for video processing are described. Embodiments of the present disclosure generate a plurality of image feature vectors corresponding to a plurality of frames of a video; generate a plurality of low-level event representation vectors based on the plurality of image feature vectors, wherein a number of the low-level event representation vectors is less than a number of the image feature vectors; generate a plurality of high-level event representation vectors based on the plurality of low-level event representation vectors, wherein a number of the high-level event representation vectors is less than the number of the low-level event representation vectors; and identify a plurality of high-level events occurring in the video based on the plurality of high-level event representation vectors.
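
The two-stage reduction described in this abstract (frame features, then fewer low-level event vectors, then fewer high-level event vectors) can be illustrated with a minimal sketch. It assumes plain average pooling over fixed windows as a stand-in for the learned, self-supervised encoders, which the abstract does not specify; `pool` and the window size are hypothetical names chosen for illustration only.

```python
from typing import List

Vector = List[float]


def pool(vectors: List[Vector], window: int) -> List[Vector]:
    """Average each consecutive group of `window` vectors into one vector.

    Assumption: fixed-window averaging replaces the learned encoder, so this
    only shows how the number of vectors shrinks at each level.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to reduce the count")
    pooled: List[Vector] = []
    for start in range(0, len(vectors), window):
        group = vectors[start:start + window]
        # Average column by column across the vectors in this group.
        pooled.append([sum(col) / len(group) for col in zip(*group)])
    return pooled


# Eight hypothetical per-frame feature vectors.
frame_features = [[float(i), 1.0] for i in range(8)]
low_level = pool(frame_features, window=2)   # 4 low-level event vectors
high_level = pool(low_level, window=2)       # 2 high-level event vectors
```

Each level holds fewer vectors than the level below it, which is the only property of the claimed method that this sketch reproduces.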

    Performing semantic segmentation of form images using deep learning

    Publication No.: US11593552B2

    Publication date: 2023-02-28

    Application No.: US15927686

    Filing date: 2018-03-21

    Applicant: Adobe Inc.

    Inventor: Mausoom Sarkar

    Abstract: The present disclosure relates to generating fillable digital forms corresponding to paper forms using a form conversion neural network to determine low-level and high-level semantic characteristics of the paper forms. For example, one or more embodiments applies a digitized paper form to an encoder that outputs feature maps to a reconstruction decoder, a low-level semantic decoder, and one or more high-level semantic decoders. The reconstruction decoder generates a reconstructed layout of the digitized paper form. The low-level and high-level semantic decoders determine low-level and high-level semantic characteristics of each pixel of the digitized paper form, which provide a probability of the element type to which the pixel belongs. The semantic decoders then classify each pixel and generate corresponding semantic segmentation maps based on those probabilities. The system then generates a fillable digital form using the reconstructed layout and the semantic segmentation maps.
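
The per-pixel classification step in this abstract (each pixel receives a probability per element type, and the decoder assigns the most probable type) can be sketched as an argmax over class probabilities. This is a minimal illustration under assumptions: the three element types and the `segmentation_map` helper are hypothetical, and the probabilities are made up rather than produced by a semantic decoder.

```python
from typing import List, Sequence

# Hypothetical element types; the abstract does not list the actual classes.
ELEMENT_TYPES = ["background", "text", "field"]


def segmentation_map(
    probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[int]]:
    """Turn probs[y][x][c] into a map of the most probable class per pixel."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in probs
    ]


# A 2x2 "image" with made-up per-pixel probabilities over the three types.
example_probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.2, 0.6], [0.5, 0.4, 0.1]],
]
example_map = segmentation_map(example_probs)
```

Indexing `ELEMENT_TYPES` with a value from `example_map` recovers the label name for that pixel.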

    Introspection network for training neural networks

    Publication No.: US10755199B2

    Publication date: 2020-08-25

    Application No.: US15608517

    Filing date: 2017-05-30

    Applicant: Adobe Inc.

    Abstract: An introspection network is a machine-learned neural network that accelerates training of other neural networks. The introspection network receives a weight history for each of a plurality of weights from a current training step for a target neural network. A weight history includes at least four values for the weight that are obtained during training of the target neural network up to the current step. The introspection network then provides, for each of the plurality of weights, a respective predicted value, based on the weight history. The predicted value for a weight represents a value for the weight in a future training step for the target neural network. Thus, the predicted value represents a jump in the training steps of the target neural network, which reduces the training time of the target neural network. The introspection network then sets each of the plurality of weights to its respective predicted value.
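
The weight-jump procedure in this abstract can be sketched as follows: gather a history of at least four values per weight, pass it to a predictor, and overwrite the weight with the predicted future value. The sketch is an assumption-laden illustration, not the patented network: `linear_extrapolation` is a hypothetical stand-in for the learned introspection network, and `sample_history` and `apply_jump` are names introduced here.

```python
from typing import Callable, Dict, List, Sequence


def sample_history(values: Sequence[float], k: int = 4) -> List[float]:
    """Pick k recorded values spread from the first step to the current step."""
    if k < 2 or len(values) < k:
        raise ValueError("need at least k recorded values, with k >= 2")
    last = len(values) - 1
    return [values[round(i * last / (k - 1))] for i in range(k)]


def linear_extrapolation(history: Sequence[float], jump: float = 1.0) -> float:
    """Stand-in predictor (assumption): continue the average trend.

    `jump` is measured in sampling intervals of the history.
    """
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + jump * slope


def apply_jump(
    histories: Dict[str, Sequence[float]],
    predictor: Callable[[Sequence[float]], float],
) -> Dict[str, float]:
    """Return new weight values, one predicted value per weight history."""
    return {
        name: predictor(sample_history(values))
        for name, values in histories.items()
    }


# Hypothetical recorded values for two weights of a target network.
recorded = {
    "w0": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "w1": [1.0, 1.0, 1.0, 1.0],
}
jumped = apply_jump(recorded, linear_extrapolation)
```

In a real setting the predictor would be the trained introspection network, and the returned values would be written back into the target network before training resumes.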

    PERFORMING SEMANTIC SEGMENTATION OF FORM IMAGES USING DEEP LEARNING

    Publication No.: US20190294661A1

    Publication date: 2019-09-26

    Application No.: US15927686

    Filing date: 2018-03-21

    Applicant: Adobe Inc.

    Inventor: Mausoom Sarkar

    Abstract: The present disclosure relates to generating fillable digital forms corresponding to paper forms using a form conversion neural network to determine low-level and high-level semantic characteristics of the paper forms. For example, one or more embodiments applies a digitized paper form to an encoder that outputs feature maps to a reconstruction decoder, a low-level semantic decoder, and one or more high-level semantic decoders. The reconstruction decoder generates a reconstructed layout of the digitized paper form. The low-level and high-level semantic decoders determine low-level and high-level semantic characteristics of each pixel of the digitized paper form, which provide a probability of the element type to which the pixel belongs. The semantic decoders then classify each pixel and generate corresponding semantic segmentation maps based on those probabilities. The system then generates a fillable digital form using the reconstructed layout and the semantic segmentation maps.

    PERSONALIZED FORM ERROR CORRECTION PROPAGATION

    Publication No.: US20240362941A1

    Publication date: 2024-10-31

    Application No.: US18140143

    Filing date: 2023-04-27

    Applicant: Adobe Inc.

    CPC classification number: G06V30/274 G06V30/1444 G06V30/19147 G06V30/414

    Abstract: A corrective noise system receives an electronic version of a fillable form generated by a segmentation network and receives a correction to a segmentation error in the electronic version of the fillable form. The corrective noise system is trained to generate noise that represents the correction and superimpose the noise on the fillable form. The corrective noise system is further trained to identify regions in a corpus of forms that are semantically similar to a region that was subject to the correction. The generated noise is propagated to the semantically similar regions in the corpus of forms and the noisy corpus of forms is provided as input to the segmentation network. The noise causes the segmentation network to accurately identify fillable regions in the corpus of forms and output a segmented version of the corpus of forms having improved fidelity without retraining or otherwise modifying the segmentation network.

    FORM STRUCTURE SIMILARITY DETECTION

    Publication No.: US20240330351A1

    Publication date: 2024-10-03

    Application No.: US18190686

    Filing date: 2023-03-27

    Applicant: Adobe Inc.

    CPC classification number: G06F16/383 G06F16/332 G06V30/19147 G06V30/412

    Abstract: Form structure similarity detection techniques are described. A content processing system, for instance, receives a query snippet that depicts a query form structure. The content processing system generates a query layout string that includes semantic indicators to represent the query form structure and generates candidate layout strings that represent form structures from a target document. The content processing system calculates similarity scores between the query layout string and the candidate layout strings. Based on the similarity scores, the content processing system generates a target snippet for display that depicts a form structure that is structurally similar to the query form structure. The content processing system is further operable to generate a training dataset that includes image pairs of snippets depicting form structures that are structurally similar. The content processing system utilizes the training dataset to train a machine learning model to perform form structure similarity matching.
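
The matching step in this abstract (score a query layout string against candidate layout strings, then pick the most similar candidate) can be sketched with a generic string-similarity measure. This is an illustration under assumptions: the single-character semantic indicators (`L` for label, `F` for field, `C` for checkbox, `|` for a row break) are hypothetical, and `difflib.SequenceMatcher.ratio()` is swapped in for whatever similarity score the disclosed system computes.

```python
from difflib import SequenceMatcher
from typing import List


def similarity(query: str, candidate: str) -> float:
    """Score two layout strings in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, query, candidate).ratio()


def best_match(query: str, candidates: List[str]) -> str:
    """Return the candidate layout string most similar to the query."""
    if not candidates:
        raise ValueError("no candidate layout strings to compare")
    return max(candidates, key=lambda c: similarity(query, c))


# Hypothetical layout strings built from the assumed indicator alphabet.
query_layout = "LFLF|C"
candidate_layouts = ["TTTT", "LF|C", "LFLF|CC"]
closest = best_match(query_layout, candidate_layouts)
```

The selected candidate would then identify which region of the target document to render as the target snippet.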
