TEMPORALLY DISTRIBUTED NEURAL NETWORKS FOR VIDEO SEMANTIC SEGMENTATION

    公开(公告)号:US20220270370A1

    公开(公告)日:2022-08-25

    申请号:US17735156

    申请日:2022-05-03

    Applicant: Adobe Inc.

    Abstract: A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally-related video frames. The VSSS extracts features from the video frames in the contiguous sequence and based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks are used to extract the features to be used for video segmentation and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence of video frames by aggregating the output features extracted by the multiple neural networks.

    GENERATING REFINED ALPHA MATTES UTILIZING GUIDANCE MASKS AND A PROGRESSIVE REFINEMENT NETWORK

    公开(公告)号:US20220262009A1

    公开(公告)日:2022-08-18

    申请号:US17177595

    申请日:2021-02-17

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, non-transitory computer-readable media, and methods that utilize a progressive refinement network to refine alpha mattes generated utilizing a mask-guided matting neural network. In particular, the disclosed systems can use the matting neural network to process a digital image and a coarse guidance mask to generate alpha mattes at discrete neural network layers. In turn, the disclosed systems can use the progressive refinement network to combine alpha mattes and refine areas of uncertainty. For example, the progressive refinement network can combine a core alpha matte corresponding to more certain core regions of a first alpha matte and a boundary alpha matte corresponding to uncertain boundary regions of a second, higher resolution alpha matte. Based on the combination of the core alpha matte and the boundary alpha matte, the disclosed systems can generate a final alpha matte for use in image matting processes.

    Generating scene graphs from digital images using external knowledge and image reconstruction

    公开(公告)号:US11373390B2

    公开(公告)日:2022-06-28

    申请号:US16448473

    申请日:2019-06-21

    Applicant: Adobe Inc.

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for generating semantic scene graphs for digital images using an external knowledgebase for feature refinement. For example, the disclosed system can determine object proposals and subgraph proposals for a digital image to indicate candidate relationships between objects in the digital image. The disclosed system can then extract relationships from an external knowledgebase for refining features of the object proposals and the subgraph proposals. Additionally, the disclosed system can generate a semantic scene graph for the digital image based on the refined features of the object/subgraph proposals. Furthermore, the disclosed system can update/train a semantic scene graph generation network based on the generated semantic scene graph. The disclosed system can also reconstruct the image using object labels based on the refined features to further update/train the semantic scene graph generation network.

    Edge-guided ranking loss for monocular depth prediction

    公开(公告)号:US11367206B2

    公开(公告)日:2022-06-21

    申请号:US16790056

    申请日:2020-02-13

    Applicant: Adobe Inc.

    Abstract: In order to provide monocular depth prediction, a trained neural network may be used. To train the neural network, edge detection on a digital image may be performed to determine at least one edge of the digital image, and then a first point and a second point of the digital image may be sampled, based on the at least one edge. A relative depth between the first point and the second point may be predicted, and the neural network may be trained to perform monocular depth prediction using a loss function that compares the predicted relative depth with a ground truth relative depth between the first point and the second point.

    Object Detection In Images
    265.
    发明申请

    公开(公告)号:US20220157054A1

    公开(公告)日:2022-05-19

    申请号:US17588516

    申请日:2022-01-31

    Applicant: Adobe Inc.

    Abstract: In implementations of object detection in images, object detectors are trained using heterogeneous training datasets. A first training dataset is used to train an image tagging network to determine an attention map of an input image for a target concept. A second training dataset is used to train a conditional detection network that accepts as conditional inputs the attention map and a word embedding of the target concept. Despite the conditional detection network being trained with a training dataset having a small number of seen classes (e.g., classes in a training dataset), it generalizes to novel, unseen classes by concept conditioning, since the target concept propagates through the conditional detection network via the conditional inputs, thus influencing classification and region proposal. Hence, classes of objects that can be detected are expanded, without the need to scale training databases to include additional classes.

    PROPAGATING MULTI-TERM CONTEXTUAL TAGS TO DIGITAL CONTENT

    公开(公告)号:US20220100791A1

    公开(公告)日:2022-03-31

    申请号:US17544689

    申请日:2021-12-07

    Applicant: Adobe Inc.

    Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for determining multi-term contextual tags for digital content and propagating the multi-term contextual tags to additional digital content. For instance, the disclosed systems can utilize search query supervision to determine and associate multi-term contextual tags (e.g., tags that represent a specific concept based on the order of the terms in the tag) with digital content. Furthermore, the disclosed systems can propagate the multi-term contextual tags determined for the digital content to additional digital content based on similarities between the digital content and additional digital content (e.g., utilizing clustering techniques). Additionally, the disclosed systems can provide digital content as search results based on the associated multi-term contextual tags.

    Multi-object image parsing using neural network pipeline

    公开(公告)号:US11238593B2

    公开(公告)日:2022-02-01

    申请号:US16789088

    申请日:2020-02-12

    Applicant: Adobe Inc.

    Abstract: Techniques are disclosed for parsing a source image, to identify segments of one or more objects within the source image. The parsing is carried out by an image parsing pipeline that includes three distinct stages comprising three respectively neural network models. The source image can include one or more objects. A first neural network model of the pipeline identifies a section of the source image that includes the object comprising a plurality of segments. A second neural network model of the pipeline generates, from the section of the source image, a mask image, where the mask image identifies one or more segments of the object. A third neural network model of the pipeline further refines the identification of the segments in the mask image, to generate a parsed image. The parsed image identifies the segments of the object, by assigning corresponding unique labels to pixels of different segments of the object.

    Identifying visually similar digital images utilizing deep learning

    公开(公告)号:US11227185B2

    公开(公告)日:2022-01-18

    申请号:US16817234

    申请日:2020-03-12

    Applicant: ADOBE INC.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media for utilizing a deep neural network-based model to identify similar digital images for query digital images. For example, the disclosed systems utilize a deep neural network-based model to analyze query digital images to generate deep neural network-based representations of the query digital images. In addition, the disclosed systems can generate results of visually-similar digital images for the query digital images based on comparing the deep neural network-based representations with representations of candidate digital images. Furthermore, the disclosed systems can identify visually similar digital images based on user-defined attributes and image masks to emphasize specific attributes or portions of query digital images.

    Utilizing a neural network having a two-stream encoder architecture to generate composite digital images

    公开(公告)号:US11158055B2

    公开(公告)日:2021-10-26

    申请号:US16523465

    申请日:2019-07-26

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to utilizing a neural network having a two-stream encoder architecture to accurately generate composite digital images that realistically portray a foreground object from one digital image against a scene from another digital image. For example, the disclosed systems can utilize a foreground encoder of the neural network to identify features from a foreground image and further utilize a background encoder to identify features from a background image. The disclosed systems can then utilize a decoder to fuse the features together and generate a composite digital image. The disclosed systems can train the neural network utilizing an easy-to-hard data augmentation scheme implemented via self-teaching. The disclosed systems can further incorporate the neural network within an end-to-end framework for automation of the image composition process.

Patent Agency Ranking