LOCALIZATION OF NARRATIONS IN IMAGE DATA

Publication Number: US20230115551A1

Publication Date: 2023-04-13

Application Number: US17499193

Application Date: 2021-10-12

Applicant: Adobe Inc.

Abstract: Methods, systems, and computer storage media are provided for multi-modal localization. Input data comprising two modalities, such as image data and corresponding text or audio data, may be received. A phrase may be extracted from the text or audio data, and a neural network system may be utilized to spatially and temporally localize the phrase within the image data. The neural network system may include a plurality of cross-modal attention layers that each compare features across the two modalities without comparing features of the same modality. Using the cross-modal attention layers, a region or subset of pixels within one or more frames of the image data may be identified as corresponding to the phrase, and a localization indicator may be presented for display with the image data. Embodiments may also include unsupervised training of the neural network system.
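
    The following is a minimal, hypothetical sketch of how such a cross-modal attention layer could look in PyTorch; the class name, dimensions, and use of nn.MultiheadAttention are illustrative assumptions, not taken from the patent. Each modality forms the queries while the other modality supplies keys and values, so features are only ever compared across modalities:

        import torch
        import torch.nn as nn

        class CrossModalAttention(nn.Module):
            """Each modality attends only to the other modality; there is no
            text-to-text or image-to-image comparison."""
            def __init__(self, d_model=256, n_heads=4):
                super().__init__()
                self.text_to_image = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
                self.image_to_text = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

            def forward(self, text_feats, image_feats):
                # phrase tokens query the image patches (keys/values)
                text_out, _ = self.text_to_image(text_feats, image_feats, image_feats)
                # image patches query the phrase tokens (keys/values)
                image_out, attn = self.image_to_text(image_feats, text_feats, text_feats)
                return text_out, image_out, attn

        layer = CrossModalAttention()
        phrase = torch.randn(1, 6, 256)        # 6 tokens of an extracted phrase
        frame = torch.randn(1, 14 * 14, 256)   # patch features of one video frame
        _, _, attn = layer(phrase, frame)      # attn (1, 196, 6): per-patch phrase scores

    Thresholding the attention scores over patches, and doing so per frame, would yield the spatial region and temporal extent backing the localization indicator.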

    Modifying neural networks for synthetic conditional digital content generation utilizing contrastive perceptual loss

Publication Number: US11514632B2

Publication Date: 2022-11-29

Application Number: US17091440

Application Date: 2020-11-06

    Applicant: Adobe Inc.

Abstract: This disclosure describes methods, non-transitory computer readable storage media, and systems that utilize a contrastive perceptual loss to modify neural networks for generating synthetic digital content items. For example, the disclosed systems generate a synthetic digital content item based on a guide input to a generative neural network. The disclosed systems utilize an encoder neural network to generate encoded representations of the synthetic digital content item and a corresponding ground-truth digital content item. Additionally, the disclosed systems sample patches from the two encoded representations and then determine a contrastive loss based on the perceptual distances between the patches. Furthermore, the disclosed systems jointly update the parameters of the generative neural network and the encoder neural network utilizing the contrastive loss.
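
    A short sketch, under assumed shapes, of a patch-based contrastive loss of this kind in PyTorch. Patches at the same spatial location in the two encoded representations form the positive pair, and the other sampled patches serve as negatives; the function and argument names are hypothetical:

        import torch
        import torch.nn.functional as F

        def patch_contrastive_loss(enc_synth, enc_real, n_patches=64, tau=0.07):
            """enc_*: (B, C, H, W) encoder feature maps for a synthetic item and
            its ground-truth counterpart. Same spatial location = positive pair."""
            B, C, H, W = enc_synth.shape
            idx = torch.randint(0, H * W, (n_patches,), device=enc_synth.device)
            q = enc_synth.flatten(2)[:, :, idx].permute(0, 2, 1)  # (B, P, C)
            k = enc_real.flatten(2)[:, :, idx].permute(0, 2, 1)   # (B, P, C)
            q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
            logits = torch.bmm(q, k.transpose(1, 2)) / tau        # (B, P, P) similarities
            labels = torch.arange(n_patches, device=logits.device).repeat(B)
            return F.cross_entropy(logits.reshape(-1, n_patches), labels)

    Because the encoder is updated jointly with the generator rather than held fixed, gradients from this loss shape both networks, which is the distinctive point of the abstract.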

    RECONSTRUCTING THREE-DIMENSIONAL SCENES IN A TARGET COORDINATE SYSTEM FROM MULTIPLE VIEWS

Publication Number: US20210295606A1

Publication Date: 2021-09-23

Application Number: US16822819

Application Date: 2020-03-18

    Applicant: Adobe Inc.

    Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for reconstructing three-dimensional meshes from two-dimensional images of objects with automatic coordinate system alignment. For example, the disclosed system can generate feature vectors for a plurality of images having different views of an object. The disclosed system can process the feature vectors to generate coordinate-aligned feature vectors aligned with a coordinate system associated with an image. The disclosed system can generate a combined feature vector from the feature vectors aligned to the coordinate system. Additionally, the disclosed system can then generate a three-dimensional mesh representing the object from the combined feature vector.
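
    A compact sketch of one plausible reading of the coordinate-alignment step, in PyTorch; the pose encoding, the network shapes, and the use of mean pooling to combine views are assumptions for illustration only:

        import torch
        import torch.nn as nn

        class MultiViewMeshEncoder(nn.Module):
            def __init__(self, d=512):
                super().__init__()
                # maps (view feature, relative camera pose) -> coordinate-aligned feature
                self.align = nn.Sequential(nn.Linear(d + 12, d), nn.ReLU(), nn.Linear(d, d))

            def forward(self, view_feats, rel_poses):
                # view_feats: (V, d) one feature vector per input view
                # rel_poses:  (V, 12) flattened 3x4 transform into the target view's frame
                aligned = self.align(torch.cat([view_feats, rel_poses], dim=-1))
                # combine the coordinate-aligned per-view features into one vector
                return aligned.mean(dim=0)   # feeds a mesh decoder (not shown)

        enc = MultiViewMeshEncoder()
        combined = enc(torch.randn(4, 512), torch.randn(4, 12))  # 4 views -> one (512,) vector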

    Reconstructing three-dimensional scenes using multi-view cycle projection

Publication Number: US10937237B1

Publication Date: 2021-03-02

Application Number: US16816080

Application Date: 2020-03-11

    Applicant: Adobe Inc.

Abstract: Methods, systems, and non-transitory computer readable storage media are disclosed for reconstructing three-dimensional object meshes from two-dimensional images of objects using multi-view cycle projection. For example, the disclosed system can determine a multi-view cycle projection loss across a plurality of images of an object via an estimated three-dimensional object mesh of the object. Specifically, the disclosed system uses a pixel mapping neural network to project a sampled pixel location across a plurality of images of an object and via a three-dimensional mesh representing the object. The disclosed system determines a multi-view cycle consistency loss based on the difference between the sampled pixel location and a cycle projection of the sampled pixel location, and uses the loss to update the pixel mapping neural network, a latent vector representing the object, or a shape generation neural network that uses the latent vector to generate the object mesh.
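
    A schematic sketch of the cycle-consistency term under assumed interfaces: map_to_mesh stands in for the pixel mapping neural network and project for a differentiable camera projection; both are hypothetical callables, not APIs from the patent:

        import torch

        def cycle_projection_loss(p_a, map_to_mesh, project, cam_a, cam_b):
            """p_a: (N, 2) pixel locations sampled in view A.
            map_to_mesh: pixel-mapping network, pixels -> 3D points on the mesh.
            project: renders 3D mesh points into a given camera's image plane."""
            x_ab = project(map_to_mesh(p_a, cam_a), cam_b)    # A -> mesh -> view B
            x_aba = project(map_to_mesh(x_ab, cam_b), cam_a)  # B -> mesh -> back to A
            # cycle consistency: the round trip should land on the sampled pixels
            return torch.mean(torch.sum((x_aba - p_a) ** 2, dim=-1))

    Because every step of the round trip is differentiable, the same scalar loss can backpropagate into the pixel mapping network, the latent shape vector, or the shape generation network, as the abstract describes.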

Automatic 3D camera alignment and object arrangement to match a 2D background image

Publication Number: US10417833B2

Publication Date: 2019-09-17

Application Number: US15804908

Application Date: 2017-11-06

Applicant: Adobe Inc.

    Abstract: Embodiments disclosed herein provide systems, methods, and computer storage media for automatically aligning a 3D camera with a 2D background image. An automated image analysis can be performed on the 2D background image, and a classifier can predict whether the automated image analysis is accurate within a selected confidence level. As such, a feature can be enabled that allows a user to automatically align the 3D camera with the 2D background image. For example, where the automated analysis detects a horizon and one or more vanishing points from the background image, the 3D camera can be automatically transformed to align with the detected horizon and to point at a detected horizon-located vanishing point. In some embodiments, 3D objects in a 3D scene can be pivoted and the 3D camera dollied forward or backwards to reduce changes to the framing of the 3D composition resulting from the 3D camera transformation.
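
    As a rough illustration, assuming a pinhole camera with focal length f and principal point (cx, cy) in pixels, the yaw and pitch needed to point the 3D camera at a detected horizon-located vanishing point, plus the roll that levels it against the detected horizon, could be computed as follows (all names and the camera model are illustrative assumptions):

        import numpy as np

        def aim_camera_at_vanishing_point(vp_px, horizon_roll_deg, f, cx, cy):
            """vp_px: detected vanishing point in pixels. Returns (yaw, pitch,
            roll) in degrees that point the camera at the vanishing point and
            level it to the detected horizon."""
            x, y = vp_px[0] - cx, vp_px[1] - cy                  # offset from principal point
            yaw = np.degrees(np.arctan2(x, f))                   # rotate toward the VP horizontally
            pitch = np.degrees(np.arctan2(-y, np.hypot(x, f)))   # then vertically (image y points down)
            roll = -horizon_roll_deg                             # cancel the horizon's tilt
            return yaw, pitch, roll

        print(aim_camera_at_vanishing_point((980, 500), 2.0, f=1100, cx=960, cy=540))

    The compensating dolly move mentioned in the abstract would then translate the camera along its new view axis so the 3D composition keeps roughly the same framing after the rotation.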

    Representation learning from video with spatial audio

Publication Number: US11308329B2

Publication Date: 2022-04-19

Application Number: US16868805

Application Date: 2020-05-07

    Applicant: Adobe Inc.

    Abstract: A computer system is trained to understand audio-visual spatial correspondence using audio-visual clips having multi-channel audio. The computer system includes an audio subnetwork, video subnetwork, and pretext subnetwork. The audio subnetwork receives the two channels of audio from the audio-visual clips, and the video subnetwork receives the video frames from the audio-visual clips. In a subset of the audio-visual clips the audio-visual spatial relationship is misaligned, causing the audio-visual spatial cues for the audio and video to be incorrect. The audio subnetwork outputs an audio feature vector for each audio-visual clip, and the video subnetwork outputs a video feature vector for each audio-visual clip. The audio and video feature vectors for each audio-visual clip are merged and provided to the pretext subnetwork, which is configured to classify the merged vector as either having a misaligned audio-visual spatial relationship or not. The subnetworks are trained based on the loss calculated from the classification.
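
    A minimal sketch of the pretext classification head in PyTorch, assuming the audio and video subnetworks each emit a fixed-size feature vector; the shapes and names are illustrative, not from the patent:

        import torch
        import torch.nn as nn

        class SpatialAlignmentPretext(nn.Module):
            """Binary pretext head: is the clip's audio-visual spatial
            relationship intact, or was it deliberately misaligned?"""
            def __init__(self, d_audio=128, d_video=128):
                super().__init__()
                self.head = nn.Sequential(
                    nn.Linear(d_audio + d_video, 128), nn.ReLU(), nn.Linear(128, 1))

            def forward(self, audio_vec, video_vec):
                # merge the two subnetwork outputs, then classify
                return self.head(torch.cat([audio_vec, video_vec], dim=-1))

        # Misaligned training examples can be made by swapping the two audio
        # channels for a subset of clips; labels: 1 = misaligned, 0 = intact.
        model = SpatialAlignmentPretext()
        logits = model(torch.randn(8, 128), torch.randn(8, 128))
        loss = nn.functional.binary_cross_entropy_with_logits(
            logits.squeeze(-1), torch.randint(0, 2, (8,)).float())

    The classification loss backpropagates through all three subnetworks, so the audio and video encoders are forced to learn spatial cues even though no labels beyond the synthetic misalignment are needed.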

    Motion model refinement based on contact analysis and optimization

Publication Number: US11238634B2

Publication Date: 2022-02-01

Application Number: US16860411

Application Date: 2020-04-28

    Applicant: Adobe Inc.

    Abstract: In some embodiments, a motion model refinement system receives an input video depicting a human character and an initial motion model describing motions of individual joint points of the human character in a three-dimensional space. The motion model refinement system identifies foot joint points of the human character that are in contact with a ground plane using a trained contact estimation model. The motion model refinement system determines the ground plane based on the foot joint points and the initial motion model and constructs an optimization problem for refining the initial motion model. The optimization problem minimizes the difference between the refined motion model and the initial motion model under a set of plausibility constraints including constraints on the contact foot joint points and a time-dependent inertia tensor-based constraint. The motion model refinement system obtains the refined motion model by solving the optimization problem.
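
    A simplified penalty-form sketch of such an optimization in PyTorch: it keeps the data term and the foot-contact plausibility constraints (contact feet stay on the ground plane and do not slide) but omits the time-dependent inertia-tensor constraint; all shapes, weights, and names are assumptions:

        import torch

        def refine_motion(init_joints, contact_mask, ground_y, steps=200, lam=10.0):
            """init_joints: (T, J, 3) per-frame 3D joint positions from the
            initial motion model. contact_mask: (T, J) bool, True where a foot
            joint was estimated to touch the ground plane at height ground_y."""
            joints = init_joints.clone().requires_grad_(True)
            opt = torch.optim.Adam([joints], lr=1e-2)
            for _ in range(steps):
                opt.zero_grad()
                # stay close to the initial motion model
                data_term = ((joints - init_joints) ** 2).mean()
                # contact feet must lie on the ground plane ...
                on_ground = ((joints[..., 1] - ground_y) ** 2 * contact_mask).mean()
                # ... and must not slide between consecutive contact frames
                slide = (((joints[1:] - joints[:-1]) ** 2).sum(-1)
                         * (contact_mask[1:] & contact_mask[:-1])).mean()
                loss = data_term + lam * (on_ground + slide)
                loss.backward()
                opt.step()
            return joints.detach()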

    TRANSCRIPT-BASED INSERTION OF SECONDARY VIDEO CONTENT INTO PRIMARY VIDEO CONTENT

Publication Number: US20200273493A1

Publication Date: 2020-08-27

Application Number: US16281903

Application Date: 2019-02-21

    Applicant: Adobe Inc.

    Abstract: Certain embodiments involve transcript-based techniques for facilitating insertion of secondary video content into primary video content. For instance, a video editor presents a video editing interface having a primary video section displaying a primary video, a text-based navigation section having navigable portions of a primary video transcript, and a secondary video menu section displaying candidate secondary videos. In some embodiments, candidate secondary videos are obtained by using target terms detected in the transcript to query a remote data source for the candidate secondary videos. In embodiments involving video insertion, the video editor identifies a portion of the primary video corresponding to a portion of the transcript selected within the text-based navigation section. The video editor inserts a secondary video, which is selected from the candidate secondary videos based on an input received at the secondary video menu section, at the identified portion of the primary video.
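
    To make the transcript-to-timeline mapping concrete, here is a toy sketch in Python assuming word-level timestamps are available for the primary video transcript; the data format and function names are hypothetical:

        # Hypothetical word-timestamp format: (word, start_sec, end_sec).
        transcript = [("solar", 4.2, 4.6), ("panels", 4.6, 5.1), ("convert", 5.1, 5.6)]

        def selection_to_time_range(transcript, first_word_idx, last_word_idx):
            """Map a span selected in the text-based navigation section to the
            corresponding portion of the primary video's timeline."""
            start = transcript[first_word_idx][1]
            end = transcript[last_word_idx][2]
            return start, end

        # The secondary video chosen from the menu is inserted at the start of
        # the selected span of the primary video.
        insert_at, _ = selection_to_time_range(transcript, 0, 1)   # -> 4.2 s

    The same word list supplies the target terms (e.g., "solar panels") used to query the remote data source for candidate secondary videos.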

    Generating action tags for digital videos

Publication Number: US11949964B2

Publication Date: 2024-04-02

Application Number: US17470441

Application Date: 2021-09-09

    Applicant: Adobe Inc.

    CPC classification number: H04N21/8133 G06N3/08 G06V20/46 H04N21/8456

Abstract: Systems, methods, and non-transitory computer-readable media are disclosed for automatic tagging of videos. In particular, in one or more embodiments, the disclosed systems generate a set of tagged feature vectors (e.g., tagged feature vectors based on action-rich digital videos) to utilize in generating tags for an input digital video. For instance, the disclosed systems can extract a set of frames from the input digital video and generate feature vectors from the set of frames. In some embodiments, the disclosed systems generate aggregated feature vectors from the feature vectors. Furthermore, the disclosed systems can utilize the feature vectors (or aggregated feature vectors) to identify similar tagged feature vectors from the set of tagged feature vectors. Additionally, the disclosed systems can generate a set of tags for the input digital video by aggregating one or more tags corresponding to the identified similar tagged feature vectors.
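
    A small NumPy sketch of the retrieval-and-aggregation step, under the assumption that similarity is cosine similarity and that retrieved tags are weighted by the similarity of the vectors they came from; both choices are illustrative, not specified by the abstract:

        import numpy as np

        def tags_for_video(query_vec, tagged_vecs, tag_sets, k=5):
            """tagged_vecs: (N, d) tagged feature vectors built from action-rich
            videos; tag_sets: list of N tag lists. Returns tags aggregated from
            the k most similar tagged feature vectors."""
            q = query_vec / np.linalg.norm(query_vec)
            m = tagged_vecs / np.linalg.norm(tagged_vecs, axis=1, keepdims=True)
            sims = m @ q                          # (N,) cosine similarities
            top = np.argsort(sims)[::-1][:k]      # indices of the k most similar
            scores = {}
            for i in top:
                for t in tag_sets[i]:
                    scores[t] = scores.get(t, 0.0) + sims[i]  # weight by similarity
            return sorted(scores, key=scores.get, reverse=True)

    Here query_vec would be the (aggregated) feature vector computed from the input video's frames, so tagging reduces to a nearest-neighbor lookup rather than training a per-tag classifier.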
