VIDEO GENERATION USING FRAME-WISE TOKEN EMBEDDINGS

    Publication number: US20250119624A1

    Publication date: 2025-04-10

    Application number: US18894443

    Application date: 2024-09-24

    Applicant: ADOBE INC.

    Abstract: A method, apparatus, non-transitory computer readable medium, and system for generating synthetic videos includes obtaining an input prompt describing a video scene. The embodiments then generate a plurality of frame-wise token embeddings corresponding, respectively, to a sequence of video frames based on the input prompt. Subsequently, embodiments generate, using a video generation model, a synthesized video depicting the video scene. The synthesized video includes a plurality of images corresponding to the sequence of video frames.
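The abstract above describes deriving one token embedding per video frame from a single input prompt. A minimal sketch of one way such frame-wise embeddings could be formed, assuming (purely for illustration; this is not the patented method) that a shared prompt embedding is combined with a sinusoidal frame-position code:

```python
import numpy as np

def frame_wise_token_embeddings(prompt_embedding, num_frames, dim):
    """Produce one embedding per frame by adding a sinusoidal
    frame-position code to a shared prompt embedding (illustrative scheme)."""
    positions = np.arange(num_frames)[:, None]               # (F, 1)
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))  # (dim/2,)
    angles = positions * freqs                               # (F, dim/2)
    pos_code = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return prompt_embedding[None, :] + pos_code              # (F, dim)

# 16 frame-wise embeddings conditioned on one 64-dim prompt embedding
emb = frame_wise_token_embeddings(np.zeros(64), num_frames=16, dim=64)
```

Each frame gets a distinct embedding, so a downstream video generation model can synthesize temporally ordered images from a single prompt.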

    STYLE-AWARE AUDIO-DRIVEN TALKING HEAD ANIMATION FROM A SINGLE IMAGE

    Publication number: US20220392131A1

    Publication date: 2022-12-08

    Application number: US17887685

    Application date: 2022-08-15

    Applicant: Adobe Inc.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.
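As a rough illustration of the windowed pipeline described above (successive audio windows driving displacements of template 3D facial landmarks), here is a numpy sketch; the RMS-energy `predict` stand-in replaces the trained style-aware network and is entirely hypothetical:

```python
import numpy as np

def animate_landmarks(template_landmarks, audio, win=400, hop=160, predict=None):
    """Slide a window over the input audio; for each window, predict a 3D
    displacement applied to the template landmarks (68 x 3), yielding one
    landmark frame per window."""
    if predict is None:
        # stand-in for a trained network: RMS energy drives a small
        # vertical (jaw-like) offset
        predict = lambda w: np.array([0.0, -np.sqrt(np.mean(w ** 2)), 0.0])
    frames = []
    for start in range(0, len(audio) - win + 1, hop):
        frames.append(template_landmarks + predict(audio[start:start + win]))
    return np.stack(frames)
```

In the actual system the predicted landmark sequence would then drive rendering of the animated head.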

    Utilizing generative models for resynthesis of transition frames in clipped digital videos

    Publication number: US12192593B2

    Publication date: 2025-01-07

    Application number: US18164348

    Application date: 2023-02-03

    Applicant: Adobe Inc.

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize machine learning to generate a sequence of transition frames for a gap in a clipped digital video. For example, the disclosed system receives a clipped digital video that includes a pre-cut frame prior to a gap in the clipped digital video and a post-cut frame following the gap in the clipped digital video. Moreover, the disclosed system utilizes a natural motion sequence model to generate a sequence of transition keypoint maps between the pre-cut frame and the post-cut frame. Furthermore, using a generative neural network, the disclosed system generates a sequence of transition frames for the gap in the clipped digital video from the sequence of transition keypoint maps.
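A toy version of the keypoint stage: here the natural motion sequence model is replaced by simple linear interpolation between pre-cut and post-cut keypoints (an assumption for illustration; the disclosed model learns natural motion rather than interpolating), after which a generative network would render each keypoint map into a transition frame:

```python
import numpy as np

def transition_keypoints(pre_kp, post_kp, num_frames):
    """Generate keypoint sets (K x 2) for the gap between the pre-cut and
    post-cut frames; the endpoints themselves are excluded since those
    frames already exist in the clipped video."""
    ts = np.linspace(0.0, 1.0, num_frames + 2)[1:-1]
    return np.stack([(1 - t) * pre_kp + t * post_kp for t in ts])

seq = transition_keypoints(np.zeros((17, 2)), np.ones((17, 2)), num_frames=4)
```

The resulting sequence moves monotonically from the pre-cut pose toward the post-cut pose, which is the structural role the keypoint maps play before rendering.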

    Style-aware audio-driven talking head animation from a single image

    Publication number: US11417041B2

    Publication date: 2022-08-16

    Application number: US16788551

    Application date: 2020-02-12

    Applicant: ADOBE INC.

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.

    Style-aware audio-driven talking head animation from a single image

    Publication number: US11776188B2

    Publication date: 2023-10-03

    Application number: US17887685

    Application date: 2022-08-15

    Applicant: Adobe Inc.

    CPC classification number: G06T13/205 G06T13/40 G06T17/20

    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.

    GENERATING THREE-DIMENSIONAL HUMAN MODELS REPRESENTING TWO-DIMENSIONAL HUMANS IN TWO-DIMENSIONAL IMAGES

    Publication number: US20240144520A1

    Publication date: 2024-05-02

    Application number: US18304144

    Application date: 2023-04-20

    Applicant: Adobe Inc.

    CPC classification number: G06T7/73 G06T2207/20084 G06T2207/30196

    Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify two-dimensional images via scene-based editing using three-dimensional representations of the two-dimensional images. For instance, in one or more embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and modify shadows in the two-dimensional images according to various shadow maps. Additionally, the disclosed systems utilize three-dimensional representations of two-dimensional images to modify humans in the two-dimensional images. The disclosed systems also utilize three-dimensional representations of two-dimensional images to provide scene scale estimation via scale fields of the two-dimensional images. In some embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and visualize 3D planar surfaces for modifying objects in two-dimensional images. The disclosed systems further use three-dimensional representations of two-dimensional images to customize focal points for the two-dimensional images.
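One concrete piece of the toolkit above is scene scale estimation via a scale field. Under a pinhole camera assumption (a simplification for illustration, not necessarily the disclosed system), the metric size covered by one pixel at depth d, with focal length f in pixels, is d / f:

```python
import numpy as np

def scale_field(depth_map, focal_px):
    """Per-pixel metric scale (meters per pixel) from a depth map,
    using the pinhole relation: scale = depth / focal length."""
    return depth_map / focal_px

# A surface 2 m from a camera with a 1000-px focal length: each pixel
# covers 0.002 m, so a 1.7 m person would span about 850 pixels.
field = scale_field(np.full((4, 4), 2.0), focal_px=1000.0)
```

Such a field lets an editor size inserted objects (e.g. a human model) consistently with the rest of the scene.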
