-
Publication No.: US20250119624A1
Publication Date: 2025-04-10
Application No.: US18894443
Filing Date: 2024-09-24
Applicant: ADOBE INC.
Inventor: Seoung Wug Oh , Mingi Kwon , Joon-Young Lee , Yang Zhou , Difan Liu , Haoran Cai , Baqiao Liu , Feng Liu
IPC: H04N21/81
Abstract: A method, apparatus, non-transitory computer readable medium, and system for generating synthetic videos includes obtaining an input prompt describing a video scene. Embodiments then generate a plurality of frame-wise token embeddings corresponding respectively to a sequence of video frames, based on the input prompt. Subsequently, embodiments generate, using a video generation model, a synthesized video depicting the video scene. The synthesized video includes a plurality of images corresponding to the sequence of video frames.
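The abstract's pipeline can be sketched as: encode the prompt once, then derive one time-conditioned token embedding per output frame. This is a minimal illustrative sketch; the toy hash-based embedding and all function names are assumptions, not Adobe's implementation.

```python
# Hypothetical sketch: one token embedding per output frame, derived from a
# single text prompt. The hash-based "encoder" is a stand-in for a real
# text encoder; the video generation model itself is omitted.
import hashlib

EMBED_DIM = 8

def embed_prompt(prompt: str) -> list[float]:
    """Toy deterministic text embedding (stands in for a text encoder)."""
    digest = hashlib.sha256(prompt.encode()).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]

def frame_wise_embeddings(prompt: str, num_frames: int) -> list[list[float]]:
    """One embedding per frame: the prompt embedding conditioned on a
    normalized time index, so each frame receives a distinct token."""
    base = embed_prompt(prompt)
    frames = []
    for t in range(num_frames):
        phase = t / max(num_frames - 1, 1)  # 0.0 for first frame, 1.0 for last
        frames.append([x + phase for x in base])
    return frames

tokens = frame_wise_embeddings("a dog running on a beach", num_frames=16)
```

A real system would feed these per-frame tokens to the video generation model as conditioning; here they simply demonstrate the one-embedding-per-frame structure the abstract describes.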
-
Publication No.: US11875442B1
Publication Date: 2024-01-16
Application No.: US17829120
Filing Date: 2022-05-31
Applicant: Adobe Inc. , University of Massachusetts
Inventor: Matthew David Fisher , Zhan Xu , Yang Zhou , Deepali Aneja , Evangelos Kalogerakis
IPC: G06T13/80 , G06V10/762 , G06T13/40 , G06V10/774 , G06T7/246 , G06V10/77
CPC classification number: G06T13/80 , G06T7/251 , G06T13/40 , G06V10/762 , G06V10/7715 , G06V10/7747
Abstract: Embodiments are disclosed for articulated part extraction using images of animated characters from sprite sheets by a digital design system. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including a plurality of images depicting an animated character in different poses. The disclosed systems and methods further comprise, for each pair of images in the plurality of images, determining, by a first machine learning model, pixel correspondences between pixels of the pair of images, and determining, by a second machine learning model, pixel clusters representing the animated character, each pixel cluster corresponding to a different structural segment of the animated character. The disclosed systems and methods further comprise selecting a subset of clusters that reconstructs the different poses of the animated character. The disclosed systems and methods further comprise creating a rigged animated character based on the selected subset of clusters.
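The clustering step above rests on a simple intuition: pixels that move together across two poses likely belong to the same rigid part. The greedy grouping below is an illustrative stand-in for the patented learned models; the tolerance value and data shapes are assumptions.

```python
# Hypothetical sketch of motion-based part clustering: group pixels whose
# displacement between two poses is nearly identical, so each group
# approximates one rigid structural segment of the character.
def cluster_by_motion(displacements, tol=1.0):
    """Greedy clustering of per-pixel (dx, dy) displacements: a pixel joins
    the first cluster whose representative displacement is within `tol`."""
    clusters = []  # each entry: (representative displacement, member indices)
    for idx, (dx, dy) in enumerate(displacements):
        for rep, members in clusters:
            if abs(rep[0] - dx) <= tol and abs(rep[1] - dy) <= tol:
                members.append(idx)
                break
        else:
            clusters.append(((dx, dy), [idx]))
    return [members for _, members in clusters]

# Two pixels translate by roughly (5, 0) (one part); one by (0, 9) (another).
parts = cluster_by_motion([(5.0, 0.2), (5.3, 0.0), (0.0, 9.0)])
```

In the patented system this grouping is performed by a trained model over learned pixel correspondences, and a subset of the resulting clusters is then selected to reconstruct all input poses.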
-
Publication No.: US20220392131A1
Publication Date: 2022-12-08
Application No.: US17887685
Filing Date: 2022-08-15
Applicant: Adobe Inc.
Inventor: Dingzeyu Li , Yang Zhou , Jose Ignacio Echevarria Vallespi , Elya Shechtman
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.
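The sliding-window structure the abstract describes can be sketched as: split the audio into successive windows and predict one set of 3D landmarks per window from the template landmarks. The stand-in "predictor" below is purely illustrative (a real system uses a trained neural network); all names and parameters are assumptions.

```python
# Hypothetical sketch: successive audio windows each drive one predicted
# set of 3D facial landmarks, starting from template landmarks.
def audio_windows(samples, window, hop):
    """Split an audio signal into successive overlapping windows, one per
    predicted landmark set (beat: the network itself is omitted)."""
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, hop)]

def predict_landmarks(template, window):
    """Stand-in predictor: displaces each template landmark by the window's
    mean amplitude (a real model would regress the displacement)."""
    energy = sum(window) / len(window)
    return [(x + energy, y, z) for (x, y, z) in template]

template = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.5)]
frames = [predict_landmarks(template, w)
          for w in audio_windows([0.1] * 100, window=20, hop=10)]
```

Each element of `frames` would then drive one frame of the talking-head animation.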
-
Publication No.: US12192593B2
Publication Date: 2025-01-07
Application No.: US18164348
Filing Date: 2023-02-03
Applicant: Adobe Inc.
Inventor: Xiaojuan Wang , Richard Zhang , Taesung Park , Yang Zhou , Elya Shechtman
IPC: H04N21/234 , G06V10/771 , G06V10/82 , H04N21/81
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize machine learning to generate a sequence of transition frames for a gap in a clipped digital video. For example, the disclosed system receives a clipped digital video that includes a pre-cut frame prior to a gap in the clipped digital video and a post-cut frame following the gap in the clipped digital video. Moreover, the disclosed system utilizes a natural motion sequence model to generate a sequence of transition keypoint maps between the pre-cut frame and the post-cut frame. Furthermore, using a generative neural network, the disclosed system generates a sequence of transition frames for the gap in the clipped digital video from the sequence of transition keypoint maps.
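The keypoint-map step above can be illustrated with the simplest possible stand-in: linear interpolation of keypoints between the pre-cut and post-cut frames. The patented natural-motion sequence model learns far richer trajectories; this sketch and its names are assumptions for illustration only.

```python
# Hypothetical sketch: interpolate keypoints between the pre-cut and
# post-cut frames to produce one keypoint map per transition frame.
def transition_keypoints(pre, post, num_frames):
    """Linear interpolation between pre-cut and post-cut keypoints, as a
    stand-in for the learned natural motion sequence model."""
    seq = []
    for t in range(1, num_frames + 1):
        alpha = t / (num_frames + 1)  # strictly between 0 and 1
        seq.append([((1 - alpha) * x0 + alpha * x1,
                     (1 - alpha) * y0 + alpha * y1)
                    for (x0, y0), (x1, y1) in zip(pre, post)])
    return seq

pre = [(0.0, 0.0)]   # keypoints in the pre-cut frame
post = [(4.0, 8.0)]  # keypoints in the post-cut frame
maps = transition_keypoints(pre, post, num_frames=3)
```

In the patented system, each keypoint map would then condition a generative neural network that synthesizes the corresponding transition frame.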
-
Publication No.: US20240267597A1
Publication Date: 2024-08-08
Application No.: US18164348
Filing Date: 2023-02-03
Applicant: Adobe Inc.
Inventor: Xiaojuan Wang , Richard Zhang , Taesung Park , Yang Zhou , Elya Shechtman
IPC: H04N21/81 , G06V10/771 , G06V10/82 , H04N21/234
CPC classification number: H04N21/8153 , G06V10/771 , G06V10/82 , H04N21/23424
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize machine learning to generate a sequence of transition frames for a gap in a clipped digital video. For example, the disclosed system receives a clipped digital video that includes a pre-cut frame prior to a gap in the clipped digital video and a post-cut frame following the gap in the clipped digital video. Moreover, the disclosed system utilizes a natural motion sequence model to generate a sequence of transition keypoint maps between the pre-cut frame and the post-cut frame. Furthermore, using a generative neural network, the disclosed system generates a sequence of transition frames for the gap in the clipped digital video from the sequence of transition keypoint maps.
-
Publication No.: US11682238B2
Publication Date: 2023-06-20
Application No.: US17175441
Filing Date: 2021-02-12
Applicant: Adobe Inc.
Inventor: Jimei Yang , Deepali Aneja , Dingzeyu Li , Jun Saito , Yang Zhou
IPC: G06V40/20 , G06T7/215 , G06V20/40 , G06V40/10 , H04N5/06 , H04N21/8547 , G11B27/031 , G10H1/36 , G11B27/10 , H04N21/845
CPC classification number: G06V40/23 , G06T7/215 , G06V20/41 , G06V20/46 , G06V40/103 , H04N5/06 , H04N21/8456 , H04N21/8547
Abstract: Embodiments are disclosed for re-timing a video sequence to an audio sequence based on the detection of motion beats in the video sequence and audio beats in the audio sequence. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a first input, the first input including a video sequence, detecting motion beats in the video sequence, receiving a second input, the second input including an audio sequence, detecting audio beats in the audio sequence, modifying the video sequence by matching the detected motion beats in the video sequence to the detected audio beats in the audio sequence, and outputting the modified video sequence.
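The matching step above can be sketched as a beat-aligned time warp: pair each motion beat with a nearby audio beat, then remap video timestamps piecewise-linearly between paired beats. Beat detection itself (onset or motion-peak picking) is omitted, and the pairing rule is an assumption for illustration.

```python
# Hypothetical sketch: build a time warp that aligns detected motion beats
# (video timestamps) with detected audio beats (audio timestamps).
def retime_map(motion_beats, audio_beats):
    """Pair each motion beat with the nearest audio beat, yielding anchor
    points (video_time, audio_time) for a piecewise-linear warp."""
    pairs = []
    for m in motion_beats:
        nearest = min(audio_beats, key=lambda a: abs(a - m))
        pairs.append((m, nearest))
    return pairs

def warp_time(t, pairs):
    """Map a video timestamp through the warp: linear interpolation between
    consecutive paired beats; timestamps outside all pairs pass through."""
    for (m0, a0), (m1, a1) in zip(pairs, pairs[1:]):
        if m0 <= t <= m1:
            alpha = (t - m0) / (m1 - m0)
            return a0 + alpha * (a1 - a0)
    return t

pairs = retime_map([1.0, 2.0, 3.0], [0.9, 2.1, 3.0])
```

Retiming the video then amounts to resampling each frame at its warped timestamp, so every motion beat lands exactly on an audio beat.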
-
Publication No.: US20240144623A1
Publication Date: 2024-05-02
Application No.: US18304147
Filing Date: 2023-04-20
Applicant: Adobe Inc.
Inventor: Giorgio Gori , Yi Zhou , Yangtuanfeng Wang , Yang Zhou , Krishna Kumar Singh , Jae Shin Yoon , Duygu Ceylan Aksit
CPC classification number: G06T19/20 , G06T7/70 , G06T15/00 , G06T17/00 , G06T2200/24 , G06T2207/20084 , G06T2207/30196 , G06T2207/30244 , G06T2219/2004
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify two-dimensional images via scene-based editing using three-dimensional representations of the two-dimensional images. For instance, in one or more embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and modify shadows in the two-dimensional images according to various shadow maps. Additionally, the disclosed systems utilize three-dimensional representations of two-dimensional images to modify humans in the two-dimensional images. The disclosed systems also utilize three-dimensional representations of two-dimensional images to provide scene scale estimation via scale fields of the two-dimensional images. In some embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and visualize 3D planar surfaces for modifying objects in two-dimensional images. The disclosed systems further use three-dimensional representations of two-dimensional images to customize focal points for the two-dimensional images.
-
Publication No.: US11417041B2
Publication Date: 2022-08-16
Application No.: US16788551
Filing Date: 2020-02-12
Applicant: ADOBE INC.
Inventor: Dingzeyu Li , Yang Zhou , Jose Ignacio Echevarria Vallespi , Elya Shechtman
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.
-
Publication No.: US11776188B2
Publication Date: 2023-10-03
Application No.: US17887685
Filing Date: 2022-08-15
Applicant: Adobe Inc.
Inventor: Dingzeyu Li , Yang Zhou , Jose Ignacio Echevarria Vallespi , Elya Shechtman
CPC classification number: G06T13/205 , G06T13/40 , G06T17/20
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for generating an animation of a talking head from an input audio signal of speech and a representation (such as a static image) of a head to animate. Generally, a neural network can learn to predict a set of 3D facial landmarks that can be used to drive the animation. In some embodiments, the neural network can learn to detect different speaking styles in the input speech and account for the different speaking styles when predicting the 3D facial landmarks. Generally, template 3D facial landmarks can be identified or extracted from the input image or other representation of the head, and the template 3D facial landmarks can be used with successive windows of audio from the input speech to predict 3D facial landmarks and generate a corresponding animation with plausible 3D effects.
-
Publication No.: US20240144520A1
Publication Date: 2024-05-02
Application No.: US18304144
Filing Date: 2023-04-20
Applicant: Adobe Inc.
Inventor: Giorgio Gori , Yi Zhou , Yangtuanfeng Wang , Yang Zhou , Krishna Kumar Singh , Jae Shin Yoon , Duygu Ceylan Aksit
IPC: G06T7/73
CPC classification number: G06T7/73 , G06T2207/20084 , G06T2207/30196
Abstract: The present disclosure relates to systems, methods, and non-transitory computer-readable media that modify two-dimensional images via scene-based editing using three-dimensional representations of the two-dimensional images. For instance, in one or more embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and modify shadows in the two-dimensional images according to various shadow maps. Additionally, the disclosed systems utilize three-dimensional representations of two-dimensional images to modify humans in the two-dimensional images. The disclosed systems also utilize three-dimensional representations of two-dimensional images to provide scene scale estimation via scale fields of the two-dimensional images. In some embodiments, the disclosed systems utilize three-dimensional representations of two-dimensional images to generate and visualize 3D planar surfaces for modifying objects in two-dimensional images. The disclosed systems further use three-dimensional representations of two-dimensional images to customize focal points for the two-dimensional images.
-