-
Publication No.: US12061668B2
Publication Date: 2024-08-13
Application No.: US17466636
Application Date: 2021-09-03
Applicant: ADOBE INC.
Inventor: Simon Jenni , Hailin Jin
IPC: G06F18/213 , G06F18/214 , G06F18/2413 , G06N3/045 , G06N3/08
CPC classification number: G06F18/213 , G06F18/214 , G06F18/2413 , G06N3/045 , G06N3/08
Abstract: The disclosed invention includes systems and methods for training and employing equivariant models for generating representations (e.g., vector representations) of temporally-varying content, such as but not limited to video content. The trained models are equivariant to temporal transformations applied to the input content (e.g., video content). The trained models are additionally invariant to non-temporal transformations (e.g., spatial and/or color-space transformations) applied to the input content. Such representations are employed in various machine learning tasks, such as but not limited to video retrieval (e.g., video search engine applications), identification of actions depicted in video, and temporally ordering clips of the video.
-
Publication No.: US20240419726A1
Publication Date: 2024-12-19
Application No.: US18210535
Application Date: 2023-06-15
Applicant: Adobe Inc.
Inventor: Simon Jenni , Fabian David Caba Heilbron , Chun-Hsiao Yeh , Bryan Russell , Josef Sivic
IPC: G06F16/58 , G06F16/535 , G06F16/538
Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.
-
Publication No.: US20230075087A1
Publication Date: 2023-03-09
Application No.: US17466636
Application Date: 2021-09-03
Applicant: ADOBE INC.
Inventor: Simon Jenni , Hailin Jin
Abstract: The disclosed invention includes systems and methods for training and employing equivariant models for generating representations (e.g., vector representations) of temporally-varying content, such as but not limited to video content. The trained models are equivariant to temporal transformations applied to the input content (e.g., video content). The trained models are additionally invariant to non-temporal transformations (e.g., spatial and/or color-space transformations) applied to the input content. Such representations are employed in various machine learning tasks, such as but not limited to video retrieval (e.g., video search engine applications), identification of actions depicted in video, and temporally ordering clips of the video.
-
Publication No.: US12112771B2
Publication Date: 2024-10-08
Application No.: US18185137
Application Date: 2023-03-16
Applicant: Adobe Inc.
Inventor: Simon Jenni , Markus Woodson , Fabian David Caba Heilbron
IPC: G11B27/00 , H04N21/234 , H04N21/2343 , H04N21/24
CPC classification number: G11B27/005 , H04N21/23418 , H04N21/234381 , H04N21/2402
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fit within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed varying digital video that preserves natural video dynamics while fitting the target video duration.
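As a rough illustration of the sub-sampling step this abstract describes, the sketch below picks a fixed number of frames from per-frame slowness scores. It is not the patented method: the function name `subsample_frames`, the convention that a higher slowness score means a frame can be skipped more aggressively, and the inverse-weight sampling rule are all assumptions made for the example.

```python
from bisect import bisect_left
from itertools import accumulate


def subsample_frames(slowness, target_len):
    """Pick `target_len` source-frame indices, skipping more where slowness is high.

    `slowness` is a hypothetical list of positive per-frame scores.
    """
    if target_len <= 0 or not slowness:
        return []
    if any(s <= 0 for s in slowness):
        raise ValueError("slowness scores must be positive")

    # Assumption: a frame's share of output time is inversely
    # proportional to its slowness score.
    weights = [1.0 / s for s in slowness]
    cdf = list(accumulate(weights))
    total = cdf[-1]

    indices = []
    for k in range(target_len):
        # Midpoint of the k-th of `target_len` equal slices of remapped time.
        t = (k + 0.5) * total / target_len
        # First source frame whose cumulative weight reaches t.
        indices.append(min(bisect_left(cdf, t), len(slowness) - 1))
    return indices


if __name__ == "__main__":
    # Uniform slowness gives regular sub-sampling.
    print(subsample_frames([1] * 8, 4))
    # The slow second half is skipped more aggressively.
    print(subsample_frames([1, 1, 1, 1, 4, 4, 4, 4], 4))
```

The returned indices are non-decreasing, so the selected frames keep their original order; the output length equals the target frame count by construction.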
-
Publication No.: US12081827B2
Publication Date: 2024-09-03
Application No.: US17822573
Application Date: 2022-08-26
Applicant: Adobe Inc. , University of Surrey
Inventor: Alexander Black , Van Tu Bui , John Collomosse , Simon Jenni , Viswanathan Swaminathan
IPC: H04N21/434 , G06F16/732 , G06F16/78 , H04N21/84 , H04N21/845
CPC classification number: H04N21/4341 , G06F16/732 , G06F16/7867 , H04N21/84 , H04N21/8456
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize deep learning to map query videos to known videos so as to identify a provenance of the query video or identify editorial manipulations of the query video relative to a known video. For example, the video comparison system includes a deep video comparator model that generates and compares visual and audio descriptors utilizing codewords and an inverse index. The deep video comparator model is robust and ignores discrepancies due to benign transformations that commonly occur during electronic video distribution.
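To make the codeword and inverse-index idea in this abstract concrete, here is a minimal sketch of an inverted index that maps codeword IDs to known videos and ranks candidates by shared codewords. It is only an illustration: the names `build_inverse_index` and `rank_matches`, the integer codeword IDs, and the simple vote-counting score are assumptions, and the sketch omits the descriptor extraction and quantization that would produce the codewords.

```python
from collections import Counter, defaultdict


def build_inverse_index(known_videos):
    """Map each codeword ID to the set of known video IDs that contain it.

    `known_videos` is a hypothetical dict: video ID -> iterable of codeword IDs.
    """
    index = defaultdict(set)
    for video_id, codewords in known_videos.items():
        for codeword in codewords:
            index[codeword].add(video_id)
    return index


def rank_matches(index, query_codewords):
    """Rank known videos by how many distinct query codewords they share."""
    votes = Counter()
    for codeword in set(query_codewords):
        for video_id in index.get(codeword, ()):
            votes[video_id] += 1
    return votes.most_common()


if __name__ == "__main__":
    known = {"video_a": [1, 2, 3], "video_b": [3, 4]}
    index = build_inverse_index(known)
    print(rank_matches(index, [2, 3, 9]))
```

Looking up only the codewords present in the query means the cost scales with the query size and the posting-list lengths rather than with the number of known videos.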
-
Publication No.: US20240073478A1
Publication Date: 2024-02-29
Application No.: US17822573
Application Date: 2022-08-26
Applicant: Adobe Inc. , University of Surrey
Inventor: Alexander Black , Van Tu Bui , John Collomosse , Simon Jenni , Viswanathan Swaminathan
IPC: H04N21/434 , G06F16/732 , G06F16/78 , H04N21/84 , H04N21/845
CPC classification number: H04N21/4341 , G06F16/732 , G06F16/7867 , H04N21/84 , H04N21/8456
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize deep learning to map query videos to known videos so as to identify a provenance of the query video or identify editorial manipulations of the query video relative to a known video. For example, the video comparison system includes a deep video comparator model that generates and compares visual and audio descriptors utilizing codewords and an inverse index. The deep video comparator model is robust and ignores discrepancies due to benign transformations that commonly occur during electronic video distribution.
-
Publication No.: US20240430515A1
Publication Date: 2024-12-26
Application No.: US18822424
Application Date: 2024-09-02
Applicant: Adobe, Inc. , University of Surrey
Inventor: Alexander Black , Van Tu Bui , John Collomosse , Simon Jenni , Viswanathan Swaminathan
IPC: H04N21/434 , G06F16/732 , G06F16/78 , H04N21/84 , H04N21/845
Abstract: The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize deep learning to map query videos to known videos so as to identify a provenance of the query video or identify editorial manipulations of the query video relative to a known video. For example, the video comparison system includes a deep video comparator model that generates and compares visual and audio descriptors utilizing codewords and an inverse index. The deep video comparator model is robust and ignores discrepancies due to benign transformations that commonly occur during electronic video distribution.
-
Publication No.: US20230276084A1
Publication Date: 2023-08-31
Application No.: US18185137
Application Date: 2023-03-16
Applicant: Adobe Inc.
Inventor: Simon Jenni , Markus Woodson , Fabian David Caba Heilbron
IPC: H04N21/2343 , H04N21/234 , H04N21/24
CPC classification number: H04N21/234381 , H04N21/23418 , H04N21/2402
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fit within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed varying digital video that preserves natural video dynamics while fitting the target video duration.
-
Publication No.: US11610606B1
Publication Date: 2023-03-21
Application No.: US17652586
Application Date: 2022-02-25
Applicant: Adobe Inc.
Inventor: Simon Jenni , Markus Woodson , Fabian David Caba Heilbron
Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fit within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed varying digital video that preserves natural video dynamics while fitting the target video duration.