Equivariant models for generating vector representations of temporally-varying content

    Publication Number: US12061668B2

    Publication Date: 2024-08-13

    Application Number: US17466636

    Application Date: 2021-09-03

    Applicant: ADOBE INC.

    CPC classification number: G06F18/213 G06F18/214 G06F18/2413 G06N3/045 G06N3/08

    Abstract: The disclosed invention includes systems and methods for training and employing equivariant models for generating representations (e.g., vector representations) of temporally-varying content, such as but not limited to video content. The trained models are equivariant to temporal transformations applied to the input content (e.g., video content). The trained models are additionally invariant to non-temporal transformations (e.g., spatial and/or color-space transformations) applied to the input content. Such representations are employed in various machine learning tasks, such as but not limited to video retrieval (e.g., video search engine applications), identification of actions depicted in video, and temporally ordering clips of the video.
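
    The abstract above describes embeddings that are equivariant to temporal transformations but invariant to spatial and color ones. As a rough illustration only (PyTorch assumed; the encoder, transform functions, and loss below are hypothetical stand-ins, not the patented method), a training objective with both properties might be sketched as follows.

```python
# Hypothetical sketch: a loss encouraging temporal equivariance and
# spatial/color invariance for video embeddings. Not the patented method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VideoEncoder(nn.Module):
    """Toy 3D-conv encoder mapping a clip (B, C, T, H, W) to a unit vector."""
    def __init__(self, dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(32, dim)

    def forward(self, clip):
        return F.normalize(self.proj(self.backbone(clip)), dim=-1)

def temporal_shift(clip, shift):
    """A temporal transformation: roll frames along the time axis."""
    return torch.roll(clip, shifts=shift, dims=2)

def color_jitter(clip, scale=0.1):
    """A non-temporal transformation: small per-channel brightness change."""
    noise = 1.0 + scale * torch.randn(clip.size(0), 3, 1, 1, 1)
    return clip * noise

def equivariance_invariance_loss(encoder, shift_head, clip, shift):
    z_jittered = encoder(color_jitter(clip))          # spatially perturbed view
    z_shifted = encoder(temporal_shift(clip, shift))  # temporally perturbed view
    # Invariance: color/spatial changes should not move the embedding.
    inv_loss = 1.0 - F.cosine_similarity(z_jittered, encoder(clip)).mean()
    # Equivariance: the applied temporal shift should be recoverable
    # from the pair of embeddings.
    pred_shift = shift_head(torch.cat([z_jittered, z_shifted], dim=-1))
    target = torch.full((clip.size(0),), float(shift))
    eqv_loss = F.mse_loss(pred_shift.squeeze(-1), target)
    return inv_loss + eqv_loss

# Usage with random data.
encoder, shift_head = VideoEncoder(), nn.Linear(2 * 128, 1)
clip = torch.randn(4, 3, 16, 32, 32)  # (B, C, T, H, W)
loss = equivariance_invariance_loss(encoder, shift_head, clip, shift=3)
```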

    Learning to Personalize Vision-Language Models through Meta-Personalization

    Publication Number: US20240419726A1

    Publication Date: 2024-12-19

    Application Number: US18210535

    Application Date: 2023-06-15

    Applicant: Adobe Inc.

    Abstract: Techniques for learning to personalize vision-language models through meta-personalization are described. In one embodiment, one or more processing devices lock a pre-trained vision-language model (VLM) during a training phase. The processing devices train the pre-trained VLM to augment a text encoder of the pre-trained VLM with a set of general named video instances to form a meta-personalized VLM, the meta-personalized VLM to include global category features. The processing devices test the meta-personalized VLM to adapt the text encoder with a set of personal named video instances to form a personal VLM, the personal VLM comprising the global category features personalized with a set of personal instance weights to form a personal instance token associated with the user. Other embodiments are described and claimed.
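
    To make the two-phase idea above more concrete, here is a minimal, hypothetical sketch (PyTorch assumed; all names and the mixture formulation are illustrative stand-ins, not the claimed method) of forming a personal instance token as learned weights over frozen global category features while the backbone stays locked.

```python
# Hypothetical sketch: a personal instance token built as a learned mixture
# of frozen "global category" features, with the VLM itself left locked.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PersonalToken(nn.Module):
    def __init__(self, global_category_features):
        super().__init__()
        # (num_categories, embed_dim): learned during meta-personalization on
        # general named video instances, then frozen for the personal phase.
        self.register_buffer("globals", global_category_features)
        # Per-user instance weights: the only parameters trained at test time.
        self.instance_weights = nn.Parameter(
            torch.zeros(global_category_features.size(0)))

    def forward(self):
        # Personal instance token = convex combination of global features.
        w = F.softmax(self.instance_weights, dim=0)
        return w @ self.globals  # (embed_dim,)

# Usage: the resulting embedding would be spliced into the locked text
# encoder's prompt, e.g. "a video of <personal-token> skateboarding",
# and the instance weights optimized against a handful of user videos.
embed_dim, num_categories = 512, 400
global_feats = torch.randn(num_categories, embed_dim)  # stand-in features
personal_embedding = PersonalToken(global_feats)()
```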

    EQUIVARIANT MODELS FOR GENERATING VECTOR REPRESENTATIONS OF TEMPORALLY-VARYING CONTENT

    Publication Number: US20230075087A1

    Publication Date: 2023-03-09

    Application Number: US17466636

    Application Date: 2021-09-03

    Applicant: ADOBE INC.

    Abstract: The disclosed invention includes systems and methods for training and employing equivariant models for generating representations (e.g., vector representations) of temporally-varying content, such as but not limited to video content. The trained models are equivariant to temporal transformations applied to the input content (e.g., video content). The trained models are additionally invariant to non-temporal transformations (e.g., spatial and/or color-space transformations) applied to the input content. Such representations are employed in various machine learning tasks, such as but not limited to video retrieval (e.g., video search engine applications), identification of actions depicted in video, and temporally ordering clips of the video.

    Retiming digital videos utilizing deep learning

    Publication Number: US12112771B2

    Publication Date: 2024-10-08

    Application Number: US18185137

    Application Date: 2023-03-16

    Applicant: Adobe Inc.

    CPC classification number: G11B27/005 H04N21/23418 H04N21/234381 H04N21/2402

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fits within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed-varying digital video that preserves natural video dynamics while fitting the target video duration.
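
    As a rough illustration of the re-timing idea (NumPy assumed; the slowness model is replaced by a stand-in array and the function is hypothetical, not the patented method), per-frame slowness scores can drive a non-uniform frame sub-sampling that fits a target frame count:

```python
# Hypothetical sketch: choose a temporal frame sub-sampling that keeps more
# frames where predicted slowness is high and fits a target duration.
import numpy as np

def retime_indices(slowness, target_frames):
    """Pick `target_frames` source-frame indices weighted by per-frame slowness.

    slowness: (num_frames,) nonnegative scores; in the disclosed approach these
              would come from a playback speed prediction model, here any
              array stands in for them.
    """
    weights = np.asarray(slowness, dtype=float)
    weights = weights / weights.sum()
    cdf = np.cumsum(weights)
    # Evenly spaced quantiles of the slowness mass -> denser sampling
    # (slower playback) where slowness is high, sparser elsewhere.
    quantiles = (np.arange(target_frames) + 0.5) / target_frames
    return np.searchsorted(cdf, quantiles)

# Example: a 300-frame source re-timed to 100 frames; the middle segment is
# predicted "slow", so it contributes proportionally more output frames.
slowness = np.concatenate([np.full(100, 0.5), np.full(100, 2.0), np.full(100, 0.5)])
indices = retime_indices(slowness, target_frames=100)
```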

    RETIMING DIGITAL VIDEOS UTILIZING DEEP LEARNING

    Publication Number: US20230276084A1

    Publication Date: 2023-08-31

    Application Number: US18185137

    Application Date: 2023-03-16

    Applicant: Adobe Inc.

    CPC classification number: H04N21/234381 H04N21/23418 H04N21/2402

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fits within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed-varying digital video that preserves natural video dynamics while fitting the target video duration.

    Retiming digital videos utilizing machine learning and temporally varying speeds

    Publication Number: US11610606B1

    Publication Date: 2023-03-21

    Application Number: US17652586

    Application Date: 2022-02-25

    Applicant: Adobe Inc.

    Abstract: This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fits within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed-varying digital video that preserves natural video dynamics while fitting the target video duration.
