Weakly supervised action selection learning in video

    Publication number: US12211274B2

    Publication date: 2025-01-28

    Application number: US17716996

    Filing date: 2022-04-08

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
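
As a rough illustration of how a class-specific score and a class-agnostic actionness score could be combined at inference time, the sketch below multiplies the two per segment and groups consecutive above-threshold segments into intervals. The abstract does not state a fusion rule, so the product, the `threshold` value, and the names `localize_actions`, `class_scores`, and `actionness` are all assumptions made for illustration, not the patented method.

```python
from typing import List, Tuple


def localize_actions(
    class_scores: List[List[float]],
    actionness: List[float],
    threshold: float = 0.5,
) -> List[Tuple[int, int, int]]:
    """Return (class_index, first_segment, last_segment) intervals.

    class_scores[t][c]: assumed probability that segment t shows class c.
    actionness[t]: assumed probability that segment t shows any action.
    The product fusion and the threshold are illustrative assumptions.
    """
    if len(class_scores) != len(actionness):
        raise ValueError("class_scores and actionness must cover the same segments")
    if not class_scores:
        return []

    num_classes = len(class_scores[0])
    intervals: List[Tuple[int, int, int]] = []
    for c in range(num_classes):
        start = None  # first segment of the interval currently being built
        for t, (scores, a) in enumerate(zip(class_scores, actionness)):
            active = scores[c] * a >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                intervals.append((c, start, t - 1))
                start = None
        if start is not None:
            # The interval runs to the end of the video.
            intervals.append((c, start, len(actionness) - 1))
    return intervals


# Segment 2 looks like class 0 to the classifier (for example, because of
# scene context) but has low actionness, so it is left out of the interval.
example_class_scores = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.2, 0.8]]
example_actionness = [0.9, 0.8, 0.1, 0.9]
example_intervals = localize_actions(example_class_scores, example_actionness)
```

In this toy input the third segment has a high class score but low actionness, which mirrors the over-reliance on context that the abstract says the actionness model is meant to reduce.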

    WEAKLY SUPERVISED ACTION SELECTION LEARNING IN VIDEO

    Publication number: US20250131718A1

    Publication date: 2025-04-24

    Application number: US18988381

    Filing date: 2024-12-19

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.

    PROGRAMMATIC ANALYSIS AND MONITORING OF PIPELINED MACHINE LEARNING PROCESSES IN DISTRIBUTED COMPUTING ENVIRONMENTS

    Publication number: US20240386326A1

    Publication date: 2024-11-21

    Application number: US18665288

    Filing date: 2024-05-15

    Abstract: The disclosed embodiments include computer-implemented processes and systems that establish configurable pipelines for training and deploying machine-learning processes in distributed computing environments. By way of example, an apparatus may execute sequentially a plurality of application engines within a training pipeline in accordance with first configuration data, and the executed application engines may cause the at least one processor to perform operations that train a machine-learning process based on corresponding ones of a plurality of partitioned datasets. Based on artifact data associated with the sequential execution of the application engines, the apparatus may generate elements of explainability data that characterize the training of the machine-learning process within the training pipeline and in accordance with second configuration data, and transmit the explainability data to a computing system. The computer system may generate at least a portion of the second configuration data.
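
The configuration-driven flow described in the abstract can be pictured with a toy sketch: engines run in the order given by one configuration object, each run leaves an artifact record, and a second configuration object selects which records go into a report. Every name here (`run_pipeline`, `build_report`, the `order`, `params`, and `include` keys, and the two stand-in engines) is hypothetical, and the report is only a filtered artifact log, not an actual explainability technique from the disclosure.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Engine = Callable[[State, Dict[str, Any]], State]


def run_pipeline(
    engines: Dict[str, Engine], config: Dict[str, Any]
) -> Tuple[State, List[Dict[str, Any]]]:
    """Run engines sequentially in config["order"] and log one artifact each."""
    state: State = {}
    artifacts: List[Dict[str, Any]] = []
    for name in config["order"]:
        params = config.get("params", {}).get(name, {})
        outputs = engines[name](state, params)
        state.update(outputs)  # later engines see earlier outputs
        artifacts.append(
            {"engine": name, "params": params, "outputs": sorted(outputs)}
        )
    return state, artifacts


def split_engine(state: State, params: Dict[str, Any]) -> State:
    """Partition the data into training and validation subsets."""
    data = params["data"]
    cut = int(len(data) * params["train_fraction"])
    return {"train": data[:cut], "validation": data[cut:]}


def train_engine(state: State, params: Dict[str, Any]) -> State:
    """Stand-in for training: the 'model' is the mean of the training subset."""
    train = state["train"]
    return {"model_mean": sum(train) / len(train)}


def build_report(
    artifacts: List[Dict[str, Any]], report_config: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep only the artifact records named in report_config["include"]."""
    wanted = set(report_config["include"])
    return [a for a in artifacts if a["engine"] in wanted]


training_config = {
    "order": ["split", "train"],
    "params": {"split": {"data": [1.0, 2.0, 3.0, 4.0], "train_fraction": 0.5}},
}
final_state, run_artifacts = run_pipeline(
    {"split": split_engine, "train": train_engine}, training_config
)
report = build_report(run_artifacts, {"include": ["train"]})
```

Keeping the execution order and the report selection in separate configuration objects loosely mirrors the abstract's distinction between first and second configuration data.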

    TEXT-CONDITIONED VIDEO REPRESENTATION
    Document type: published application (invention publication)

    Publication number: US20230351753A1

    Publication date: 2023-11-02

    Application number: US17894738

    Filing date: 2022-08-24

    CPC classification number: G06V20/47 G06V20/41

    Abstract: A text-video recommendation model determines relevance of a text to a video in a text-video pair (e.g., as a relevance score) with a text embedding and a text-conditioned video embedding. The text-conditioned video embedding is a representation of the video used for evaluating the relevance of the video to the text, where the representation itself is a function of the text it is evaluated for. As such, the input text may be used to weigh or attend to different frames of the video in determining the text-conditioned video embedding. The representation of the video may thus differ for different input texts for comparison. The text-conditioned video embedding may be determined in various ways, such as with a set of the most-similar frames to the input text (the top-k frames) or may be based on an attention function based on query, key, and value projections.
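
The two pooling options named in the abstract can be sketched in a few lines. The version below is a simplification under stated assumptions: frame and text embeddings are plain lists of floats, cosine similarity ranks the frames for the top-k variant, and the attention variant uses identity query, key, and value projections where a real model would use learned ones. The function names and the `temperature` parameter are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    norm_a, norm_b = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_sum(frames: Sequence[Vector], weights: Sequence[float]) -> List[float]:
    dim = len(frames[0])
    return [sum(w * f[i] for f, w in zip(frames, weights)) for i in range(dim)]


def topk_video_embedding(text: Vector, frames: Sequence[Vector], k: int) -> List[float]:
    """Average the k frames most similar to the text (cosine similarity)."""
    if not frames or k < 1:
        raise ValueError("need at least one frame and k >= 1")
    ranked = sorted(
        range(len(frames)), key=lambda i: cosine(text, frames[i]), reverse=True
    )[:k]
    chosen = set(ranked)
    weights = [1.0 / len(ranked) if i in chosen else 0.0 for i in range(len(frames))]
    return weighted_sum(frames, weights)


def attention_video_embedding(
    text: Vector, frames: Sequence[Vector], temperature: float = 1.0
) -> List[float]:
    """Softmax-weighted frame average; identity projections are an assumption."""
    if not frames:
        raise ValueError("need at least one frame")
    logits = [dot(text, f) / temperature for f in frames]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return weighted_sum(frames, [e / total for e in exps])


def relevance(text: Vector, frames: Sequence[Vector], k: int) -> float:
    """Score a text-video pair with the top-k text-conditioned embedding."""
    return cosine(text, topk_video_embedding(text, frames, k))
```

Because the frame weights depend on the text, the same list of frames yields a different video embedding for each query text, which is the property the abstract emphasizes.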

    Weakly Supervised Action Selection Learning in Video

    Publication number: US20220335718A1

    Publication date: 2022-10-20

    Application number: US17716996

    Filing date: 2022-04-08

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
