TABULAR DATA GENERATION
    Invention Application

    Publication No.: US20250124220A1

    Publication Date: 2025-04-17

    Application No.: US18911044

    Filing Date: 2024-10-09

    Abstract: A tabular data model, which may be pre-trained on a different data set, is used to generate data samples for a target class given a set of context data points. The tabular data model is trained to predict class membership of a given data point from a set of context data points. Rather than using the predicted class directly, the class predictions are used to determine a class-conditional energy for a synthetic data point with respect to the target class. The synthetic data point may then be updated based on the class-conditional energy with a stochastic update algorithm, such as stochastic gradient Langevin dynamics or Adaptive Moment Estimation with noise. The resulting value of the synthetic data point is then used as a sampled data point for the target class. This permits effective data augmentation of tabular data for downstream models.
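    The sampling loop described in the abstract can be sketched briefly. The following minimal PyTorch sketch assumes a hypothetical in-context tabular classifier with the interface model(context_x, context_y, query_x) returning class logits; the energy definition (negative log-probability of the target class) and all hyperparameters are illustrative assumptions, not the patented formulation.

```python
import torch

def sample_synthetic_point(model, context_x, context_y, target_class,
                           n_features, steps=200, step_size=1e-2):
    # Start from a random synthetic point and refine it with stochastic
    # gradient Langevin dynamics (SGLD) on a class-conditional energy.
    x = torch.randn(1, n_features, requires_grad=True)
    for _ in range(steps):
        # Hypothetical in-context classifier: class logits for the query
        # point x, conditioned on the labeled context set.
        logits = model(context_x, context_y, x)
        # One plausible class-conditional energy: the negative log-probability
        # that the model assigns the synthetic point to the target class.
        energy = -torch.log_softmax(logits, dim=-1)[0, target_class]
        grad, = torch.autograd.grad(energy, x)
        noise = torch.randn_like(x) * (2 * step_size) ** 0.5
        with torch.no_grad():
            x += -step_size * grad + noise   # SGLD update step
    return x.detach()   # final value used as a sample for the target class
```

    Running the sampler from several random initializations would yield multiple synthetic points for augmenting the target class.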

    Weakly supervised action selection learning in video

    Publication No.: US12211274B2

    Publication Date: 2025-01-28

    Application No.: US17716996

    Filing Date: 2022-04-08

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
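    As a rough illustration of how segment-level class scores and a class-agnostic actionness score might be fused at inference time, consider the following hypothetical sketch; the classifier and actionness_net interfaces, the geometric-mean fusion rule, and the threshold are illustrative assumptions rather than the claimed method.

```python
import torch

def localize_actions(classifier, actionness_net, segment_feats, threshold=0.5):
    # segment_feats: (num_segments, feat_dim) features for one video.
    class_probs = torch.softmax(classifier(segment_feats), dim=-1)   # (T, num_classes)
    actionness = torch.sigmoid(actionness_net(segment_feats))        # (T, 1): any-action score
    # A segment is localized for a class only when both the class score and
    # the class-agnostic actionness score are high (geometric-mean fusion).
    fused = (class_probs * actionness).sqrt()                        # (T, num_classes)
    scores, labels = fused.max(dim=-1)
    keep = (scores > threshold).nonzero().flatten().tolist()
    # Return (segment index, predicted action class, fused score) triples.
    return [(t, labels[t].item(), scores[t].item()) for t in keep]
```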

    TEXT-CONDITIONED VIDEO REPRESENTATION
    Invention Publication

    Publication No.: US20230351753A1

    Publication Date: 2023-11-02

    Application No.: US17894738

    Filing Date: 2022-08-24

    CPC classification number: G06V20/47 G06V20/41

    Abstract: A text-video recommendation model determines the relevance of a text to a video in a text-video pair (e.g., as a relevance score) with a text embedding and a text-conditioned video embedding. The text-conditioned video embedding is a representation of the video used for evaluating the relevance of the video to the text, where the representation itself is a function of the text it is evaluated for. As such, the input text may be used to weight or attend to different frames of the video in determining the text-conditioned video embedding. The representation of the video may thus differ for different input texts being compared. The text-conditioned video embedding may be determined in various ways, such as from the set of frames most similar to the input text (the top-k frames) or from an attention function with query, key, and value projections.
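    Both constructions named in the abstract can be sketched compactly. The hypothetical PyTorch snippet below assumes text and frame embeddings already lie in a shared space; the projection matrices w_q, w_k, w_v, the pooling choices, and the cosine-similarity relevance score are illustrative assumptions, not the model's actual architecture.

```python
import torch
import torch.nn.functional as F

def topk_video_embedding(text_emb, frame_embs, k=4):
    # text_emb: (d,) text embedding; frame_embs: (num_frames, d) frame embeddings.
    sims = F.cosine_similarity(frame_embs, text_emb.unsqueeze(0), dim=-1)  # (num_frames,)
    top = sims.topk(min(k, frame_embs.shape[0])).indices
    return frame_embs[top].mean(dim=0)   # average of the k frames most similar to the text

def attention_video_embedding(text_emb, frame_embs, w_q, w_k, w_v):
    # w_q, w_k, w_v: (d, d) query/key/value projection matrices (hypothetical parameters).
    q = text_emb @ w_q                    # the text acts as the query
    k = frame_embs @ w_k                  # frames act as keys
    v = frame_embs @ w_v                  # frames act as values
    attn = torch.softmax(q @ k.T / k.shape[-1] ** 0.5, dim=-1)   # weights over frames
    return attn @ v                       # text-conditioned video embedding

def relevance_score(text_emb, video_emb):
    # Relevance of the text to the video, e.g. as cosine similarity.
    return F.cosine_similarity(text_emb, video_emb, dim=0)
```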

    Weakly Supervised Action Selection Learning in Video

    Publication No.: US20220335718A1

    Publication Date: 2022-10-20

    Application No.: US17716996

    Filing Date: 2022-04-08

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
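    The abstracts above do not specify the weakly-supervised training objective. One common way to train a segment classifier from video-level labels alone is top-k mean pooling of per-segment scores into a video-level prediction; the sketch below illustrates that general idea only and should not be read as the claimed training procedure.

```python
import torch
import torch.nn.functional as F

def weak_video_loss(classifier, segment_feats, video_label, k=8):
    # segment_feats: (num_segments, feat_dim); video_label: video-level class index.
    logits = classifier(segment_feats)              # (T, num_classes) per-segment scores
    k = min(k, logits.shape[0])
    topk = logits.topk(k, dim=0).values             # k highest-scoring segments per class
    video_logits = topk.mean(dim=0, keepdim=True)   # pool into one video-level prediction
    target = torch.tensor([video_label])
    # Only the video-level label supervises the per-segment classifier.
    return F.cross_entropy(video_logits, target)
```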

    WEAKLY SUPERVISED ACTION SELECTION LEARNING IN VIDEO

    Publication No.: US20250131718A1

    Publication Date: 2025-04-24

    Application No.: US18988381

    Filing Date: 2024-12-19

    Abstract: A video localization system localizes actions in videos based on a classification model and an actionness model. The classification model is trained to make predictions of which segments of a video depict an action and to classify the actions in the segments. The actionness model predicts whether any action is occurring in each segment, rather than predicting a particular type of action. This reduces the likelihood that the video localization system over-relies on contextual information in localizing actions in video. Furthermore, the classification model and the actionness model are trained based on weakly-labeled data, thereby reducing the cost and time required to generate training data for the video localization system.
