METHOD FOR ANNOTATING POINTS ON A HAND IMAGE TO CREATE TRAINING DATASET FOR MACHINE LEARNING

    Publication No.: US20220198747A1

    Publication Date: 2022-06-23

    Application No.: US17129362

    Application Date: 2020-12-21

    Inventor: Kim SAVAROCHE

    Abstract: A method for annotating points on a 2D image of a hand includes capturing several images of the hand from different views; for each viewpoint, the hand is imaged using cameras including a first 2D camera and a 3D camera; using a 3D engine for: creating a 3D hand representation from the 3D camera; considering a 3D model of an articulated hand with predefined annotation points; considering several 3D viewpoints of the hand; for each viewpoint considered: modifying the articulated hand to be superimposed with the 3D representation; considering a 2D image captured from the first 2D camera; superimposing the modified articulated hand on the hand captured on the 2D image; applying the annotation points of the modified articulated hand on the hand captured on the 2D image; and storing the 2D image with annotation on the hand.
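
    A minimal Python sketch of the projection step described above, assuming a pinhole camera model with known intrinsics K and extrinsics R, t for the first 2D camera; the function names and the fitted keypoint array are illustrative assumptions, not details from the filing.

    import numpy as np

    def project_points(points_3d, K, R, t):
        # Map the fitted articulated-hand annotation points (world coordinates)
        # into the first 2D camera's image plane via a pinhole projection.
        cam = (R @ points_3d.T + t.reshape(3, 1)).T   # world -> camera frame
        uv = (K @ cam.T).T                            # camera frame -> image plane
        return uv[:, :2] / uv[:, 2:3]                 # perspective divide -> pixel coords

    def annotate_view(image, fitted_hand_points, K, R, t):
        # Attach the projected annotation points to the captured 2D image of the hand.
        keypoints_2d = project_points(fitted_hand_points, K, R, t)
        return {"image": image, "keypoints": keypoints_2d.tolist()}

    Looping annotate_view over every captured viewpoint would yield one stored, annotated 2D image per view, as the abstract describes.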

    VIDEO REPRESENTATION LEARNING

    Publication No.: US20220156514A1

    Publication Date: 2022-05-19

    Application No.: US17454743

    Application Date: 2021-11-12

    Abstract: Certain aspects of the present disclosure provide techniques for training a first model based on a first labeled video dataset; generating a plurality of action-words based on output generated by the first model processing motion data in videos of an unlabeled video dataset; defining labels for the videos in the unlabeled video dataset based on the generated action-words; and training a second model based on the labels for the videos in the unlabeled video dataset.
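
    One plausible reading of the action-word step is a clustering of the first model's motion features into a discrete vocabulary whose cluster ids become pseudo-labels; this sketch assumes scikit-learn KMeans and a (num_videos, feature_dim) feature array, neither of which is specified in the abstract.

    import numpy as np
    from sklearn.cluster import KMeans

    def build_action_words(motion_features, vocab_size=50):
        # Quantize per-video motion features into a vocabulary of "action-words";
        # each video receives the id of its nearest cluster centre.
        kmeans = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
        action_words = kmeans.fit_predict(motion_features)
        return action_words, kmeans

    # motion_features: output of the first model on the unlabeled videos.
    # The returned action-word ids would serve as labels for training the second model.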

    LINGUALLY CONSTRAINED TRACKING OF VISUAL OBJECTS

    Publication No.: US20220156502A1

    Publication Date: 2022-05-19

    Application No.: US17526969

    Application Date: 2021-11-15

    Abstract: A computer-implemented method for tracking with visual object constraints includes receiving a lingual constraint and a video. A word embedding is generated based on the lingual constraint. A set of features is extracted for one or more frames of the video. The word embedding is cross-correlated to the set of features for the one or more frames of the video. A prediction indicating whether the lingual constraint is in the one or more frames of the video is generated based on the cross-correlation.
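
    A minimal PyTorch sketch of the cross-correlation step, assuming the word embedding is linearly projected to the visual channel dimension and applied as a 1x1 correlation filter over each frame's feature map; the projection layer and the max-pooling choice are assumptions beyond the abstract.

    import torch
    import torch.nn.functional as F

    def lingual_presence(word_embedding, frame_features, proj):
        # word_embedding: (embed_dim,) embedding of the lingual constraint
        # frame_features: (num_frames, channels, H, W) features extracted per frame
        # proj:           torch.nn.Linear(embed_dim, channels), an assumed component
        kernel = proj(word_embedding).view(1, -1, 1, 1)   # embedding as a 1x1 filter
        response = F.conv2d(frame_features, kernel)       # cross-correlate with features
        scores = response.flatten(1).max(dim=1).values    # peak response in each frame
        return torch.sigmoid(scores)                      # per-frame presence prediction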

    VIDEO PROCESSING USING A SPECTRAL DECOMPOSITION LAYER

    Publication No.: US20220132050A1

    Publication Date: 2022-04-28

    Application No.: US17572510

    Application Date: 2022-01-10

    Abstract: A method is presented. The method includes receiving a first sequence of frames depicting a dynamic element. The method also includes decomposing each spatial position from multiple spatial positions in the first sequence of frames to a frequency domain. The method further includes determining a distribution of spectral power density over a range of frequencies of the multiple spatial positions. The method still further includes generating a first set of feature maps based on the determined distribution of spectral power density over the range of frequencies. The method still further includes estimating a first physical property of the dynamic element.
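
    A minimal NumPy sketch of the decomposition and feature-map steps, assuming a grayscale (T, H, W) clip and a simple banding of each pixel's power spectrum; the band count and the downstream estimator of the physical property are assumptions beyond the abstract.

    import numpy as np

    def spectral_feature_maps(frames, num_bands=8):
        # frames: (T, H, W) sequence; each spatial position's time series is
        # decomposed to the frequency domain with a real FFT.
        spectrum = np.fft.rfft(frames, axis=0)
        power = np.abs(spectrum) ** 2                     # spectral power density
        bands = np.array_split(power, num_bands, axis=0)  # group frequencies into bands
        # One feature map per band: total spectral power at each spatial position.
        return np.stack([band.sum(axis=0) for band in bands])

    # The resulting (num_bands, H, W) feature maps would feed a downstream estimator
    # of the dynamic element's physical property (e.g., a vibration rate).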
