SELF-SUPERVISED HIERARCHICAL MOTION LEARNING FOR VIDEO ACTION RECOGNITION

    Publication No.: US20210064931A1

    Publication Date: 2021-03-04

    Application No.: US16998914

    Filing Date: 2020-08-20

    Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, and object tracking. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.

    JOINT REPRESENTATION LEARNING FROM IMAGES AND TEXT

    Publication No.: US20210056353A1

    Publication Date: 2021-02-25

    Application No.: US17000048

    Filing Date: 2020-08-21

    Abstract: The disclosure provides a framework or system for learning visual representations using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
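
    The abstract does not say which mutual-information estimator or critic form is used. As a rough illustration only, the sketch below assumes an InfoNCE-style lower bound and a bilinear critic f(x, y) = x^T W y; the function name, the matrix W, and the choice of bound are all hypothetical and are not taken from the patent. Training would then adjust W (for example by gradient ascent) to raise this bound, which is not shown here.

```python
import numpy as np

def infonce_lower_bound(image_emb, text_emb, W):
    """InfoNCE-style lower bound on mutual information (assumed estimator).

    image_emb: (N, Di) image embeddings, row i paired with text row i.
    text_emb:  (N, Dt) text embeddings.
    W:         (Di, Dt) parameters of a hypothetical bilinear critic.
    """
    # Critic scores for every image/text combination, shape (N, N).
    scores = image_emb @ W @ text_emb.T
    # Numerically stable log-sum-exp over each row (all candidate texts).
    row_max = scores.max(axis=1, keepdims=True)
    log_z = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    # Matched pairs sit on the diagonal; the bound can never exceed log N.
    n = scores.shape[0]
    return np.log(n) + np.mean(np.diag(scores) - log_z)
```

    With an all-zero W the critic cannot tell pairs apart and the bound is 0; with well-separated, correctly paired embeddings it approaches log N, the largest value this estimator can report.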

    System and method for optical flow estimation

    Publication No.: US10467763B1

    Publication Date: 2019-11-05

    Application No.: US16537986

    Filing Date: 2019-08-12

    Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.
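
    To make the idea of a partial cost volume concrete, here is a minimal NumPy sketch for a single pyramid level. It assumes the second image's features have already been warped, uses a channel-averaged dot product as the matching cost, and limits the search to a hypothetical `max_disp` pixels in each direction; none of these specifics come from the abstract, which states only that the cost volume covers a limited range of pixels.

```python
import numpy as np

def partial_cost_volume(feat1, feat2_warped, max_disp=4):
    """Correlation between feat1 and shifted copies of feat2_warped.

    feat1, feat2_warped: (C, H, W) feature maps at one pyramid level.
    Returns an array of shape ((2*max_disp + 1)**2, H, W), one channel
    per candidate displacement inside the limited search window.
    """
    _, h, w = feat1.shape
    d = max_disp
    # Zero-pad spatially so every shifted window stays in bounds.
    padded = np.pad(feat2_warped, ((0, 0), (d, d), (d, d)))
    costs = []
    for dy in range(2 * d + 1):
        for dx in range(2 * d + 1):
            # Window offset (dy, dx) corresponds to displacement (dy - d, dx - d).
            shifted = padded[:, dy:dy + h, dx:dx + w]
            costs.append((feat1 * shifted).mean(axis=0))
    return np.stack(costs)
```

    The output has (2 * max_disp + 1)^2 channels regardless of image size, which is what keeps the search cheaper than comparing every pixel against every other pixel at that level. In the method described above, a neural network would then process the features together with this volume to refine the flow estimate.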
