Patent search ap:("Nvidia Corporation") AND inv:"Xiaodong Yang" Page 3

21.

Invention application
SELF-SUPERVISED HIERARCHICAL MOTION LEARNING FOR VIDEO ACTION RECOGNITION (in force)

Publication (announcement) number: US20210064931A1

Publication (announcement) date: 2021-03-04

Application number: US16998914

Application date: 2020-08-20

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Xitong Yang, Sifei Liu, Jan Kautz

IPC: G06K9/62, G06K9/72, G06K9/00, G06N3/08, G06N3/04

Abstract: There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.

22.

Invention application
JOINT REPRESENTATION LEARNING FROM IMAGES AND TEXT (in force)

Publication (announcement) number: US20210056353A1

Publication (announcement) date: 2021-02-25

Application number: US17000048

Application date: 2020-08-21

Applicant: Nvidia Corporation

Inventor: Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz

IPC: G06K9/62, G06N3/08

Abstract: The disclosure provides a framework or system for learning visual representation using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
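The abstract above does not say which mutual-information estimator is used to train the critic. As a minimal sketch only, the following assumes an InfoNCE-style lower bound and a plain dot-product critic; the names `dot_critic` and `infonce_lower_bound` are hypothetical and are not taken from the patent.

```python
import math


def dot_critic(image_emb, text_emb):
    """Hypothetical critic: scores an (image, text) pair by their dot product."""
    return sum(a * b for a, b in zip(image_emb, text_emb))


def infonce_lower_bound(image_embs, text_embs, critic=dot_critic):
    """InfoNCE-style lower bound on mutual information (an assumed estimator).

    image_embs[i] and text_embs[i] are treated as a matched pair; every other
    text embedding in the batch serves as a negative for image i.
    """
    n = len(image_embs)
    total = 0.0
    for i in range(n):
        scores = [critic(image_embs[i], text_embs[j]) for j in range(n)]
        # Numerically stable log-sum-exp over all candidate texts.
        top = max(scores)
        log_sum = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[i] - log_sum
    # The bound can never exceed log(n).
    return total / n + math.log(n)
```

Training a critic under this assumption would mean adjusting its parameters to raise the bound, which rewards high scores for matched image/text pairs and low scores for mismatched ones.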

23.

Granted invention patent
System and method for optical flow estimation (in force)

Publication (announcement) number: US10467763B1

Publication (announcement) date: 2019-11-05

Application number: US16537986

Application date: 2019-08-12

Applicant: NVIDIA Corporation

Inventor: Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

IPC: G06T7/207, G06T7/246, G06T7/00, G06N5/04, G06T3/00

Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.
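To illustrate what a cost volume restricted to a limited pixel range can look like, here is a minimal pure-Python sketch. It assumes the second image's features have already been warped, uses a hypothetical `max_disp` search radius, scores each candidate by a channel-averaged dot product, and zero-pads outside the image; none of these choices are stated in the abstract, and the patent computes this inside a neural network rather than with explicit loops.

```python
def partial_cost_volume(feat1, feat2_warped, max_disp=1):
    """Correlate each pixel of feat1 with nearby pixels of feat2_warped.

    feat1, feat2_warped: nested lists shaped H x W x C (feature maps at one
    pyramid level). Returns a nested list shaped H x W x (2*max_disp + 1)**2,
    one matching cost per candidate offset inside the search window.
    """
    height, width = len(feat1), len(feat1[0])
    channels = len(feat1[0][0])
    offsets = [(dy, dx)
               for dy in range(-max_disp, max_disp + 1)
               for dx in range(-max_disp, max_disp + 1)]
    volume = []
    for y in range(height):
        row = []
        for x in range(width):
            costs = []
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    dot = sum(a * b for a, b in
                              zip(feat1[y][x], feat2_warped[yy][xx]))
                    costs.append(dot / channels)
                else:
                    costs.append(0.0)  # assumed zero padding at the border
            row.append(costs)
        volume.append(row)
    return volume
```

Because the window covers only `(2 * max_disp + 1) ** 2` candidates per pixel instead of every pixel at that level, the cost grows with the search radius rather than with the image resolution.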

24.

Granted invention patent
Fusing multilayer and multimodal deep neural networks for video classification (in force)

Publication (announcement) number: US10402697B2

Publication (announcement) date: 2019-09-03

Application number: US15660719

Application date: 2017-07-26

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz

IPC: G06K9/62, G06K9/00, G06K9/66, G06K9/46, G06N3/04, G06N3/08, G06N20/10

Abstract: A method, computer readable medium, and system are disclosed for classifying video image data. The method includes the steps of processing training video image data by at least a first layer of a convolutional neural network (CNN) to extract a first set of feature maps and generate classification output data for the training video image data. Spatial classification accuracy data is computed based on the classification output data and target classification output data and spatial discrimination factors for the first layer are computed based on the spatial classification accuracies and the first set of feature maps.

25.

Invention application
SYSTEM AND METHOD FOR CONTENT AND MOTION CONTROLLED ACTION VIDEO GENERATION (under examination, published)

Publication (announcement) number: US20180288431A1

Publication (announcement) date: 2018-10-04

Application number: US15939098

Application date: 2018-03-28

Applicant: NVIDIA Corporation

Inventor: Ming-Yu Liu, Xiaodong Yang, Jan Kautz, Sergey Tulyakov

IPC: H04N19/513, G06K9/00, G06N3/08, G06T13/40

CPC classification number: H04N19/521, G06K9/00201, G06K9/00281, G06N3/0445, G06N3/0454, G06N3/0472, G06N3/08, G06T13/40, G06T2207/20081, G06T2207/30196

Abstract: A method, computer readable medium, and system are disclosed for action video generation. The method includes the steps of generating, by a recurrent neural network, a sequence of motion vectors from a first set of random variables and receiving, by a generator neural network, the sequence of motion vectors and a content vector sample. The sequence of motion vectors and the content vector sample are sampled by the generator neural network to produce a video clip.

26.

Invention application
ONLINE DETECTION AND CLASSIFICATION OF DYNAMIC GESTURES WITH RECURRENT CONVOLUTIONAL NEURAL NETWORKS (under examination, published)

Publication (announcement) number: US20170206405A1

Publication (announcement) date: 2017-07-20

Application number: US15402128

Application date: 2017-01-09

Applicant: NVIDIA Corporation

Inventor: Pavlo Molchanov, Xiaodong Yang, Shalini De Mello, Kihwan Kim, Stephen Walter Tyree, Jan Kautz

IPC: G06K9/00, G06K9/62

CPC classification number: G06K9/00355, G06K9/00201, G06K9/00765, G06K9/4628, G06K9/4652, G06K9/6251, G06K9/6256, G06K9/627, G06K9/6277, G06N3/0445, G06N3/0454, G06N3/084, Y04S10/54

Abstract: A method, computer readable medium, and system are disclosed for detecting and classifying hand gestures. The method includes the steps of receiving an unsegmented stream of data associated with a hand gesture, extracting spatio-temporal features from the unsegmented stream by a three-dimensional convolutional neural network (3DCNN), and producing a class label for the hand gesture based on the spatio-temporal features.
