Patent search ap:("Nvidia Corporation") AND inv:"Xiaodong Yang" Page 2

11.

Invention patent application
CROSS-DOMAIN IMAGE PROCESSING FOR OBJECT RE-IDENTIFICATION (in force)

Publication (announcement) number: US20210064907A1

Publication (announcement) date: 2021-03-04

Application number: US16998890

Application date: 2020-08-20

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Yang Zou, Zhiding Yu, Jan Kautz

IPC: G06K9/44, G06K9/46, G06K9/62

Abstract: Object re-identification refers to a process by which images that contain an object of interest are retrieved from a set of images captured using disparate cameras or in disparate environments. Object re-identification has many useful applications, particularly as it is applied to people (e.g. person tracking). Current re-identification processes rely on convolutional neural networks (CNNs) that learn re-identification for a particular object class from labeled training data specific to a certain domain (e.g. environment), but that do not apply well in other domains. The present disclosure provides cross-domain disentanglement of id-related and id-unrelated factors. In particular, the disentanglement is performed using a labeled image set and an unlabeled image set, respectively captured from different domains but for a same object class. The identification-related features may then be used to train a neural network to perform re-identification of objects in that object class from images captured from the second domain.

12.

Granted invention patent
Budget-aware method for detecting activity in video (in force)

Publication (announcement) number: US10860859B2

Publication (announcement) date: 2020-12-08

Application number: US16202703

Application date: 2018-11-28

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz, Behrooz Mahasseni

IPC: G06T7/194, G06K9/00, G06K9/62

Abstract: Detection of activity in video content, and more particularly detecting in video start and end frames inclusive of an activity and a classification for the activity, is fundamental for video analytics including categorizing, searching, indexing, segmentation, and retrieval of videos. Existing activity detection processes rely on a large set of features and classifiers that exhaustively run over every time step of a video at multiple temporal scales, or as a small improvement computationally propose segments of the video on which to perform classification. These existing activity detection processes, however, are computationally expensive, particularly when trying to achieve activity detection accuracy, and moreover are not configurable for any particular time or computation budget. The present disclosure provides a time and/or computation budget-aware method for detecting activity in video that relies on a recurrent neural network implementing a learned policy.

13.

Invention patent application
TRANSFORMING CONVOLUTIONAL NEURAL NETWORKS FOR VISUAL SEQUENCE LEARNING (under examination, published)

Publication (announcement) number: US20180373985A1

Publication (announcement) date: 2018-12-27

Application number: US15880472

Application date: 2018-01-25

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz

IPC: G06N3/08, G06N3/04

Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
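The weight transformation described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `RecurrentFromDense`, the `tanh` activation, and the constant initial value for the hidden-to-hidden weights are all assumptions made for illustration. The sketch copies the weights of a trained fully connected (non-recurrent) layer into the input-to-hidden weights of a simple recurrent layer, sets the hidden-to-hidden weights to an initial value, and then processes a sequence of per-frame feature vectors.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class RecurrentFromDense:
    """Hypothetical recurrent layer initialised from a trained dense layer."""

    def __init__(self, dense_weights, dense_bias, hidden_init=0.0):
        size = len(dense_weights)
        # Input-to-hidden weights reuse the trained feedforward weights.
        self.w_ih = [row[:] for row in dense_weights]
        self.bias = dense_bias[:]
        # Hidden-to-hidden weights start from an initial value
        # (assumption: one constant for every entry).
        self.w_hh = [[hidden_init] * size for _ in range(size)]

    def step(self, frame_features, hidden):
        """One recurrent update: h_t = tanh(W_ih x_t + W_hh h_{t-1} + b)."""
        from_input = matvec(self.w_ih, frame_features)
        from_hidden = matvec(self.w_hh, hidden)
        return [
            math.tanh(a + r + b)
            for a, r, b in zip(from_input, from_hidden, self.bias)
        ]

    def run(self, frames):
        """Process a sequence of per-frame feature vectors, return the last state."""
        hidden = [0.0] * len(self.bias)
        for frame_features in frames:
            hidden = self.step(frame_features, hidden)
        return hidden


# Usage: a 2x2 identity "trained" dense layer turned into a recurrent layer.
dense_w = [[1.0, 0.0], [0.0, 1.0]]
dense_b = [0.0, 0.0]
layer = RecurrentFromDense(dense_w, dense_b)
final_state = layer.run([[0.5, -0.5], [0.1, 0.2]])
```

With the hidden-to-hidden weights at zero, each step reproduces the output of the original dense layer for that frame, so the transformed layer starts from the pretrained behaviour before any recurrent weights are learned.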

14.

Granted invention patent
Online detection and classification of dynamic gestures with recurrent convolutional neural networks (in force)

Publication (announcement) number: US10157309B2

Publication (announcement) date: 2018-12-18

Application number: US15402128

Application date: 2017-01-09

Applicant: NVIDIA Corporation

Inventor: Pavlo Molchanov, Xiaodong Yang, Shalini De Mello, Kihwan Kim, Stephen Walter Tyree, Jan Kautz

IPC: G06K9/00, G06K9/62, G06N3/04, G06N3/08

Abstract: A method, computer readable medium, and system are disclosed for detecting and classifying hand gestures. The method includes the steps of receiving an unsegmented stream of data associated with a hand gesture, extracting spatio-temporal features from the unsegmented stream by a three-dimensional convolutional neural network (3DCNN), and producing a class label for the hand gesture based on the spatio-temporal features.

15.

Invention patent application
FUSING MULTILAYER AND MULTIMODAL DEEP NEURAL NETWORKS FOR VIDEO CLASSIFICATION (under examination, published)

Publication (announcement) number: US20180032846A1

Publication (announcement) date: 2018-02-01

Application number: US15660719

Application date: 2017-07-26

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz

IPC: G06K9/62, G06N3/04, G06K9/46, G06K9/00, G06K9/66

CPC classification number: G06K9/6293, G06K9/00711, G06K9/00718, G06K9/00744, G06K9/4604, G06K9/4628, G06K9/66, G06N3/04, G06N3/0445, G06N3/0454, G06N3/08, G06N20/10

Abstract: A method, computer readable medium, and system are disclosed for classifying video image data. The method includes the steps of processing training video image data by at least a first layer of a convolutional neural network (CNN) to extract a first set of feature maps and generate classification output data for the training video image data. Spatial classification accuracy data is computed based on the classification output data and target classification output data and spatial discrimination factors for the first layer are computed based on the spatial classification accuracies and the first set of feature maps.

16.

Granted invention patent
Transforming convolutional neural networks for visual sequence learning (in force)

Publication (announcement) number: US11645530B2

Publication (announcement) date: 2023-05-09

Application number: US17325024

Application date: 2021-05-19

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz

IPC: G06N3/082, G06V20/40, G06V10/764, G06V10/82, G06F18/24, G06N3/044, G06N3/045, G06N3/048

CPC classification number: G06N3/082, G06F18/24, G06N3/044, G06N3/045, G06N3/048, G06V10/764, G06V10/82, G06V20/41

Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.

17.

Granted invention patent
Iterative spatio-temporal action detection in video (in force)

Publication (announcement) number: US11631239B2

Publication (announcement) date: 2023-04-18

Application number: US17237728

Application date: 2021-04-22

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Ming-Yu Liu, Jan Kautz, Fanyi Xiao, Xitong Yang

IPC: G06T7/73, G06V10/82, G06T7/277, G06V40/20, G06V10/25, G06V10/764

Abstract: Iterative prediction systems and methods for the task of action detection process an inputted sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground-truth.
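The idea of iteratively correcting large box offsets can be sketched in a few lines of Python. This is an illustrative assumption about the general shape of such a loop, not the patented predictor: `refine_tube`, `halve_gap_predictor`, and the `(x1, y1, x2, y2)` box layout are hypothetical names and conventions. A real system would replace the stand-in predictor with a learned network that only sees video features, not the ground truth.

```python
def refine_tube(initial_boxes, predict_offsets, num_iters):
    """Repeatedly apply predicted per-frame offsets to the boxes of an action tube.

    initial_boxes: one (x1, y1, x2, y2) box per video frame.
    predict_offsets: callable mapping the current boxes to one offset per box.
    """
    boxes = [tuple(box) for box in initial_boxes]
    for _ in range(num_iters):
        offsets = predict_offsets(boxes)
        boxes = [
            tuple(coord + delta for coord, delta in zip(box, offset))
            for box, offset in zip(boxes, offsets)
        ]
    return boxes


def halve_gap_predictor(target_boxes):
    """Stand-in predictor (assumption): moves each box halfway to a known target.

    It exists only to make the loop runnable; it is not a learned model.
    """
    def predict(boxes):
        return [
            tuple((t - c) / 2.0 for c, t in zip(box, target))
            for box, target in zip(boxes, target_boxes)
        ]
    return predict


# Usage: a one-frame tube that starts far from the target box.
targets = [(10.0, 10.0, 20.0, 20.0)]
start = [(0.0, 0.0, 10.0, 10.0)]
refined = refine_tube(start, halve_gap_predictor(targets), num_iters=3)
```

Each pass only has to predict a residual correction, so an initial offset that is too large for a single regression step shrinks over several iterations.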

18.

Invention patent application
IMAGE IDENTIFICATION USING NEURAL NETWORKS (under examination, published)

Publication (announcement) number: US20200302176A1

Publication (announcement) date: 2020-09-24

Application number: US16357047

Application date: 2019-03-18

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Zhedong Zheng, Zhiding Yu

IPC: G06K9/00, G06N3/04, G06N3/08, G06F7/57

Abstract: A neural network is trained to perform a re-identification task in which it is determined whether one or more features present in a first image appear also in a second image. During training, a generative portion of one or more neural networks generates variations of an input image, and a discriminative portion of the one or more neural networks learns to perform the re-identification task based at least in part on the variations of the image. During training, the generative and discriminative portions of the one or more neural networks share an encoder which encodes information used by the generative and discriminative portions.

19.

Granted invention patent
System and method for optical flow estimation (in force)

Publication (announcement) number: US10424069B2

Publication (announcement) date: 2019-09-24

Application number: US15942213

Application date: 2018-03-30

Applicant: NVIDIA Corporation

Inventor: Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz

IPC: G06T7/207, G06N5/04, G06T3/00, G06T7/00, G06T7/246, G06N3/04, G06N3/08

Abstract: A method, computer readable medium, and system are disclosed for estimating optical flow between two images. A first pyramidal set of features is generated for a first image and a partial cost volume for a level of the first pyramidal set of features is computed, by a neural network, using features at the level of the first pyramidal set of features and warped features extracted from a second image, where the partial cost volume is computed across a limited range of pixels that is less than a full resolution of the first image, in pixels, at the level. The neural network processes the features and the partial cost volume to produce a refined optical flow estimate for the first image and the second image.
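A partial cost volume over a limited pixel range can be sketched as follows. The abstract does not specify the matching cost, so the dot-product correlation, the zero cost for out-of-bounds positions, and the function name `partial_cost_volume` are assumptions for illustration. Feature maps are represented as nested lists indexed `[row][column][channel]`.

```python
def partial_cost_volume(feat1, feat2_warped, max_disp):
    """Correlate each pixel of feat1 with nearby pixels of feat2_warped.

    Only displacements within +/- max_disp pixels are searched, so each pixel
    gets (2 * max_disp + 1) ** 2 costs instead of one per pixel of the image.
    """
    height, width = len(feat1), len(feat1[0])
    offsets = [
        (dy, dx)
        for dy in range(-max_disp, max_disp + 1)
        for dx in range(-max_disp, max_disp + 1)
    ]
    volume = []
    for y in range(height):
        row = []
        for x in range(width):
            costs = []
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    # Assumed matching cost: dot product of feature vectors.
                    costs.append(
                        sum(a * b for a, b in zip(feat1[y][x], feat2_warped[yy][xx]))
                    )
                else:
                    # Assumed padding: positions outside the map contribute zero.
                    costs.append(0.0)
            row.append(costs)
        volume.append(row)
    return volume


# Usage: 2x2 single-channel feature maps, search range of one pixel.
features = [[[1.0], [2.0]], [[3.0], [4.0]]]
volume = partial_cost_volume(features, features, max_disp=1)
```

Limiting the search range keeps the cost per pyramid level constant, which is the reason a small range can still cover large motions when applied to warped features at each level.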

20.

Invention patent application
SYSTEMS AND METHODS FOR DYNAMIC FACIAL ANALYSIS USING A RECURRENT NEURAL NETWORK (under examination, published)

Publication (announcement) number: US20190180469A1

Publication (announcement) date: 2019-06-13

Application number: US15836549

Application date: 2017-12-08

Applicant: NVIDIA Corporation

Inventor: Jinwei Gu, Xiaodong Yang, Shalini De Mello, Jan Kautz

IPC: G06T7/73, G06N3/08

CPC classification number: G06T7/73, G06N3/08, G06T3/4046, G06T13/40, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/30201, G06T2207/30204

Abstract: A method, computer readable medium, and system are disclosed for dynamic facial analysis. The method includes the steps of receiving video data representing a sequence of image frames including at least one head and extracting, by a neural network, spatial features comprising pitch, yaw, and roll angles of the at least one head from the video data. The method also includes the step of processing, by a recurrent neural network, the spatial features for two or more image frames in the sequence of image frames to produce head pose estimates for the at least one head.
