Patent search ap:("Nvidia Corporation") AND inv:"Pavlo Molchanov" Page 2

11.

Granted invention patent
3D human body pose estimation using a model trained from unlabeled multi-view data [Active]

Publication (announcement) number: US11417011B2

Publication (announcement) date: 2022-08-16

Application number: US16897057

Application date: 2020-06-09

Applicant: NVIDIA Corporation

Inventor: Umar Iqbal, Pavlo Molchanov, Jan Kautz

IPC: G06T7/70, G06N5/04, G06T7/50, G06N20/00

Abstract: Learning to estimate a 3D body pose, and likewise the pose of any type of object, from a single 2D image is of great interest for many practical graphics applications and generally relies on neural networks that have been trained with sample data which annotates (labels) each sample 2D image with a known 3D pose. Requiring this labeled training data, however, has various drawbacks, including, for example, that traditionally used training data sets lack diversity and therefore limit the extent to which neural networks are able to estimate 3D pose. Expanding these training data sets is also difficult, since it requires manually provided annotations for 2D images, which is time-consuming and prone to errors. The present disclosure overcomes these and other limitations of existing techniques by providing a model that is trained from unlabeled multi-view data for use in 3D pose estimation.

12.

Granted invention patent
Transforming convolutional neural networks for visual sequence learning [Active]

Publication (announcement) number: US11049018B2

Publication (announcement) date: 2021-06-29

Application number: US15880472

Application date: 2018-01-25

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz

IPC: G06N3/08, G06K9/00, G06N3/04, G06K9/62

Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
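
The weight transfer described in this abstract can be illustrated with a minimal sketch. It assumes a plain (Elman-style) recurrent layer with a ReLU activation and zero-valued initial hidden-to-hidden weights; the abstract only says "initial values", so these choices, and the function names `make_recurrent_from_dense` and `rnn_step`, are hypothetical and not taken from the patent.

```python
def make_recurrent_from_dense(w_dense, b_dense, hh_init=0.0):
    """Build recurrent-layer parameters from a trained dense layer.

    w_dense: list of rows, shape (hidden, inputs).
    The feedforward weights are reused as input-to-hidden weights;
    hidden-to-hidden weights start at hh_init (an assumption).
    """
    hidden = len(w_dense)
    w_ih = [row[:] for row in w_dense]
    w_hh = [[hh_init] * hidden for _ in range(hidden)]
    return w_ih, w_hh, b_dense[:]


def rnn_step(x, h, w_ih, w_hh, b):
    """One time step: h_new = relu(W_ih x + W_hh h + b)."""
    h_new = []
    for j in range(len(b)):
        pre = b[j]
        pre += sum(w * xi for w, xi in zip(w_ih[j], x))
        pre += sum(w * hk for w, hk in zip(w_hh[j], h))
        h_new.append(max(0.0, pre))
    return h_new


if __name__ == "__main__":
    w, b = [[1.0, -2.0], [0.5, 0.5]], [0.0, 1.0]
    w_ih, w_hh, bias = make_recurrent_from_dense(w, b)
    print(rnn_step([1.0, 1.0], [0.0, 0.0], w_ih, w_hh, bias))
```

With zero hidden-to-hidden weights, the first recurrent step reproduces the output of the original dense layer, so the transformed model starts from the behavior of the trained network.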

13.

Invention patent application
SYNTHESIZING DATA FOR TRAINING ONE OR MORE NEURAL NETWORKS [Active]

Publication (announcement) number: US20210142177A1

Publication (announcement) date: 2021-05-13

Application number: US16682967

Application date: 2019-11-13

Applicant: Nvidia Corporation

Inventor: Arun Mallya, Jan Kautz, Zhizhong Li, Pavlo Molchanov, Hongxu Danny Yin

IPC: G06N3/08, G06N3/04

Abstract: Apparatuses, systems, and techniques are presented to generate data useful for further training of a neural network. In at least one embodiment, one or more neural networks can be re-trained based, at least in part, on data generated by the one or more neural networks including data used to previously train the one or more neural networks.

14.

Invention patent application
THREE-DIMENSIONAL (3D) POSE ESTIMATION FROM A MONOCULAR CAMERA [Active]

Publication (announcement) number: US20210117661A1

Publication (announcement) date: 2021-04-22

Application number: US17135697

Application date: 2020-12-28

Applicant: NVIDIA Corporation

Inventor: Umar Iqbal, Pavlo Molchanov, Thomas Michael Breuel, Jan Kautz

IPC: G06K9/00, G06N3/08, G06T7/73, G06N5/04, G06T7/579

Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
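
To make the keypoint representation concrete, the sketch below lifts 2D keypoints plus a per-keypoint depth into 3D camera coordinates. It assumes a standard pinhole camera model with known intrinsics (`fx`, `fy`, `cx`, `cy`); the abstract does not state how the 3D pose is assembled from the depth values, so this is only one plausible reading, and the function name `backproject` is hypothetical.

```python
def backproject(keypoints_2d, depths, fx, fy, cx, cy):
    """Lift pixel keypoints (u, v) with depth z to 3D camera coordinates.

    Pinhole model (an assumption): u = fx * X / Z + cx, v = fy * Y / Z + cy.
    """
    points = []
    for (u, v), z in zip(keypoints_2d, depths):
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Two keypoints: one at the image center, one 100 px to the right.
    pose = backproject([(320.0, 240.0), (420.0, 240.0)], [0.5, 1.0],
                       fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(pose)
```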

15.

Granted invention patent
Equivariant landmark transformation for landmark localization [Active]

Publication (announcement) number: US10783394B2

Publication (announcement) date: 2020-09-22

Application number: US16006728

Application date: 2018-06-12

Applicant: NVIDIA Corporation

Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari

IPC: G06K9/00, G06K9/62, G06K9/46, G06N3/08, G06K9/66, G06N3/04

Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
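
The consistency objective in this abstract can be sketched as follows. The example assumes a horizontal flip as the transform and a mean squared distance as the difference measure; the toy predictors stand in for the neural network model. All function names are hypothetical and the abstract does not specify the transform or the loss.

```python
def hflip_image(img):
    """Apply the transform to the image: mirror each row."""
    return [row[::-1] for row in img]


def hflip_coords(coords, width):
    """Apply the same transform to (x, y) landmark coordinates."""
    return [(width - 1 - x, y) for x, y in coords]


def equivariance_loss(predict, img):
    """Mean squared distance between T(predict(img)) and predict(T(img))."""
    width = len(img[0])
    transformed_pred = hflip_coords(predict(img), width)
    pred_on_transformed = predict(hflip_image(img))
    total = sum((ax - bx) ** 2 + (ay - by) ** 2
                for (ax, ay), (bx, by) in zip(transformed_pred, pred_on_transformed))
    return total / len(transformed_pred)


def brightest_pixel(img):
    """Toy stand-in for a landmark model: location of the maximum value."""
    _, x, y = max((v, x, y) for y, row in enumerate(img) for x, v in enumerate(row))
    return [(x, y)]


def biased_predictor(img):
    """Same toy model with a constant offset, which breaks equivariance."""
    return [(x + 1, y) for x, y in brightest_pixel(img)]


if __name__ == "__main__":
    image = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 0, 0, 0]]
    print(equivariance_loss(brightest_pixel, image))
    print(equivariance_loss(biased_predictor, image))
```

A predictor that is equivariant to the transform incurs zero loss, while the biased one is penalized; in training, this penalty would drive the parameter update and requires no landmark labels.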

16.

Invention patent application
BUDGET-AWARE METHOD FOR DETECTING ACTIVITY IN VIDEO [Under examination, published]

Publication (announcement) number: US20190163978A1

Publication (announcement) date: 2019-05-30

Application number: US16202703

Application date: 2018-11-28

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang, Pavlo Molchanov, Jan Kautz, Behrooz Mahasseni

IPC: G06K9/00, G06K9/62

Abstract: Detection of activity in video content, and more particularly detecting in video start and end frames inclusive of an activity and a classification for the activity, is fundamental for video analytics including categorizing, searching, indexing, segmentation, and retrieval of videos. Existing activity detection processes rely on a large set of features and classifiers that exhaustively run over every time step of a video at multiple temporal scales, or, as a small improvement, computationally propose segments of the video on which to perform classification. These existing activity detection processes, however, are computationally expensive, particularly when trying to achieve activity detection accuracy, and moreover are not configurable for any particular time or computation budget. The present disclosure provides a time and/or computation budget-aware method for detecting activity in video that relies on a recurrent neural network implementing a learned policy.

17.

Invention patent application
SEMI-SUPERVISED LEARNING FOR LANDMARK LOCALIZATION [Under examination, published]

Publication (announcement) number: US20180365532A1

Publication (announcement) date: 2018-12-20

Application number: US16006709

Application date: 2018-06-12

Applicant: NVIDIA Corporation

Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari

IPC: G06K9/62, G06K9/00, G06N3/04, G06N3/08

Abstract: A method, computer readable medium, and system are disclosed for sequential multi-tasking to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A neural network model processes input image data to generate pixel-level likelihood estimates for landmarks in the input image data and a soft-argmax function computes predicted coordinates of each landmark based on the pixel-level likelihood estimates.
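
The soft-argmax step named in this abstract can be sketched in a few lines. The version below is the common formulation (a softmax over the pixel-level scores followed by the expected pixel coordinate); the `temperature` parameter and the function name are assumptions, and the patent may define the function differently.

```python
import math


def soft_argmax(scores, temperature=1.0):
    """Return the expected (x, y) location under a softmax over a 2D score map."""
    width = len(scores[0])
    flat = [s / temperature for row in scores for s in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in flat]
    total = sum(weights)
    x = sum(w * (i % width) for i, w in enumerate(weights)) / total
    y = sum(w * (i // width) for i, w in enumerate(weights)) / total
    return x, y


if __name__ == "__main__":
    heatmap = [[0.0, 0.0, 0.0],
               [0.0, 5.0, 0.0],
               [0.0, 0.0, 0.0]]
    print(soft_argmax(heatmap))
```

Unlike a hard argmax, this expectation is differentiable with respect to the scores, which is what lets coordinate-level losses train the pixel-level estimates.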

18.

Invention patent application
EQUIVARIANT LANDMARK TRANSFORMATION FOR LANDMARK LOCALIZATION [Under examination, published]

Publication (announcement) number: US20180365512A1

Publication (announcement) date: 2018-12-20

Application number: US16006728

Application date: 2018-06-12

Applicant: NVIDIA Corporation

Inventor: Pavlo Molchanov, Stephen Walter Tyree, Jan Kautz, Sina Honari

IPC: G06K9/46, G06K9/66, G06K9/62, G06N3/08

Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.

19.

Granted invention patent
Performing occlusion-aware global 3D pose and shape estimation of articulated objects [Active]

Publication (announcement) number: US12100113B2

Publication (announcement) date: 2024-09-24

Application number: US17584213

Application date: 2022-01-25

Applicant: NVIDIA Corporation

Inventor: Ye Yuan, Umar Iqbal, Pavlo Molchanov, Jan Kautz

IPC: G06T19/20, G06T7/00, G06T7/20

CPC classification number: G06T19/20, G06T7/0002, G06T7/20, G06T2207/10016, G06T2207/20084, G06T2207/30241, G06T2219/2016

Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions for the object within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for the object within the video. This enables the accurate depiction of each object as a 3D pose sequence for that model that accounts for occlusions and global factors within the video.
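
Two steps of this pipeline, removing translation and global orientation to obtain local motion and later recombining local motion with a global trajectory, can be sketched for a single frame. The example assumes the global orientation is a rotation about the vertical (y) axis only and that the trajectory is given as a root position plus a yaw angle; both are simplifications, and the function names are hypothetical.

```python
import math


def _rotate_y(point, angle):
    """Rotate a 3D point about the y axis by the given angle in radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def to_local(joints, root, yaw):
    """Remove root translation and global yaw from world-space joints."""
    return [_rotate_y((x - root[0], y - root[1], z - root[2]), -yaw)
            for x, y, z in joints]


def to_global(local_joints, root, yaw):
    """Recombine local joints with a global trajectory sample (root, yaw)."""
    out = []
    for p in local_joints:
        x, y, z = _rotate_y(p, yaw)
        out.append((x + root[0], y + root[1], z + root[2]))
    return out


if __name__ == "__main__":
    joints = [(1.0, 1.5, 3.0)]
    root, yaw = (1.0, 0.0, 2.0), math.pi / 2
    local = to_local(joints, root, yaw)
    print(local)
    print(to_global(local, root, yaw))
```

Motion infilling would operate on the local representation between these two calls; that step is learned in the patent and is not modeled here.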

20.

Invention patent publication
DYNAMIC NEURAL NETWORK MODEL SPARSIFICATION [Under examination, published]

Publication (announcement) number: US20240119291A1

Publication (announcement) date: 2024-04-11

Application number: US18203552

Application date: 2023-05-30

Applicant: NVIDIA Corporation

Inventor: Jose M. Alvarez Lopez, Pavlo Molchanov, Hongxu Yin, Maying Shen, Lei Mao, Xinglong Sun

IPC: G06N3/082, G06N3/0495

CPC classification number: G06N3/082, G06N3/0495

Abstract: Machine learning is a process that learns a neural network model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the size, computation, and latency of a neural network model, a compression technique can be employed which includes model sparsification. To avoid the negative consequences of pruning a fully pretrained neural network model and on the other hand of training a sparse model in the first place without any recovery option, the present disclosure provides a dynamic neural network model sparsification process which allows for recovery of previously pruned parts to improve the quality of the sparse neural network model.
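
The idea of letting pruned parts recover can be illustrated with a small sketch. It assumes magnitude-based selection and assumes that pruned weights keep receiving gradient updates so they can re-enter the active set; the abstract does not specify the importance criterion or the recovery mechanism, so `topk_mask` and `sparse_training_step` are hypothetical names for one possible scheme.

```python
def topk_mask(weights, keep):
    """Binary mask that keeps the `keep` largest-magnitude weights."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [1 if i in kept else 0 for i in range(len(weights))]


def sparse_training_step(weights, grads, keep, lr=0.1):
    """Update the dense weights, then re-select the sparse mask.

    Pruned weights are still updated (an assumption), so a weight that was
    masked out earlier can be recovered if its magnitude grows.
    """
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, topk_mask(new_weights, keep)


if __name__ == "__main__":
    weights = [1.0, 0.1, -0.5]
    print(topk_mask(weights, keep=2))                 # the middle weight is pruned
    weights, mask = sparse_training_step(weights, [0.0, -10.0, 0.0], keep=2)
    print(mask)                                       # the middle weight is recovered
```

The forward pass would use `weight * mask`, so the model stays sparse at every step while the set of active weights is free to change.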
