-
Publication No.: US20220222832A1
Publication Date: 2022-07-14
Application No.: US17570254
Filing Date: 2022-01-06
Applicant: NVIDIA Corporation
Inventor: Yang Fu , Sifei Liu , Umar Iqbal , Shalini De Mello , Jan Kautz
IPC: G06T7/246 , G06T7/73 , G06T7/11 , G06V10/764 , G06V10/774 , G06V10/77 , G06V10/82
Abstract: A method and system are provided for tracking instances within a sequence of video frames. The method includes the steps of processing an image frame by a backbone network to generate a set of feature maps, processing the set of feature maps by one or more prediction heads, and analyzing the embedding features corresponding to a set of instances in two or more image frames of the sequence of video frames to establish a one-to-one correlation between instances in different image frames. The one or more prediction heads includes an embedding head configured to generate a set of embedding features corresponding to one or more instances of an object identified in the image frame. The method may also include training the one or more prediction heads using a set of annotated image frames and/or a plurality of sequences of unlabeled video frames.
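The one-to-one correlation between instances in different frames can be sketched as an assignment problem over pairwise embedding similarities. This is a minimal illustration under assumed inputs, not the patented network: `match_instances` and its brute-force search are hypothetical stand-ins for the learned embedding head and matching step.

```python
from itertools import permutations
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def match_instances(emb_prev, emb_curr):
    # One-to-one matching that maximizes total cosine similarity
    # (brute force; assumes equal instance counts in both frames).
    best, best_perm = float("-inf"), None
    for perm in permutations(range(len(emb_prev))):
        score = sum(cosine(emb_prev[i], emb_curr[j])
                    for i, j in enumerate(perm))
        if score > best:
            best, best_perm = score, perm
    return {i: j for i, j in enumerate(best_perm)}
```

For realistic instance counts, a polynomial-time assignment solver would replace the factorial search.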
-
Publication No.: US11328169B2
Publication Date: 2022-05-10
Application No.: US16353835
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Sifei Liu , Shalini De Mello , Jinwei Gu , Varun Jampani , Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color), to another frame that is represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix referred to as a global transformation matrix from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
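The propagation step can be illustrated as applying a row-normalized transformation matrix to the key-frame's property vectors. This sketch assumes the affinity matrix is already given; in the described system it would come from the guidance neural network.

```python
def propagate(affinity, key_props):
    # Apply a row-normalized global transformation matrix to key-frame
    # property vectors (e.g. RGB per pixel) to produce propagated
    # properties for the target frame.
    out = []
    for row in affinity:
        s = sum(row)
        weights = [w / s for w in row]  # normalize so each row sums to 1
        out.append([sum(w * key_props[k][c] for k, w in enumerate(weights))
                    for c in range(len(key_props[0]))])
    return out
```

With an identity affinity, properties pass through unchanged; off-diagonal weights blend colors from similar key-frame points.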
-
Publication No.: US20210233273A1
Publication Date: 2021-07-29
Application No.: US16752225
Filing Date: 2020-01-24
Applicant: NVIDIA Corporation
Inventor: Adrian Spurr , Pavlo Molchanov , Umar Iqbal , Jan Kautz
Abstract: Apparatuses, systems, and techniques that determine the pose of a human hand from a 2-D image are described herein. In at least one embodiment, training of a neural network is augmented using weakly labeled or unlabeled pose data which is augmented with losses based on a human hand model.
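A model-based loss usable on weakly labeled or unlabeled predictions can be sketched as a bone-length prior: predicted joints are penalized when their bone lengths deviate from a hand-model reference. The function and its arguments are hypothetical illustrations, not the claimed losses.

```python
def bone_length_loss(pred_joints, bones, ref_lengths):
    # Penalize squared deviation of each predicted bone's length from a
    # reference length taken from a hand model; requires no pose labels.
    loss = 0.0
    for (i, j), ref in zip(bones, ref_lengths):
        (x1, y1, z1), (x2, y2, z2) = pred_joints[i], pred_joints[j]
        d = ((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2) ** 0.5
        loss += (d - ref) ** 2
    return loss
```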
-
Publication No.: US20210073575A1
Publication Date: 2021-03-11
Application No.: US17081805
Filing Date: 2020-10-27
Applicant: NVIDIA Corporation
Inventor: Sifei Liu , Shalini De Mello , Jinwei Gu , Varun Jampani , Jan Kautz
Abstract: A temporal propagation network (TPN) system learns the affinity matrix for video image processing tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The TPN system includes a guidance neural network model and a temporal propagation module and is trained for a particular computer vision task to propagate visual properties from a key-frame represented by dense data (color), to another frame that is represented by coarse data (grey-scale). The guidance neural network model generates an affinity matrix referred to as a global transformation matrix from task-specific data for the key-frame and the other frame. The temporal propagation module applies the global transformation matrix to the key-frame property data to produce propagated property data (color) for the other frame. For example, the TPN system may be used to colorize several frames of greyscale video using a single manually colorized key-frame.
-
Publication No.: US20210067735A1
Publication Date: 2021-03-04
Application No.: US16559312
Filing Date: 2019-09-03
Applicant: Nvidia Corporation
Inventor: Fitsum Reda , Deqing Sun , Aysegul Dundar , Mohammad Shoeybi , Guilin Liu , Kevin Shih , Andrew Tao , Jan Kautz , Bryan Catanzaro
Abstract: Apparatuses, systems, and techniques to enhance video. In at least one embodiment, one or more neural networks are used to create, from a first video, a second video having a higher frame rate, higher resolution, or reduced number of missing or corrupt video frames.
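Frame-rate upsampling can be illustrated with the trivial baseline the neural networks improve on: synthesizing an intermediate frame by linear blending. This is a sketch only; the described system learns the interpolation rather than averaging pixels.

```python
def blend_frames(f0, f1, t):
    # Naive linear interpolation between two frames (2-D pixel grids)
    # at time t in [0, 1]; a stand-in for learned frame synthesis.
    return [[(1 - t) * a + t * b for a, b in zip(r0, r1)]
            for r0, r1 in zip(f0, f1)]
```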
-
Publication No.: US10929654B2
Publication Date: 2021-02-23
Application No.: US16290643
Filing Date: 2019-03-01
Applicant: NVIDIA Corporation
Inventor: Umar Iqbal , Pavlo Molchanov , Thomas Michael Breuel , Jan Kautz
Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
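Once a depth value exists for each keypoint, the 3D pose follows from standard pinhole back-projection. This sketch assumes known camera intrinsics (`fx`, `fy`, `cx`, `cy`) and is an illustration of the geometry, not the claimed network.

```python
def backproject(keypoints_2p5d, fx, fy, cx, cy):
    # Lift 2.5D keypoints (u, v, Z) to 3D camera-space coordinates
    # using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    pts3d = []
    for u, v, z in keypoints_2p5d:
        pts3d.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return pts3d
```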
-
Publication No.: US10922793B2
Publication Date: 2021-02-16
Application No.: US16353195
Filing Date: 2019-03-14
Applicant: NVIDIA Corporation
Inventor: Seung-Hwan Baek , Kihwan Kim , Jinwei Gu , Orazio Gallo , Alejandro Jose Troccoli , Ming-Yu Liu , Jan Kautz
Abstract: Missing image content is generated using a neural network. In an embodiment, a high resolution image and associated high resolution semantic label map are generated from a low resolution image and associated low resolution semantic label map. The input image/map pair (low resolution image and associated low resolution semantic label map) lacks detail and is therefore missing content. Rather than simply enhancing the input image/map pair, data missing in the input image/map pair is improvised or hallucinated by a neural network, creating plausible content while maintaining spatio-temporal consistency. Missing content is hallucinated to generate a detailed zoomed in portion of an image. Missing content is hallucinated to generate different variations of an image, such as different seasons or weather conditions for a driving video.
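The hallucination step can be contrasted with the baseline it replaces: nearest-neighbor upsampling, which enlarges an image without adding content. The network instead fills the enlarged grid with plausible detail; this sketch shows only the content-free baseline.

```python
def upsample_nn(img, scale):
    # Nearest-neighbor upsampling of a 2-D pixel grid: each pixel is
    # repeated scale times horizontally and each row scale times
    # vertically, adding no new information.
    return [[px for px in row for _ in range(scale)]
            for row in img for _ in range(scale)]
```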
-
Publication No.: US20200334543A1
Publication Date: 2020-10-22
Application No.: US16389832
Filing Date: 2019-04-19
Applicant: NVIDIA Corporation
Inventor: Seonwook Park , Shalini De Mello , Pavlo Molchanov , Umar Iqbal , Jan Kautz
Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
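The rotation in latent space can be sketched for a two-dimensional code; in practice the autoencoder's latent space is high-dimensional and the rotation acts on a chosen plane. `rotate_latent` is a hypothetical helper for illustration only.

```python
from math import cos, sin, radians

def rotate_latent(z, degrees):
    # Rotate a 2-D latent code by the given angle; decoding a series of
    # rotated codes would yield images whose feature varies in
    # proportion to the rotation amount.
    th = radians(degrees)
    x, y = z
    return (x * cos(th) - y * sin(th), x * sin(th) + y * cos(th))
```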
-
Publication No.: US10762620B2
Publication Date: 2020-09-01
Application No.: US16200192
Filing Date: 2018-11-26
Applicant: NVIDIA Corporation
Inventor: Orazio Gallo , Jinwei Gu , Jan Kautz , Patrick Wieschollek
Abstract: When a computer image is generated from a real-world scene having a semi-reflective surface (e.g. window), the computer image will create, at the semi-reflective surface from the viewpoint of the camera, both a reflection of a scene in front of the semi-reflective surface and a transmission of a scene located behind the semi-reflective surface. Similar to a person viewing the real-world scene from different locations, angles, etc., the reflection and transmission may change, and also move relative to each other, as the viewpoint of the camera changes. Unfortunately, the dynamic nature of the reflection and transmission negatively impacts the performance of many computer applications, but performance can generally be improved if the reflection and transmission are separated. The present disclosure uses deep learning to separate reflection and transmission at a semi-reflective surface of a computer image generated from a real-world scene.
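The additive image-formation model underlying the separation is observed = reflection + transmission. Given an estimated reflection layer, the transmission follows by subtraction; this sketch shows only that arithmetic, while the disclosure uses deep learning to estimate the layers.

```python
def separate_transmission(observed, reflection):
    # Recover the transmitted layer as observed minus the estimated
    # reflection, clamped to valid non-negative intensities.
    return [[max(0.0, o - r) for o, r in zip(row_o, row_r)]
            for row_o, row_r in zip(observed, reflection)]
```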
-
Publication No.: US20200082248A1
Publication Date: 2020-03-12
Application No.: US16564978
Filing Date: 2019-09-09
Applicant: NVIDIA Corporation
Inventor: Ruben Villegas , Alejandro Troccoli , Iuri Frosio , Stephen Tyree , Wonmin Byeon , Jan Kautz
Abstract: In various examples, historical trajectory information of objects in an environment may be tracked by an ego-vehicle and encoded into a state feature. The encoded state features for each of the objects observed by the ego-vehicle may be used—e.g., by a bi-directional long short-term memory (LSTM) network—to encode a spatial feature. The encoded spatial feature and the encoded state feature for an object may be used to predict lateral and/or longitudinal maneuvers for the object, and the combination of this information may be used to determine future locations of the object. The future locations may be used by the ego-vehicle to determine a path through the environment, or may be used by a simulation system to control virtual objects—according to trajectories determined from the future locations—through a simulation environment.
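The final step, turning a predicted maneuver into future locations, can be illustrated with a constant-velocity rollout. This is a deliberately simple stand-in: the described system conditions the prediction on encoded state and spatial features rather than assuming fixed velocity.

```python
def rollout(x, y, vx, vy, steps, dt=0.1):
    # Propagate an object's position forward under constant velocity,
    # returning one (x, y) location per time step.
    traj = []
    for _ in range(steps):
        x += vx * dt
        y += vy * dt
        traj.append((round(x, 6), round(y, 6)))
    return traj
```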