-
Publication Number: US11295514B2
Publication Date: 2022-04-05
Application Number: US16685538
Filing Date: 2019-11-15
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu , Kihwan Kim , Jan Kautz , Guilin Liu , Soumyadip Sengupta
Abstract: Inverse rendering estimates physical scene attributes (e.g., reflectance, geometry, and lighting) from image(s) and is used for gaming, virtual reality, augmented reality, and robotics. An inverse rendering network (IRN) receives a single input image of a 3D scene and generates the physical scene attributes for the image. The IRN is trained by using the estimated physical scene attributes generated by the IRN to reproduce the input image and updating parameters of the IRN to reduce differences between the reproduced input image and the input image. A direct renderer and a residual appearance renderer (RAR) reproduce the input image. The RAR predicts a residual image representing complex appearance effects of the real (not synthetic) image based on features extracted from the image and the reflectance and geometry properties. The residual image represents near-field illumination, cast shadows, inter-reflections, and realistic shading that are not provided by the direct renderer.
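For illustration, a minimal PyTorch sketch of the self-supervised objective described above: an inverse rendering network predicts reflectance, geometry, and lighting; a direct renderer plus a residual appearance renderer reproduce the input image; and the reconstruction error updates both networks. All module internals, tensor sizes, and the Lambertian shading are assumptions for the sketch, not the patented implementation.

```python
# Hypothetical sketch of the self-supervised reconstruction objective; the
# module internals are placeholders, not the networks claimed in the patent.
import torch
import torch.nn as nn

class InverseRenderingNet(nn.Module):
    """Predicts per-pixel albedo, normals, and lighting from a single image."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 64, 3, padding=1)        # stand-in encoder
        self.albedo_head = nn.Conv2d(64, 3, 3, padding=1)
        self.normal_head = nn.Conv2d(64, 3, 3, padding=1)
        self.light_head = nn.Linear(64, 27)                   # e.g. 9 SH coeffs x RGB

    def forward(self, img):
        feat = torch.relu(self.backbone(img))
        albedo = torch.sigmoid(self.albedo_head(feat))
        normal = nn.functional.normalize(self.normal_head(feat), dim=1)
        light = self.light_head(feat.mean(dim=(2, 3)))
        return albedo, normal, light, feat

def direct_render(albedo, normal, light):
    """Very rough Lambertian shading stand-in for the direct renderer."""
    direction = nn.functional.normalize(light[:, :3], dim=1)[:, :, None, None]
    shading = (normal * direction).sum(dim=1, keepdim=True).clamp(min=0)
    return albedo * shading

class ResidualAppearanceRenderer(nn.Module):
    """Predicts a residual image (shadows, inter-reflections, near-field light)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(64 + 6, 3, 3, padding=1)         # features + albedo + normals

    def forward(self, feat, albedo, normal):
        return self.net(torch.cat([feat, albedo, normal], dim=1))

irn, rar = InverseRenderingNet(), ResidualAppearanceRenderer()
opt = torch.optim.Adam(list(irn.parameters()) + list(rar.parameters()), lr=1e-4)

img = torch.rand(2, 3, 64, 64)                                # a batch of real photos
albedo, normal, light, feat = irn(img)
recon = direct_render(albedo, normal, light) + rar(feat, albedo, normal)
loss = nn.functional.l1_loss(recon, img)                      # reproduce the input image
opt.zero_grad(); loss.backward(); opt.step()
```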
-
Publication Number: US20210326694A1
Publication Date: 2021-10-21
Application Number: US16852944
Filing Date: 2020-04-20
Applicant: Nvidia Corporation
Inventor: Jialiang Wang , Varun Jampani , Stan Birchfield , Charles Loop , Jan Kautz
Abstract: Apparatuses, systems, and techniques are presented to determine distance for one or more objects. In at least one embodiment, a disparity network is trained to determine distance data from input stereoscopic images using a loss function that includes at least one of a gradient loss term and an occlusion loss term.
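The abstract names a gradient loss term and an occlusion loss term. Below is one plausible, hedged formulation of such terms; the exact losses used in the application are not reproduced. Disparity maps and the occlusion mask are assumed to be (N, 1, H, W) tensors, with the mask equal to 1 where a pixel is occluded in the other view.

```python
# Illustrative loss terms only, not the patent's formulation.
import torch

def gradient_loss(pred_disp, gt_disp):
    """Match horizontal and vertical disparity gradients."""
    dx = (pred_disp[..., :, 1:] - pred_disp[..., :, :-1]) - (gt_disp[..., :, 1:] - gt_disp[..., :, :-1])
    dy = (pred_disp[..., 1:, :] - pred_disp[..., :-1, :]) - (gt_disp[..., 1:, :] - gt_disp[..., :-1, :])
    return dx.abs().mean() + dy.abs().mean()

def occlusion_weighted_loss(pred_disp, gt_disp, occ_mask):
    """Data term that discounts pixels marked as occluded in the other view."""
    visible = 1.0 - occ_mask
    return (visible * (pred_disp - gt_disp).abs()).sum() / visible.sum().clamp(min=1.0)

def total_loss(pred_disp, gt_disp, occ_mask, w_grad=0.5):
    return occlusion_weighted_loss(pred_disp, gt_disp, occ_mask) + w_grad * gradient_loss(pred_disp, gt_disp)

# Toy usage with random tensors standing in for a disparity network's output.
pred = torch.rand(2, 1, 32, 64, requires_grad=True)
gt = torch.rand(2, 1, 32, 64)
occ = (torch.rand(2, 1, 32, 64) > 0.9).float()
total_loss(pred, gt, occ).backward()
```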
-
Publication Number: US11049018B2
Publication Date: 2021-06-29
Application Number: US15880472
Filing Date: 2018-01-25
Applicant: NVIDIA Corporation
Inventor: Xiaodong Yang , Pavlo Molchanov , Jan Kautz
Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
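A short sketch of the described weight transformation, using a fully-connected layer and a vanilla PyTorch RNN as stand-ins: the trained feedforward weights are copied into the recurrent layer's input-to-hidden path and the hidden-to-hidden weights are set to an initial value. Layer sizes are illustrative.

```python
# Hypothetical sketch: convert a trained non-recurrent layer into a recurrent one.
import torch
import torch.nn as nn

fc = nn.Linear(512, 256)                      # trained non-recurrent layer (pretend)

rnn = nn.RNN(input_size=512, hidden_size=256, batch_first=True)
with torch.no_grad():
    rnn.weight_ih_l0.copy_(fc.weight)         # feedforward weights -> input-to-hidden
    rnn.bias_ih_l0.copy_(fc.bias)
    rnn.weight_hh_l0.zero_()                  # hidden-to-hidden starts from initial values
    rnn.bias_hh_l0.zero_()

# At the first time step the recurrent layer reproduces the original layer's
# pre-activation (up to the RNN's tanh nonlinearity); the hidden-to-hidden
# weights are then learned on video sequences.
frames = torch.rand(4, 16, 512)               # (batch, time, features) from a CNN backbone
out, h_n = rnn(frames)
print(out.shape)                              # torch.Size([4, 16, 256])
```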
-
Publication Number: US11017556B2
Publication Date: 2021-05-25
Application Number: US16152303
Filing Date: 2018-10-04
Applicant: NVIDIA Corporation
Inventor: Xiaodong Yang , Xitong Yang , Fanyi Xiao , Ming-Yu Liu , Jan Kautz
Abstract: Iterative prediction systems and methods for action detection process an input sequence of video frames to generate both action tubes and their respective action labels, wherein each action tube comprises a sequence of bounding boxes, one per video frame. An iterative predictor handles large offsets between the predicted bounding boxes and the ground truth.
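As a rough illustration of iterative refinement in the spirit of the abstract, the sketch below repeatedly predicts residual box offsets per frame so that large displacements are absorbed over several steps. The refinement network and feature sizes are placeholders, not the patented predictor.

```python
# Hedged sketch of iterative bounding-box refinement; not the patented method.
import torch
import torch.nn as nn

class BoxRefiner(nn.Module):
    """Predicts an offset (dx, dy, dw, dh) for the current box estimate."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.head = nn.Linear(feat_dim + 4, 4)

    def forward(self, frame_feat, box):
        return self.head(torch.cat([frame_feat, box], dim=-1))

def iterative_predict(refiner, frame_feats, init_box, steps=3):
    """Refine one box per frame over several iterations to absorb large offsets."""
    boxes = init_box.expand(frame_feats.shape[0], 4).clone()   # (T, 4) per-frame boxes
    for _ in range(steps):
        boxes = boxes + refiner(frame_feats, boxes)            # accumulate residual offsets
    return boxes                                               # one action tube

refiner = BoxRefiner()
tube = iterative_predict(refiner, torch.rand(8, 128), torch.tensor([0.2, 0.2, 0.5, 0.5]))
print(tube.shape)                                              # torch.Size([8, 4])
```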
-
Publication Number: US20210150736A1
Publication Date: 2021-05-20
Application Number: US17156406
Filing Date: 2021-01-22
Applicant: NVIDIA Corporation
Inventor: Zhaoyang Lv , Kihwan Kim , Deqing Sun , Alejandro Jose Troccoli , Jan Kautz
IPC: G06T7/254 , G06T7/90 , G06T7/50 , G06N3/08 , G06T7/194 , G06T3/00 , G06T7/70 , G06T7/60 , G06T7/11 , G06N5/04 , G06T7/285 , G06T7/215
Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion in the image sequence results from a combination of the camera's changing orientation and the motion or shape changes of objects in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene: information identifying the dynamic and static portions of each image, and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
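The decomposition can be illustrated with a toy computation: given per-pixel 3D points in two frames, an estimated camera motion (R, t), and a dynamic/static mask, the camera-induced motion is subtracted from the total motion and masked to keep only the non-rigid part. The function names, shapes, and inputs below are assumptions, not the application's interfaces.

```python
# Illustrative scene-flow decomposition only; not the claimed network outputs.
import torch

def scene_flow(points_t0, points_t1, R, t, dynamic_mask):
    """points_*: (N, 3) 3D points, R: (3, 3), t: (3,), dynamic_mask: (N, 1) in {0, 1}."""
    # Where the static scene would move purely due to camera motion.
    camera_induced = points_t0 @ R.T + t - points_t0
    total = points_t1 - points_t0
    # Keep only the residual motion in regions flagged as dynamic (non-rigid).
    return dynamic_mask * (total - camera_induced)

pts0 = torch.rand(1000, 3)
pts1 = pts0 + 0.01 * torch.randn(1000, 3)
R, t = torch.eye(3), torch.zeros(3)
mask = (torch.rand(1000, 1) > 0.7).float()
flow = scene_flow(pts0, pts1, R, t, mask)
print(flow.shape)          # torch.Size([1000, 3])
```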
-
Publication Number: US20210142177A1
Publication Date: 2021-05-13
Application Number: US16682967
Filing Date: 2019-11-13
Applicant: Nvidia Corporation
Inventor: Arun Mallya , Jan Kautz , Zhizhong Li , Pavlo Molchanov , Hongxu Danny Yin
Abstract: Apparatuses, systems, and techniques are presented to generate data useful for further training of a neural network. In at least one embodiment, one or more neural networks can be re-trained based, at least in part, on data generated by the one or more neural networks including data used to previously train the one or more neural networks.
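One way to picture "re-training on data generated by the network itself" is the toy sketch below: synthetic inputs are optimized so the trained network labels them confidently, then mixed with new data for further training. The models, losses, and scales are placeholders and do not reproduce the procedure claimed in the application.

```python
# Hedged toy sketch of generating replay data from a trained network.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # previously trained model (pretend)
for p in teacher.parameters():
    p.requires_grad_(False)

# 1) Synthesize inputs the trained network confidently assigns to target classes.
targets = torch.arange(10)
synth = torch.randn(10, 1, 28, 28, requires_grad=True)
opt = torch.optim.Adam([synth], lr=0.1)
for _ in range(100):
    loss = F.cross_entropy(teacher(synth), targets)
    opt.zero_grad(); loss.backward(); opt.step()

# 2) Re-train (or train a new model) on the generated data alongside new data.
student = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt_s = torch.optim.SGD(student.parameters(), lr=0.01)
new_x, new_y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
x = torch.cat([synth.detach(), new_x]); y = torch.cat([targets, new_y])
loss = F.cross_entropy(student(x), y)
opt_s.zero_grad(); loss.backward(); opt_s.step()
```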
-
Publication Number: US20210117661A1
Publication Date: 2021-04-22
Application Number: US17135697
Filing Date: 2020-12-28
Applicant: NVIDIA Corporation
Inventor: Umar Iqbal , Pavlo Molchanov , Thomas Michael Breuel , Jan Kautz
Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
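A minimal sketch of the general idea, predicting a depth value per 2D keypoint and back-projecting to 3D with pinhole intrinsics. The regression head, keypoint count, and intrinsics are assumptions for illustration, not the architecture described in the application.

```python
# Hypothetical keypoint + per-keypoint depth head, then 2D -> 3D lifting.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 21                               # e.g., a hand skeleton

class KeypointDepthHead(nn.Module):
    """From image features, regress (x, y) pixel coordinates and a depth per keypoint."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.xy = nn.Linear(feat_dim, NUM_KEYPOINTS * 2)
        self.z = nn.Linear(feat_dim, NUM_KEYPOINTS)

    def forward(self, feat):
        xy = self.xy(feat).view(-1, NUM_KEYPOINTS, 2)
        z = self.z(feat).view(-1, NUM_KEYPOINTS, 1)
        return xy, z

def backproject(xy, z, fx=500.0, fy=500.0, cx=128.0, cy=128.0):
    """Lift pixel coordinates plus depth to 3D camera coordinates."""
    X = (xy[..., 0:1] - cx) * z / fx
    Y = (xy[..., 1:2] - cy) * z / fy
    return torch.cat([X, Y, z], dim=-1)           # (N, K, 3)

head = KeypointDepthHead()
xy, z = head(torch.rand(4, 256))
pose3d = backproject(xy, z.abs() + 0.1)           # keep depths positive for the toy example
print(pose3d.shape)                               # torch.Size([4, 21, 3])
```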
-
Publication Number: US10964061B2
Publication Date: 2021-03-30
Application Number: US16872752
Filing Date: 2020-05-12
Applicant: NVIDIA Corporation
Inventor: Jinwei Gu , Samarth Manoj Brahmbhatt , Kihwan Kim , Jan Kautz
Abstract: A deep neural network (DNN) system learns a map representation for estimating a camera position and orientation (pose). The DNN is trained to learn a map representation corresponding to the environment, defining positions and attributes of structures, trees, walls, vehicles, etc. The DNN system learns a map representation that is versatile and performs well for many different environments (indoor, outdoor, natural, synthetic, etc.). The DNN system receives images of an environment captured by a camera (observations) and outputs an estimated camera pose within the environment. The estimated camera pose is used to perform camera localization, i.e., recover the three-dimensional (3D) position and orientation of a moving camera, which is a fundamental task in computer vision with a wide variety of applications in robot navigation, car localization for autonomous driving, device localization for mobile navigation, and augmented/virtual reality.
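The sketch below shows plain image-to-pose regression (a translation vector plus a unit-quaternion rotation) with a simple absolute pose loss; it does not reproduce the learned map representation or the losses described in the abstract, and all layer sizes are assumptions.

```python
# Assumed, minimal camera-pose regression sketch; not the patented DNN system.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseRegressor(nn.Module):
    """Predicts a 3-vector translation and a unit quaternion from one image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.trans = nn.Linear(16, 3)
        self.rot = nn.Linear(16, 4)

    def forward(self, img):
        feat = self.encoder(img)
        return self.trans(feat), F.normalize(self.rot(feat), dim=-1)

model = PoseRegressor()
img = torch.rand(2, 3, 128, 128)
t_pred, q_pred = model(img)
t_gt, q_gt = torch.rand(2, 3), F.normalize(torch.rand(2, 4), dim=-1)
loss = F.l1_loss(t_pred, t_gt) + F.l1_loss(q_pred, q_gt)   # simple absolute pose loss
loss.backward()
print(t_pred.shape, q_pred.shape)                          # torch.Size([2, 3]) torch.Size([2, 4])
```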
-
Publication Number: US20210089867A1
Publication Date: 2021-03-25
Application Number: US16581099
Filing Date: 2019-09-24
Applicant: NVIDIA Corporation
Inventor: Wonmin Byeon , Jan Kautz
Abstract: Learning the dynamics of an environment and predicting consequences in the future is a recent technical advancement that can be applied to video prediction and speech recognition, among other applications. Generally, machine learning, such as deep learning models, neural networks, or other artificial intelligence algorithms, is used to make the predictions. However, current artificial intelligence algorithms used for making predictions are typically limited to short-term future predictions, mainly as a result of 1) the presence of complex dynamics in high-dimensional video data, 2) prediction error propagation over time, and 3) inherent uncertainty of the future. The present disclosure enables the modeling of long-term dependencies in sequential data for use in making long-term predictions by providing a dual (i.e., two-part) recurrent neural network architecture.
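A hedged sketch of one possible two-part ("dual") recurrent design: a fast recurrent module tracks per-frame dynamics while a second module operates on the first's states to carry longer-term context. The split, cell types, and sizes below are illustrative assumptions, not the disclosed architecture.

```python
# Illustrative dual (two-part) recurrent network; not the disclosed model.
import torch
import torch.nn as nn

class DualRNN(nn.Module):
    def __init__(self, in_dim=64, hid=128):
        super().__init__()
        self.fast = nn.LSTMCell(in_dim, hid)       # short-term, per-frame dynamics
        self.slow = nn.LSTMCell(hid, hid)          # longer-term context over fast states
        self.readout = nn.Linear(2 * hid, in_dim)  # predict next-frame features

    def forward(self, seq):                        # seq: (T, N, in_dim)
        N, hid = seq.shape[1], self.fast.hidden_size
        hf = cf = hs = cs = seq.new_zeros(N, hid)
        preds = []
        for x in seq:
            hf, cf = self.fast(x, (hf, cf))
            hs, cs = self.slow(hf, (hs, cs))
            preds.append(self.readout(torch.cat([hf, hs], dim=-1)))
        return torch.stack(preds)                  # (T, N, in_dim)

model = DualRNN()
pred = model(torch.rand(10, 4, 64))
print(pred.shape)                                  # torch.Size([10, 4, 64])
```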
-
Publication Number: US10783394B2
Publication Date: 2020-09-22
Application Number: US16006728
Filing Date: 2018-06-12
Applicant: NVIDIA Corporation
Inventor: Pavlo Molchanov , Stephen Walter Tyree , Jan Kautz , Sina Honari
Abstract: A method, computer readable medium, and system are disclosed to generate coordinates of landmarks within images. The landmark locations may be identified on an image of a human face and used for emotion recognition, face identity verification, eye gaze tracking, pose estimation, etc. A transform is applied to input image data to produce transformed input image data. The transform is also applied to predicted coordinates for landmarks of the input image data to produce transformed predicted coordinates. A neural network model processes the transformed input image data to generate additional landmarks of the transformed input image data and additional predicted coordinates for each one of the additional landmarks. Parameters of the neural network model are updated to reduce differences between the transformed predicted coordinates and the additional predicted coordinates.
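The core consistency idea can be illustrated compactly: predictions on a transformed image should agree with the transformed predictions on the original image. The landmark network below is a toy stand-in, and a horizontal flip is used as the known transform purely for brevity; none of this reproduces the patented training procedure.

```python
# Illustrative transform-consistency loss for landmark predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LandmarkNet(nn.Module):
    """Regresses K (x, y) landmark coordinates in [-1, 1] from an image."""
    def __init__(self, k=68):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(8, k * 2), nn.Tanh())
        self.k = k

    def forward(self, img):
        return self.net(img).view(-1, self.k, 2)

model = LandmarkNet()
img = torch.rand(2, 3, 64, 64)

# Known transform T (here a horizontal flip) applied to both image and coordinates.
# In practice, left/right-symmetric landmark indices would also need reordering.
flipped_img = torch.flip(img, dims=[-1])
coords = model(img)
coords_flipped = coords * torch.tensor([-1.0, 1.0])        # flip x in normalized coords

# Consistency loss between transformed predictions and predictions on the transformed image.
loss = F.l1_loss(model(flipped_img), coords_flipped)
loss.backward()
print(loss.item() >= 0)
```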
-