Patent search ap:("Nvidia Corporation") AND inv:"Jan Kautz" Page 8

71.

Granted invention patent
Transforming convolutional neural networks for visual sequence learning (valid)

Publication (announcement) number: US11645530B2

Publication (announcement) date: 2023-05-09

Application number: US17325024

Application date: 2021-05-19

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang , Pavlo Molchanov , Jan Kautz

IPC: G06N3/082 , G06V20/40 , G06V10/764 , G06V10/82 , G06F18/24 , G06N3/044 , G06N3/045 , G06N3/048

CPC classification number: G06N3/082 , G06F18/24 , G06N3/044 , G06N3/045 , G06N3/048 , G06V10/764 , G06V10/82 , G06V20/41

Abstract: A method, computer readable medium, and system are disclosed for visual sequence learning using neural networks. The method includes the steps of replacing a non-recurrent layer within a trained convolutional neural network model with a recurrent layer to produce a visual sequence learning neural network model and transforming feedforward weights for the non-recurrent layer into input-to-hidden weights of the recurrent layer to produce a transformed recurrent layer. The method also includes the steps of setting hidden-to-hidden weights of the recurrent layer to initial values and processing video image data by the visual sequence learning neural network model to generate classification or regression output data.
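Note: the weight transformation described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the patented implementation: the layer is taken to be fully connected with a tanh activation, the hidden-to-hidden "initial values" are assumed to be zeros, and all function names are hypothetical. With a zero hidden-to-hidden matrix, the recurrent layer initially reproduces the outputs of the trained feedforward layer frame by frame.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def fc_layer(W, b, x):
    """Original non-recurrent (fully connected) layer; tanh is an assumed activation."""
    return [math.tanh(a + bi) for a, bi in zip(matvec(W, x), b)]

def make_recurrent(W, b):
    """Build a recurrent layer from trained feedforward parameters.

    Input-to-hidden weights reuse W; hidden-to-hidden weights start at
    zero (an assumption: the abstract only says "initial values").
    """
    n = len(W)
    return {
        "W_ih": [row[:] for row in W],
        "W_hh": [[0.0] * n for _ in range(n)],
        "b": b[:],
    }

def rnn_step(layer, x, h_prev):
    """One recurrent step: h = tanh(W_ih x + W_hh h_prev + b)."""
    a = matvec(layer["W_ih"], x)
    r = matvec(layer["W_hh"], h_prev)
    return [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, layer["b"])]

def run_sequence(layer, frames):
    """Process a sequence of per-frame feature vectors."""
    h = [0.0] * len(layer["b"])
    outputs = []
    for x in frames:
        h = rnn_step(layer, x, h)
        outputs.append(h)
    return outputs

# Toy "trained" weights and two frames of features.
W = [[0.5, -0.2], [0.1, 0.3]]
b = [0.0, 0.1]
frames = [[1.0, 2.0], [0.5, -1.0]]
layer = make_recurrent(W, b)
outs = run_sequence(layer, frames)
```

Starting from zero hidden-to-hidden weights means fine-tuning begins from the behavior of the trained model and learns the temporal connections on top of it.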

72.

Granted invention patent
Iterative spatio-temporal action detection in video (valid)

Publication (announcement) number: US11631239B2

Publication (announcement) date: 2023-04-18

Application number: US17237728

Application date: 2021-04-22

Applicant: NVIDIA Corporation

Inventor: Xiaodong Yang , Ming-Yu Liu , Jan Kautz , Fanyi Xiao , Xitong Yang

IPC: G06T7/73 , G06V10/82 , G06T7/277 , G06V40/20 , G06V10/25 , G06V10/764

Abstract: Iterative prediction systems and methods for the task of action detection process an inputted sequence of video frames to generate an output of both action tubes and respective action labels, wherein the action tubes comprise a sequence of bounding boxes on each video frame. An iterative predictor processes large offsets between the bounding boxes and the ground-truth.

73.

Invention application
PERFORMING OCCLUSION-AWARE GLOBAL 3D POSE AND SHAPE ESTIMATION OF ARTICULATED OBJECTS (valid)

Publication (announcement) number: US20230070514A1

Publication (announcement) date: 2023-03-09

Application number: US17584213

Application date: 2022-01-25

Applicant: NVIDIA Corporation

Inventor: Ye Yuan , Umar Iqbal , Pavlo Molchanov , Jan Kautz

IPC: G06T19/20 , G06T7/20 , G06T7/00

Abstract: In order to determine accurate three-dimensional (3D) models for objects within a video, the objects are first identified and tracked within the video, and a pose and shape are estimated for these tracked objects. A translation and global orientation are removed from the tracked objects to determine local motion for the objects, and motion infilling is performed to fill in any missing portions for the object within the video. A global trajectory is then determined for the objects within the video, and the infilled motion and global trajectory are then used to determine infilled global motion for the object within the video. This enables the accurate depiction of each object as a 3D pose sequence for that model that accounts for occlusions and global factors within the video.
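Note: the step of removing translation and global orientation, and later recombining local motion with a global trajectory, can be illustrated with a small Python sketch. It is a simplification under loud assumptions: global orientation is reduced to a single heading angle about the vertical (z) axis, whereas a real system would likely use full 3D rotations, and the function names are hypothetical.

```python
import math

def to_local(joints, root_pos, root_yaw):
    """Remove root translation and heading from 3D joints (one frame)."""
    c, s = math.cos(-root_yaw), math.sin(-root_yaw)
    local = []
    for x, y, z in joints:
        dx, dy, dz = x - root_pos[0], y - root_pos[1], z - root_pos[2]
        # Rotate by -yaw about the z axis to undo the global heading.
        local.append((c * dx - s * dy, s * dx + c * dy, dz))
    return local

def to_global(local, root_pos, root_yaw):
    """Recombine local joints with a global trajectory sample."""
    c, s = math.cos(root_yaw), math.sin(root_yaw)
    return [
        (c * x - s * y + root_pos[0], s * x + c * y + root_pos[1], z + root_pos[2])
        for x, y, z in local
    ]

# One frame: a single joint, with the root at (1, 1, 1) facing 90 degrees.
joints = [(1.0, 2.0, 1.5)]
root_pos = (1.0, 1.0, 1.0)
root_yaw = math.pi / 2
local = to_local(joints, root_pos, root_yaw)
restored = to_global(local, root_pos, root_yaw)
```

In this picture, infilling would operate on the `local` representation, and `to_global` would be applied once a trajectory has been estimated for every frame.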

74.

Granted invention patent
Few-shot training of a neural network (valid)

Publication (announcement) number: US11593661B2

Publication (announcement) date: 2023-02-28

Application number: US16389832

Application date: 2019-04-19

Applicant: NVIDIA Corporation

Inventor: Seonwook Park , Shalini De Mello , Pavlo Molchanov , Umar Iqbal , Jan Kautz

IPC: G06N3/08 , G06F7/57 , G06F17/18 , G06N3/04 , G06N3/088

Abstract: A neural network is trained to identify one or more features of an image. The neural network is trained using a small number of original images, from which a plurality of additional images are derived. The additional images are generated by rotating and decoding embeddings of the image in a latent space generated by an autoencoder. The images generated by the rotation and decoding exhibit changes to a feature that are in proportion to the amount of rotation.
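Note: the idea of rotating an embedding in latent space before decoding can be sketched as follows. This is a hypothetical illustration only: the choice of a planar rotation over two latent dimensions, the function names, and the placeholder decoder are all assumptions, and no trained autoencoder is involved.

```python
import math

def rotate_latent(z, angle, i=0, j=1):
    """Rotate latent vector z by `angle` radians in the plane of dims i and j."""
    out = list(z)
    c, s = math.cos(angle), math.sin(angle)
    out[i] = c * z[i] - s * z[j]
    out[j] = s * z[i] + c * z[j]
    return out

def augment(z, angles, decode):
    """Produce one decoded sample per rotation angle."""
    return [decode(rotate_latent(z, a)) for a in angles]

def placeholder_decode(z):
    """Stand-in for an autoencoder's decoder (assumption: returns z unchanged)."""
    return list(z)

# One embedding, rotated by 0 and 90 degrees.
samples = augment([1.0, 0.0, 7.0], [0.0, math.pi / 2], placeholder_decode)
```

Because each rotation angle is known, it could serve as the label for the corresponding generated sample, which is one way to read "changes in proportion to the amount of rotation".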

75.

Invention application
SYNTHESIZING VIDEO FROM AUDIO USING ONE OR MORE NEURAL NETWORKS (valid)

Publication (announcement) number: US20230035306A1

Publication (announcement) date: 2023-02-02

Application number: US17382027

Application date: 2021-07-21

Applicant: Nvidia Corporation

Inventor: Ming-Yu Liu , Koki Nagano , Yeongho Seol , Jose Rafael Valle Gomes da Costa , Jaewoo Seo , Ting-Chun Wang , Arun Mallya , Sameh Khamis , Wei Ping , Rohan Badlani , Kevin Jonathan Shih , Bryan Catanzaro , Simon Yuen , Jan Kautz

IPC: G06T13/40 , H04N19/597 , G06N3/04 , G10L13/04 , G06T9/00 , G06T17/10 , G06T13/20

Abstract: Apparatuses, systems, and techniques are presented to generate media content. In at least one embodiment, a first neural network is used to generate first video information based, at least in part, upon voice information corresponding to one or more users, and a second neural network is used to generate second video information corresponding to the one or more users based, at least in part, upon the first video information and one or more images corresponding to the one or more users.

76.

Invention application
IMAGE PROCESSING USING COUPLED SEGMENTATION AND EDGE LEARNING (valid)

Publication (announcement) number: US20230015989A1

Publication (announcement) date: 2023-01-19

Application number: US17365877

Application date: 2021-07-01

Applicant: Nvidia Corporation

Inventor: Zhiding Yu , Rui Huang , Wonmin Byeon , Sifei Liu , Guilin Liu , Thomas Breuel , Anima Anandkumar , Jan Kautz

IPC: G06K9/46 , G06N3/04 , G06T7/13 , G06K9/62

Abstract: The disclosure provides a learning framework that unifies both semantic segmentation and semantic edge detection. A learnable recurrent message passing layer is disclosed where semantic edges are considered as explicitly learned gating signals to refine segmentation and improve dense prediction quality by finding compact structures for message paths. The disclosure includes a method for coupled segmentation and edge learning. In one example, the method includes: (1) receiving an input image, (2) generating, from the input image, a semantic feature map, an affinity map, and a semantic edge map from a single backbone network of a convolutional neural network (CNN), and (3) producing a refined semantic feature map by smoothing pixels of the semantic feature map using spatial propagation, and controlling the smoothing using both affinity values from the affinity map and edge values from the semantic edge map.
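Note: step (3) above, smoothing controlled by affinity and edge values, can be illustrated with a one-dimensional Python sketch. The gating formula `affinity * (1 - edge)` and the single left-to-right pass are assumptions chosen for illustration; the abstract only states that both maps control the smoothing, and the function name is hypothetical.

```python
def propagate_row(features, affinity, edges):
    """One left-to-right linear propagation pass over a row of scalar features.

    gate = affinity * (1 - edge): a strong edge closes the gate, so no
    smoothing crosses a semantic boundary (assumed gating rule).
    """
    out = [features[0]]
    for i in range(1, len(features)):
        gate = affinity[i] * (1.0 - edges[i])
        out.append((1.0 - gate) * features[i] + gate * out[i - 1])
    return out

# Four pixels with a semantic edge at index 2.
features = [1.0, 3.0, 5.0, 7.0]
affinity = [0.5, 0.5, 0.5, 0.5]
edges = [0.0, 0.0, 1.0, 0.0]
refined = propagate_row(features, affinity, edges)
```

Pixels on the same side of the edge are pulled toward each other, while the pixel at the edge keeps its own value.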

77.

Invention application
TRAINING OBJECT DETECTION SYSTEMS WITH GENERATED IMAGES (valid)

Publication (announcement) number: US20230004760A1

Publication (announcement) date: 2023-01-05

Application number: US17361202

Application date: 2021-06-28

Applicant: NVIDIA Corporation

Inventor: Siva Karthik Mustikovela , Shalini De Mello , Aayush Prakash , Umar Iqbal , Sifei Liu , Jan Kautz

IPC: G06K9/62

Abstract: Apparatuses, systems, and techniques are presented to identify objects within an image using self-supervised machine learning. In at least one embodiment, a machine learning system is trained to recognize objects by training a first network to recognize objects within images that are generated by a second network. In at least one embodiment, the second network is a controllable network.

78.

Granted invention patent
Learning rigidity of dynamic scenes for three-dimensional scene flow estimation (valid)

Publication (announcement) number: US11508076B2

Publication (announcement) date: 2022-11-22

Application number: US17156406

Application date: 2021-01-22

Applicant: NVIDIA Corporation

Inventor: Zhaoyang Lv , Kihwan Kim , Deqing Sun , Alejandro Jose Troccoli , Jan Kautz

IPC: G06T7/254 , G06T7/90 , G06T7/50 , G06N3/08 , G06T7/194 , G06T3/00 , G06T7/70 , G06T7/60 , G06T7/11 , G06N5/04 , G06T7/285 , G06T7/215

Abstract: A neural network model receives color data for a sequence of images corresponding to a dynamic scene in three-dimensional (3D) space. Motion of objects in the image sequence results from a combination of a dynamic camera orientation and motion or a change in the shape of an object in the 3D space. The neural network model generates two components that are used to produce a 3D motion field representing the dynamic (non-rigid) part of the scene. The two components are information identifying dynamic and static portions of each image and the camera orientation. The dynamic portions of each image contain motion in the 3D space that is independent of the camera orientation. In other words, the motion in the 3D space (estimated 3D scene flow data) is separated from the motion of the camera.
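Note: separating camera-induced motion from object motion can be sketched in Python as follows. In the abstract, a neural network produces the dynamic/static mask and the camera orientation; here both are supplied by hand, and the function names, the use of a rotation matrix plus translation for the camera motion, and the sample values are all assumptions.

```python
def apply_pose(R, t, p):
    """Map point p into the next frame's camera coordinates: R p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

def scene_flow(points_t0, points_t1, R, t, dynamic_mask):
    """Non-rigid 3D motion per point, with camera motion removed.

    Static points get zero flow; for dynamic points the flow is the
    observed position minus where camera motion alone would place it.
    """
    flows = []
    for p, q, is_dynamic in zip(points_t0, points_t1, dynamic_mask):
        if not is_dynamic:
            flows.append((0.0, 0.0, 0.0))
            continue
        p_rigid = apply_pose(R, t, p)
        flows.append(tuple(qi - ri for qi, ri in zip(q, p_rigid)))
    return flows

# Camera moves forward by 1 unit, so static points appear 1 unit closer.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, -1.0)
points_t0 = [(1.0, 0.0, 5.0), (0.0, 0.0, 4.0)]
points_t1 = [(1.0, 0.0, 4.0), (0.5, 0.0, 3.0)]
mask = [False, True]  # second point belongs to a moving object
flows = scene_flow(points_t0, points_t1, R, t, mask)
```

The second point moved 0.5 units sideways on its own; the rest of its apparent motion is explained by the camera.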

79.

Granted invention patent
Three-dimensional (3D) pose estimation from a monocular camera (valid)

Publication (announcement) number: US11488418B2

Publication (announcement) date: 2022-11-01

Application number: US17135697

Application date: 2020-12-28

Applicant: NVIDIA Corporation

Inventor: Umar Iqbal , Pavlo Molchanov , Thomas Michael Breuel , Jan Kautz

IPC: G06V40/20 , G06N3/08 , G06T7/73 , G06N5/04 , G06T7/579 , G06V40/10

Abstract: Estimating a three-dimensional (3D) pose of an object, such as a hand or body (human, animal, robot, etc.), from a 2D image is necessary for human-computer interaction. A hand pose can be represented by a set of points in 3D space, called keypoints. Two coordinates (x,y) represent spatial displacement and a third coordinate represents a depth of every point with respect to the camera. A monocular camera is used to capture an image of the 3D pose, but does not capture depth information. A neural network architecture is configured to generate a depth value for each keypoint in the captured image, even when portions of the pose are occluded, or the orientation of the object is ambiguous. Generation of the depth values enables estimation of the 3D pose of the object.
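Note: once a depth value exists for each keypoint, the keypoints can be lifted to 3D. The Python sketch below assumes a pinhole camera model with known intrinsics (`fx`, `fy`, `cx`, `cy`), which the abstract does not mention, and uses hard-coded depths in place of network predictions; the function name is hypothetical.

```python
def backproject(keypoints_2d, depths, fx, fy, cx, cy):
    """Lift 2D keypoints (pixels) with per-keypoint depth to 3D camera coordinates.

    Assumes a pinhole camera: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    points = []
    for (u, v), z in zip(keypoints_2d, depths):
        points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

# Two keypoints: one at the image center, one 100 pixels to its right.
pose_3d = backproject(
    [(320.0, 240.0), (420.0, 240.0)],
    [0.5, 1.0],
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
```

The keypoint at the principal point lies on the optical axis, so only its depth is non-zero.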

80.

Invention application
THREE-DIMENSIONAL OBJECT RECONSTRUCTION FROM A VIDEO (valid)

Publication (announcement) number: US20220270318A1

Publication (announcement) date: 2022-08-25

Application number: US17734244

Application date: 2022-05-02

Applicant: NVIDIA Corporation

Inventor: Xueting Li , Sifei Liu , Kihwan Kim , Shalini De Mello , Jan Kautz

IPC: G06T15/04 , G06T7/579 , G06T7/70 , G06T17/20 , G06T15/20

Abstract: A three-dimensional (3D) object reconstruction neural network system learns to predict a 3D shape representation of an object from a video that includes the object. The 3D reconstruction technique may be used for content creation, such as generation of 3D characters for games, movies, and 3D printing. When 3D characters are generated from video, the content may also include motion of the character, as predicted based on the video. The 3D object construction technique exploits temporal consistency to reconstruct a dynamic 3D representation of the object from an unlabeled video. Specifically, an object in a video has a consistent shape and consistent texture across multiple frames. Texture, base shape, and part correspondence invariance constraints may be applied to fine-tune the neural network system. The reconstruction technique generalizes well—particularly for non-rigid objects.
