CONVOLUTIONAL STRUCTURED STATE SPACE MODEL
    Invention Publication

    Publication No.: US20240127041A1

    Publication Date: 2024-04-18

    Application No.: US18452714

    Filing Date: 2023-08-21

    CPC classification number: G06N3/0464 G06F17/16 G06N3/049

    Abstract: Systems and methods are disclosed related to a convolutional structured state space model (ConvSSM), which has a tensor-structured state but a continuous-time parameterization and linear state updates. The linearity may be exploited to use parallel scans for subquadratic parallelization across the spatiotemporal sequence. The ConvSSM effectively models long-range dependencies and, when followed by a nonlinear operation, forms a spatiotemporal layer (ConvS5) that does not require compressing frames into tokens, can be efficiently parallelized across the sequence, provides an unbounded context, and enables fast autoregressive generation.
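
    The linear update lends itself to a short sketch. Below is a minimal, hypothetical PyTorch version of the recurrence (module name, kernel size, and channel counts are illustrative assumptions, not the patent's specification); the loop is a sequential reference, while the linearity noted in the abstract is what would permit a parallel scan over time:

        import torch
        import torch.nn as nn

        class ConvSSMCell(nn.Module):
            # Linear state update x_t = A(x_{t-1}) + B(u_t) with convolutional
            # operators A and B, and a convolutional readout C. The update is
            # linear in the state, so it is associative and could be computed
            # with a parallel scan instead of this loop.
            def __init__(self, state_ch, in_ch, out_ch, k=3):
                super().__init__()
                self.conv_a = nn.Conv2d(state_ch, state_ch, k, padding=k // 2, bias=False)
                self.conv_b = nn.Conv2d(in_ch, state_ch, k, padding=k // 2, bias=False)
                self.conv_c = nn.Conv2d(state_ch, out_ch, k, padding=k // 2, bias=False)

            def forward(self, u_seq):  # u_seq: (T, B, in_ch, H, W)
                b, h, w = u_seq.shape[1], u_seq.shape[3], u_seq.shape[4]
                x = u_seq.new_zeros(b, self.conv_a.in_channels, h, w)
                ys = []
                for u_t in u_seq:
                    x = self.conv_a(x) + self.conv_b(u_t)  # linear state update
                    ys.append(self.conv_c(x))              # readout; ConvS5 would apply a nonlinearity here
                return torch.stack(ys)                     # (T, B, out_ch, H, W)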

    ESTIMATING FACIAL EXPRESSIONS USING FACIAL LANDMARKS

    Publication No.: US20230144458A1

    Publication Date: 2023-05-11

    Application No.: US18051209

    Filing Date: 2022-10-31

    CPC classification number: G06V40/174 G06V40/171 G06V40/165 G06V10/82 G06T13/40

    Abstract: In examples, locations of facial landmarks may be applied to one or more machine learning models (MLMs) to generate output data indicating profiles corresponding to facial expressions, such as facial action coding system (FACS) values. The output data may be used to determine geometry of a model. For example, video frames depicting one or more faces may be analyzed to determine the locations. The facial landmarks may be normalized, then applied to the MLM(s) to infer the profile(s), which may then be used to animate the model for expression retargeting from the video. The MLM(s) may include sub-networks that each analyze a set of input data corresponding to a region of the face to determine profiles that correspond to the region. The profiles from the sub-networks, along with global locations of facial landmarks, may be used by a subsequent network to infer the profiles for the overall face.
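
    The region-wise design above can be sketched as a set of small sub-networks fused by a final head. Region groupings, layer widths, and the FACS output count below are illustrative assumptions, not values from the patent:

        import torch
        import torch.nn as nn

        class RegionFACSNet(nn.Module):
            # One sub-network per face region over normalized landmark
            # coordinates; a final head fuses region features with global
            # landmark locations to predict FACS-style profile values.
            def __init__(self, region_dims, global_dim, n_facs=52, hidden=128):
                super().__init__()
                self.subnets = nn.ModuleList(
                    nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
                    for d in region_dims
                )
                fused = hidden * len(region_dims) + global_dim
                self.head = nn.Sequential(
                    nn.Linear(fused, hidden), nn.ReLU(), nn.Linear(hidden, n_facs)
                )

            def forward(self, regions, global_landmarks):
                # regions: list of (B, d_i) normalized landmarks per region
                # global_landmarks: (B, global_dim) flattened global locations
                per_region = [net(r) for net, r in zip(self.subnets, regions)]
                return self.head(torch.cat(per_region + [global_landmarks], dim=-1))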

    SINGLE-IMAGE INVERSE RENDERING
    Invention Application

    Publication No.: US20230081641A1

    Publication Date: 2023-03-16

    Application No.: US17551046

    Filing Date: 2021-12-14

    Abstract: A single two-dimensional (2D) image can be used as input to obtain a three-dimensional (3D) representation of the 2D image. This is done by extracting features from the 2D image with an encoder and determining a 3D representation of the 2D image utilizing a trained 2D convolutional neural network (CNN). Volumetric rendering is then run on the 3D representation to combine features along one or more viewing directions, and the combined features are provided as input to a multilayer perceptron (MLP) that predicts and outputs color (or multi-dimensional neural features) and density values for each point within the 3D representation. As a result, single-image inverse rendering may be performed using only a single 2D image as input to create a corresponding 3D representation of the scene in the single 2D image.
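
    The volumetric rendering step can be illustrated with standard emission-absorption compositing along a ray; this generic sketch shows how per-sample densities and features combine into a rendered value (a textbook formulation, not necessarily the patent's exact one):

        import torch

        def composite_along_ray(density, features, deltas):
            # density:  (R, S)    non-negative density per ray sample
            # features: (R, S, C) color or neural features per ray sample
            # deltas:   (R, S)    spacing between consecutive samples
            alpha = 1.0 - torch.exp(-density * deltas)          # per-sample opacity
            trans = torch.cumprod(
                torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1),
                dim=-1,
            )[:, :-1]                                           # transmittance up to each sample
            weights = alpha * trans                             # contribution of each sample
            return (weights.unsqueeze(-1) * features).sum(dim=-2)  # (R, C) rendered output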

    Learning affinity via a spatial propagation neural network

    Publication No.: US10762425B2

    Publication Date: 2020-09-01

    Application No.: US16134716

    Filing Date: 2018-09-18

    Abstract: A spatial linear propagation network (SLPN) system learns the affinity matrix for vision tasks. An affinity matrix is a generic matrix that defines the similarity of two points in space. The SLPN system is trained for a particular computer vision task and refines an input map (i.e., affinity matrix) that indicates pixels that share a particular property (e.g., color, object, texture, shape, etc.). Inputs to the SLPN system are input data (e.g., pixel values for an image) and the input map corresponding to the input data to be propagated. The input data is processed to produce task-specific affinity values (guidance data). The task-specific affinity values are applied to values in the input map, with at least two weighted values from each column contributing to a value in the refined map data for the adjacent column.
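
    One directional pass of the propagation can be sketched directly from the description: each column of the refined map mixes the input column with weighted values from the previously refined column. The tensor shapes and the three-neighbor layout below are illustrative assumptions:

        import torch

        def propagate_left_to_right(x, w):
            # x: (B, C, H, W) input map to refine
            # w: (B, 3, H, W) affinity weights toward the three vertically
            #    adjacent pixels (above, same row, below) of the left column
            cols = [x[..., 0]]
            for t in range(1, x.shape[-1]):
                prev = cols[-1]                                        # (B, C, H)
                top = torch.cat([prev[:, :, :1], prev[:, :, :-1]], 2)  # neighbor above
                bot = torch.cat([prev[:, :, 1:], prev[:, :, -1:]], 2)  # neighbor below
                wt = w[..., t]                                         # (B, 3, H)
                lam = wt.sum(dim=1, keepdim=True)                      # total propagated weight
                cols.append(
                    (1 - lam) * x[..., t]
                    + wt[:, 0:1] * top + wt[:, 1:2] * prev + wt[:, 2:3] * bot
                )
            return torch.stack(cols, dim=-1)                           # refined (B, C, H, W)

    A full system would presumably run such scans in multiple directions and combine the results; the single left-to-right pass above shows how weighted values from one column contribute to the adjacent column.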

    NEURAL HEAD AVATAR CONSTRUCTION FROM AN IMAGE

    Publication No.: US20240404174A1

    Publication Date: 2024-12-05

    Application No.: US18653723

    Filing Date: 2024-05-02

    Abstract: Systems and methods are disclosed that animate a source portrait image with motion (i.e., pose and expression) from a target image. In contrast to conventional systems, given an unseen single-view portrait image, an implicit three-dimensional (3D) head avatar is constructed that not only captures photo-realistic details within and beyond the face region, but also is readily available for animation without requiring further optimization during inference. In an embodiment, three processing branches of a system produce three tri-planes representing coarse 3D geometry for the head avatar, detailed appearance of a source image, as well as the expression of a target image. By applying volumetric rendering to a combination of the three tri-planes, an image of the desired identity, expression and pose is generated.
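
    A generic tri-plane lookup illustrates the representation the three branches produce. Summing the three planar features per point, and summing the three branch tri-planes before rendering, are common conventions assumed here for illustration rather than details confirmed by the abstract:

        import torch
        import torch.nn.functional as F

        def sample_triplane(planes, pts):
            # planes: (3, C, R, R) feature planes for xy, xz, and yz projections
            # pts:    (N, 3)       query points in [-1, 1]^3
            coords = torch.stack(
                [pts[:, [0, 1]], pts[:, [0, 2]], pts[:, [1, 2]]]
            )                                                            # (3, N, 2)
            feats = F.grid_sample(
                planes, coords.unsqueeze(2), align_corners=True
            )                                                            # (3, C, N, 1)
            return feats.squeeze(-1).sum(dim=0).transpose(0, 1)          # (N, C)

        # Combining the branches could be as simple as summing their planes:
        # feats = sample_triplane(geometry_planes + appearance_planes + expression_planes, pts)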

    SYNTHETIC DATASET GENERATOR
    Invention Publication

    Publication No.: US20240127075A1

    Publication Date: 2024-04-18

    Application No.: US18212629

    Filing Date: 2023-06-21

    CPC classification number: G06N3/0985

    Abstract: Machine learning is a process that learns a model from a given dataset, where the model can then be used to make a prediction about new data. In order to reduce the costs associated with collecting and labeling real-world datasets for use in training the model, computer processes can synthetically generate datasets which simulate real-world data. The present disclosure improves the effectiveness of such synthetic datasets for training machine learning models used in real-world applications, in particular by generating a synthetic dataset that is specifically targeted to a specified downstream task (e.g., a particular computer vision task, a particular natural language processing task, etc.).
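
    One concrete way to target a downstream task is to search over the generator's settings and score each candidate dataset by downstream performance on a small real validation set. The sketch below is a plain random search; generate and train_and_eval are hypothetical placeholders for the synthetic generator and the downstream training/evaluation pipeline, not interfaces from the patent:

        import random

        def tune_generator(param_space, generate, train_and_eval, n_trials=20):
            # param_space:    dict of parameter name -> list of candidate values
            # generate:       params -> synthetic dataset
            # train_and_eval: dataset -> validation score on the real downstream task
            best_params, best_score = None, float("-inf")
            for _ in range(n_trials):
                params = {k: random.choice(v) for k, v in param_space.items()}
                score = train_and_eval(generate(params))  # train on synthetic, evaluate on real
                if score > best_score:
                    best_params, best_score = params, score
            return best_params, best_score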
