Patent search ap:("Nvidia Corporation") AND inv:"Jose Manuel Alvarez Lopez" Page 2

11.

Granted invention patent
Scalable semantic image retrieval with deep template matching (status: Active)

Publication number: US12272148B2

Publication date: 2025-04-08

Application number: US17226584

Application date: 2021-04-09

Applicant: Nvidia Corporation

Inventor: Donna Roy, Suraj Kothawade, Elmar Haussmann, Jose Manuel Alvarez Lopez, Michele Fenzi, Christoph Angerer

IPC: G06V20/56, G06F18/2113, G06F18/214, G06F18/22, G06N3/08, G06V30/262

Abstract: Approaches presented herein provide for semantic data matching, as may be useful for selecting data from a large unlabeled dataset to train a neural network. For an object detection use case, such a process can identify images within an unlabeled set even when an object of interest represents a relatively small portion of an image or there are many other objects in the image. A query image can be processed to extract image features or feature maps from only one or more regions of interest in that image, as may correspond to objects of interest. These features are compared with images in an unlabeled dataset, with similarity scores being calculated between the features of the region(s) of interest and individual images in the unlabeled set. One or more highest scored images can be selected as training images showing objects that are semantically similar to the object in the query image.
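
The scoring step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the cosine similarity measure, the max-over-locations pooling, and the names `cosine`, `image_score`, and `top_k` are all assumptions chosen for illustration, and the feature vectors are toy values rather than outputs of a real network.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def image_score(query_feature, feature_map):
    """Score an unlabeled image by its best-matching spatial location.

    feature_map is a list of per-location feature vectors (assumed layout).
    Taking the max lets a small object dominate the score even when most
    of the image shows something else.
    """
    return max(cosine(query_feature, cell) for cell in feature_map)

def top_k(query_feature, dataset, k):
    """Return the names of the k highest-scoring images in the dataset."""
    scored = [(image_score(query_feature, fmap), name)
              for name, fmap in dataset.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy data: the query is the feature of one region of interest.
query = [1, 0]
unlabeled = {
    "a": [[1, 0], [0, 1]],
    "b": [[-1, 0], [0, -1]],
    "c": [[1, 1]],
}
selected = top_k(query, unlabeled, 2)
```

Scoring against individual locations, rather than against one pooled vector per image, is one way to keep a small region of interest from being averaged away by the rest of the scene.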

12.

Invention publication
GENERATING GLOBAL HIERARCHICAL SELF-ATTENTION (status: Under examination, published)

Publication number: US20240185034A1

Publication date: 2024-06-06

Application number: US18130648

Application date: 2023-04-04

Applicant: NVIDIA Corporation

Inventor: Ali Hatamizadeh, Gregory Heinrich, Hongxu Yin, Jose Manuel Alvarez Lopez, Jan Kautz, Pavlo Molchanov

IPC: G06N3/0455, G06N3/0464, G06N3/08

CPC classification number: G06N3/0455, G06N3/0464, G06N3/08

Abstract: Apparatuses, systems, and techniques of using one or more machine learning processes (e.g., neural network(s)) to process data (e.g., using hierarchical self-attention). In at least one embodiment, image data is classified using hierarchical self-attention generated using carrier tokens that are associated with windowed subregions of the image data, and local attention generated using local tokens within the windowed subregions and the carrier tokens.

13.

Invention application
NEURAL NETWORK TRAINING TECHNIQUE (status: Active)

Publication number: US20220284283A1

Publication date: 2022-09-08

Application number: US17195451

Application date: 2021-03-08

Applicant: NVIDIA Corporation

Inventor: Hongxu Yin, Pavlo Molchanov, Jose Manuel Alvarez Lopez, Xin Dong

IPC: G06N3/08, G06N3/04

Abstract: Apparatuses, systems, and techniques to invert a neural network. In at least one embodiment, one or more neural network layers are inverted and, in at least one embodiment, loaded in reverse order.

14.

Invention application
TECHNIQUES TO IDENTIFY DATA USED TO TRAIN ONE OR MORE NEURAL NETWORKS (status: Active)

Publication number: US20220284232A1

Publication date: 2022-09-08

Application number: US17188397

Application date: 2021-03-01

Applicant: NVIDIA Corporation

Inventor: Hongxu Yin, Arun Mallya, Arash Vahdat, Jose Manuel Alvarez Lopez, Jan Kautz, Pavlo Molchanov

IPC: G06K9/62, G06K9/66

Abstract: Apparatuses, systems, and techniques to identify one or more images used to train one or more neural networks. In at least one embodiment, one or more images used to train one or more neural networks are identified, based on, for example, one or more labels of one or more objects within the one or more images.

15.

Invention application
BI-DIRECTIONAL FEATURE PROJECTION FOR 3D PERCEPTION SYSTEMS AND APPLICATIONS (status: Active)

Publication number: US20240378799A1

Publication date: 2024-11-14

Application number: US18642531

Application date: 2024-04-22

Applicant: NVIDIA Corporation

Inventor: Zhiqi Li, Zhiding Yu, Animashree Anandkumar, Jose Manuel Alvarez Lopez

IPC: G06T15/20, G06T7/11, G06T7/50, G06V20/58

Abstract: In various examples, bi-directional projection techniques may be used to generate enhanced Bird's-Eye View (BEV) representations. For example, a system(s) may generate one or more BEV features associated with a BEV of an environment using a projection process that associates 2D image features to one or more first locations of a 3D space. At least partially using the BEV feature(s), the system(s) may determine one or more second locations of the 3D space that correspond to one or more regions of interest in the environment. The system(s) may then generate one or more additional BEV features corresponding to the second location(s) using a different projection process that associates the second location(s) from the 3D space to at least a portion of the 2D image features. The system(s) may then generate an updated BEV of the environment based at least on the BEV feature(s) and/or the additional BEV feature(s).

16.

Invention publication
SYNTHETIC DATA GENERATION USING VIEWPOINT AUGMENTATION FOR AUTONOMOUS SYSTEMS AND APPLICATIONS (status: Under examination, published)

Publication number: US20240362897A1

Publication date: 2024-10-31

Application number: US18634134

Application date: 2024-04-12

Applicant: NVIDIA Corporation

Inventor: Tzofi Klinghoffer, Jonah Philion, Zan Gojcic, Sanja Fidler, Or Litany, Wenzheng Chen, Jose Manuel Alvarez Lopez

IPC: G06V10/774, G06T7/55, G06T15/20

CPC classification number: G06V10/774, G06T7/55, G06T15/205, G06T2207/10016, G06T2207/20081, G06T2207/20084, G06T2207/30181, G06T2207/30252

Abstract: In various examples, systems and methods are disclosed relating to synthetic data generation using viewpoint augmentation for autonomous and semi-autonomous systems and applications. One or more circuits can identify a set of sequential images corresponding to a first viewpoint and generate a first transformed image corresponding to a second viewpoint using a first image of the set of sequential images as input to a machine-learning model. The one or more circuits can update the machine-learning model based at least on a loss determined according to the first transformed image and a second image of the set of sequential images.

17.

Invention publication
CLASS AGNOSTIC OBJECT MASK GENERATION (status: Under examination, published)

Publication number: US20240169545A1

Publication date: 2024-05-23

Application number: US18355856

Application date: 2023-07-20

Applicant: NVIDIA Corporation

Inventor: Shiyi Lan, Zhiding Yu, Subhashree Radhakrishnan, Jose Manuel Alvarez Lopez, Animashree Anandkumar

IPC: G06T7/11, G06T1/20

CPC classification number: G06T7/11, G06T1/20, G06T2207/20081, G06T2207/20084, G06T2207/20132

Abstract: Class agnostic object mask generation uses a vision transformer-based auto-labeling framework requiring only images and object bounding boxes to generate object (segmentation) masks. The generated object masks, images, and object labels may then be used to train instance segmentation models or other neural networks to localize and segment objects with pixel-level accuracy. The generated object masks may supplement or replace conventional human generated annotations. The human generated annotations may be misaligned compared with the object boundaries, resulting in poor quality labeled segmentation masks. In contrast with conventional techniques, the generated object masks are class agnostic and are automatically generated based only on a bounding box image region without relying on either labels or semantic information.

18.

Invention publication
SPARSE VOXEL TRANSFORMER FOR CAMERA-BASED 3D SEMANTIC SCENE COMPLETION (status: Under examination, published)

Publication number: US20240087222A1

Publication date: 2024-03-14

Application number: US18515016

Application date: 2023-11-20

Applicant: NVIDIA Corporation

Inventor: Yiming Li, Zhiding Yu, Christopher B. Choy, Chaowei Xiao, Jose Manuel Alvarez Lopez, Sanja Fidler, Animashree Anandkumar

IPC: G06T17/00, B60W50/14, G06T3/40, G06V10/44, G06V10/771, G06V10/82

CPC classification number: G06T17/00, B60W50/14, G06T3/40, G06V10/44, G06V10/771, G06V10/82

Abstract: An artificial intelligence framework is described that incorporates a number of neural networks and a number of transformers for converting a two-dimensional image into three-dimensional semantic information. Neural networks convert one or more images into a set of image feature maps, depth information associated with the one or more images, and query proposals based on the depth information. A first transformer implements a cross-attention mechanism to process the set of image feature maps in accordance with the query proposals. The output of the first transformer is combined with a mask token to generate initial voxel features of the scene. A second transformer implements a self-attention mechanism to convert the initial voxel features into refined voxel features, which are up-sampled and processed by a lightweight neural network to generate the three-dimensional semantic information, which may be used by, e.g., an autonomous vehicle for various advanced driver assistance system (ADAS) functions.

19.

Invention publication
ESTIMATING OPTIMAL TRAINING DATA SET SIZE FOR MACHINE LEARNING MODEL SYSTEMS AND APPLICATIONS (status: Under examination, published)

Publication number: US20230385687A1

Publication date: 2023-11-30

Application number: US17828663

Application date: 2022-05-31

Applicant: NVIDIA Corporation

Inventor: Rafid Reza Mahmood, James Robert Lucas, David Jesus Acuna Marrero, Daiqing Li, Jonah Philion, Jose Manuel Alvarez Lopez, Zhiding Yu, Sanja Fidler, Marc Law

IPC: G06N20/00, G06K9/62

CPC classification number: G06N20/00, G06K9/6265

Abstract: Approaches for training data set size estimation for machine learning model systems and applications are described. Examples include a machine learning model training system that estimates target data requirements for training a machine learning model, given an approximate relationship between training data set size and model performance using one or more validation score estimation functions. To derive a validation score estimation function, a regression data set is generated from training data, and subsets of the regression data set are used to train the machine learning model. A validation score is computed for the subsets and used to compute regression function parameters to curve fit the selected regression function to the training data set. The validation score estimation function is then solved for and provides an output of an estimate of the number of additional training samples needed for the validation score estimation function to meet or exceed a target validation score.
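
The fit-then-solve procedure in this abstract can be sketched as follows. The abstract does not name a regression function, so the power law `1 - score = c * n**(-alpha)` used here is an assumption, as are the helper names `fit_power_law` and `additional_samples_needed` and the toy validation scores.

```python
import math

def fit_power_law(sizes, scores):
    """Fit 1 - score = c * n**(-alpha) by least squares in log-log space.

    sizes are subset sizes; scores are validation scores in (0, 1)
    obtained by training on those subsets. Returns (c, alpha).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(1.0 - s) for s in scores]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    alpha = -slope
    c = math.exp(mean_y - slope * mean_x)
    return c, alpha

def additional_samples_needed(sizes, scores, target, current_size):
    """Solve the fitted curve for the target score and subtract what we have."""
    c, alpha = fit_power_law(sizes, scores)
    n_target = (c / (1.0 - target)) ** (1.0 / alpha)
    return max(0, math.ceil(n_target) - current_size)

# Toy validation scores measured on three subsets of a 1600-sample set.
subset_sizes = [100, 400, 1600]
val_scores = [0.80, 0.90, 0.95]
extra = additional_samples_needed(subset_sizes, val_scores,
                                  target=0.975, current_size=1600)
```

In practice the estimate is only as good as the assumed functional form, so extrapolating far beyond the largest measured subset deserves caution.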

20.

Invention publication
ROBUST VISION TRANSFORMERS (status: Under examination, published)

Publication number: US20230290135A1

Publication date: 2023-09-14

Application number: US18119770

Application date: 2023-03-09

Applicant: NVIDIA Corporation

Inventor: Daquan Zhou, Zhiding Yu, Enze Xie, Anima Anandkumar, Chaowei Xiao, Jose Manuel Alvarez Lopez

IPC: G06V10/82, G06V10/77, G06V10/778, G06V10/30

CPC classification number: G06V10/82, G06V10/7715, G06V10/778, G06V10/30

Abstract: Apparatuses, systems, and techniques to generate a robust representation of an image. In at least one embodiment, input tokens of an input image are received, and an inference about the input image is generated based on a vision transformer (ViT) system comprising at least one self-attention module to perform token mixing and a channel self-attention module to perform channel processing.
