-
Publication No.: US20240029836A1
Publication date: 2024-01-25
Application No.: US18353773
Filing date: 2023-07-17
Applicant: NVIDIA Corporation
Inventor: Weili Nie , Zichao Wang , Chaowei Xiao , Animashree Anandkumar
Abstract: A machine learning framework is described for performing generation of candidate molecules for, e.g., drug discovery or other applications. The framework utilizes a pre-trained encoder-decoder model to interface between representations of molecules and embeddings for those molecules in a latent space. A fusion module is located between the encoder and decoder and is used to fuse an embedding for an input molecule with embeddings for one or more exemplary molecules selected from a database that is constructed according to design criteria. The fused embedding is decoded using the decoder to generate a candidate molecule. The fusion module is trained to reconstruct a nearest neighbor to the input molecule from the database based on the sample of exemplary molecules. An iterative approach may be used during inference to dynamically update the database to include newly generated candidate molecules.
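Illustration only: the fuse-then-update loop described in this abstract might be sketched as below. Everything in the sketch is an assumption made for illustration, not the patented method: `fuse` is a plain weighted average standing in for the trained fusion module, `score` is a hypothetical callback standing in for the design criteria, and the encoder and decoder steps are omitted so the loop operates directly on embeddings.

```python
from typing import Callable, List

Vector = List[float]


def fuse(input_emb: Vector, exemplar_embs: List[Vector], weight: float = 0.5) -> Vector:
    """Blend the input embedding with the mean of the exemplar embeddings.

    Hypothetical stand-in for a trained fusion module.
    """
    if not exemplar_embs:
        return list(input_emb)
    count = len(exemplar_embs)
    mean = [sum(e[i] for e in exemplar_embs) / count for i in range(len(input_emb))]
    return [(1 - weight) * x + weight * m for x, m in zip(input_emb, mean)]


def iterate(
    input_emb: Vector,
    database: List[Vector],
    score: Callable[[Vector], float],
    k: int = 2,
    rounds: int = 3,
) -> Vector:
    """Repeatedly fuse with the top-k exemplars and add each candidate to the database."""
    current = list(input_emb)
    for _ in range(rounds):
        # Assumption: exemplars are the k best-scoring database entries.
        exemplars = sorted(database, key=score, reverse=True)[:k]
        candidate = fuse(current, exemplars)
        # The database grows with each newly generated candidate.
        database.append(candidate)
        current = candidate
    return current
```

In this sketch the database is mutated in place, so later rounds can retrieve candidates produced in earlier rounds.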
-
Publication No.: US12159694B2
Publication date: 2024-12-03
Application No.: US18353773
Filing date: 2023-07-17
Applicant: NVIDIA Corporation
Inventor: Weili Nie , Zichao Wang , Chaowei Xiao , Animashree Anandkumar
IPC: G16C20/00 , G06N5/04 , G06N7/01 , G06N20/00 , G06N20/10 , G16C20/10 , G16C20/30 , G16C20/70 , G16C20/90
Abstract: A machine learning framework is described for performing generation of candidate molecules for, e.g., drug discovery or other applications. The framework utilizes a pre-trained encoder-decoder model to interface between representations of molecules and embeddings for those molecules in a latent space. A fusion module is located between the encoder and decoder and is used to fuse an embedding for an input molecule with embeddings for one or more exemplary molecules selected from a database that is constructed according to design criteria. The fused embedding is decoded using the decoder to generate a candidate molecule. The fusion module is trained to reconstruct a nearest neighbor to the input molecule from the database based on the sample of exemplary molecules. An iterative approach may be used during inference to dynamically update the database to include newly generated candidate molecules.
-
Publication No.: US20240028673A1
Publication date: 2024-01-25
Application No.: US18180476
Filing date: 2023-03-08
Applicant: NVIDIA Corporation
Inventor: Chaowei Xiao , Yolong Cao , Danfei Xu , Animashree Anandkumar , Marco Pavone , Xinshuo Weng
CPC classification number: G06F21/14 , B60W60/0011
Abstract: In various examples, robust trajectory predictions against adversarial attacks in autonomous machines and applications are described herein. Systems and methods are disclosed that perform adversarial training for trajectory predictions determined using a neural network(s). In order to improve the training, the systems and methods may devise a deterministic attack that creates a deterministic gradient path within a probabilistic model to generate adversarial samples for training. Additionally, the systems and methods may introduce a hybrid objective that interleaves the adversarial training and learning from clean data to anchor the output from the neural network(s) on a stable, clean data distribution. Furthermore, the systems and methods may use a domain-specific data augmentation technique that generates diverse, realistic, and dynamically-feasible samples for additional training of the neural network(s).
-
Publication No.: US20230290057A1
Publication date: 2023-09-14
Application No.: US17691723
Filing date: 2022-03-10
Applicant: NVIDIA Corporation
Inventor: Yuke Zhu , Bokui Shen , Christopher Bongsoo Choy , Animashree Anandkumar
CPC classification number: G06T17/10 , G06N20/20 , G06T19/20 , G06T2219/2021
Abstract: One or more machine learning models (MLMs) may learn implicit 3D representations of geometry of an object and of dynamics of the object from performing an action on the object. Implicit neural representations may be used to reconstruct high-fidelity full geometry of the object and predict a flow-based dynamics field from one or more images, which may provide a partial view of the object. Correspondences between locations of an object may be learned based at least on distances between the locations on a surface corresponding to the object, such as geodesic distances. The distances may be incorporated into a contrastive learning loss function to train one or more MLMs to learn correspondences between locations of the object, such as a correspondence embedding field. The correspondences may be used to evaluate state changes when evaluating one or more actions that may be performed on the object.
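Illustration only: one way a surface distance could enter a contrastive loss, as the abstract describes, is a pairwise margin loss in which the geodesic distance decides whether two surface locations count as a positive or a negative pair. The function name, the `near` threshold, and the `margin` value are hypothetical choices for this sketch and are not taken from the patent.

```python
import math
from typing import Sequence


def contrastive_loss(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    geodesic: float,
    near: float = 0.1,
    margin: float = 1.0,
) -> float:
    """Pairwise contrastive loss gated by geodesic distance (hypothetical form).

    Locations that are close on the surface are pulled together in embedding
    space; locations that are far apart are pushed at least `margin` apart.
    """
    d = math.dist(emb_a, emb_b)  # Euclidean distance between the embeddings
    if geodesic <= near:
        return d ** 2  # positive pair: penalize any separation
    return max(0.0, margin - d) ** 2  # negative pair: penalize only if too close
```

For example, two embeddings that are 0.5 apart incur a penalty as a negative pair (they are inside the margin) and a penalty of the same size as a positive pair, while embeddings 5.0 apart incur no penalty as a negative pair.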
-
Publication No.: US20240412491A1
Publication date: 2024-12-12
Application No.: US18207953
Filing date: 2023-06-09
Applicant: NVIDIA Corporation
Inventor: Shagan Sah , Nishant Puri , Yuzhuo Ren , Rajath Bellipady Shetty , Weili Nie , Arash Vahdat , Animashree Anandkumar
IPC: G06V10/776 , G06N3/094 , G06T11/00 , G06V10/75 , G06V10/774 , G06V10/82 , G06V40/16
Abstract: Apparatuses, systems, and techniques use one or more first neural networks to generate synthetic data to train one or more second neural networks based, at least in part, on one or more performance metrics of the one or more second neural networks.
-
Publication No.: US12165258B2
Publication date: 2024-12-10
Application No.: US17691723
Filing date: 2022-03-10
Applicant: NVIDIA Corporation
Inventor: Yuke Zhu , Bokui Shen , Christopher Bongsoo Choy , Animashree Anandkumar
Abstract: One or more machine learning models (MLMs) may learn implicit 3D representations of geometry of an object and of dynamics of the object from performing an action on the object. Implicit neural representations may be used to reconstruct high-fidelity full geometry of the object and predict a flow-based dynamics field from one or more images, which may provide a partial view of the object. Correspondences between locations of an object may be learned based at least on distances between the locations on a surface corresponding to the object, such as geodesic distances. The distances may be incorporated into a contrastive learning loss function to train one or more MLMs to learn correspondences between locations of the object, such as a correspondence embedding field. The correspondences may be used to evaluate state changes when evaluating one or more actions that may be performed on the object.
-
Publication No.: US20240378799A1
Publication date: 2024-11-14
Application No.: US18642531
Filing date: 2024-04-22
Applicant: NVIDIA Corporation
Inventor: Zhiqi Li , Zhiding Yu , Animashree Anandkumar , Jose Manuel Alvarez Lopez
Abstract: In various examples, bi-directional projection techniques may be used to generate enhanced Bird's-Eye View (BEV) representations. For example, a system(s) may generate one or more BEV features associated with a BEV of an environment using a projection process that associates 2D image features to one or more first locations of a 3D space. At least partially using the BEV feature(s), the system(s) may determine one or more second locations of the 3D space that correspond to one or more regions of interest in the environment. The system(s) may then generate one or more additional BEV features corresponding to the second location(s) using a different projection process that associates the second location(s) from the 3D space to at least a portion of the 2D image features. The system(s) may then generate an updated BEV of the environment based at least on the BEV feature(s) and/or the additional BEV feature(s).
-
Publication No.: US20240169545A1
Publication date: 2024-05-23
Application No.: US18355856
Filing date: 2023-07-20
Applicant: NVIDIA Corporation
Inventor: Shiyi Lan , Zhiding Yu , Subhashree Radhakrishnan , Jose Manuel Alvarez Lopez , Animashree Anandkumar
CPC classification number: G06T7/11 , G06T1/20 , G06T2207/20081 , G06T2207/20084 , G06T2207/20132
Abstract: Class agnostic object mask generation uses a vision transformer-based auto-labeling framework requiring only images and object bounding boxes to generate object (segmentation) masks. The generated object masks, images, and object labels may then be used to train instance segmentation models or other neural networks to localize and segment objects with pixel-level accuracy. The generated object masks may supplement or replace conventional human generated annotations. The human generated annotations may be misaligned compared with the object boundaries, resulting in poor quality labeled segmentation masks. In contrast with conventional techniques, the generated object masks are class agnostic and are automatically generated based only on a bounding box image region without relying on either labels or semantic information.
-
Publication No.: US20240087222A1
Publication date: 2024-03-14
Application No.: US18515016
Filing date: 2023-11-20
Applicant: NVIDIA Corporation
Inventor: Yiming Li , Zhiding Yu , Christopher B. Choy , Chaowei Xiao , Jose Manuel Alvarez Lopez , Sanja Fidler , Animashree Anandkumar
Abstract: An artificial intelligence framework is described that incorporates a number of neural networks and a number of transformers for converting a two-dimensional image into three-dimensional semantic information. Neural networks convert one or more images into a set of image feature maps, depth information associated with the one or more images, and query proposals based on the depth information. A first transformer implements a cross-attention mechanism to process the set of image feature maps in accordance with the query proposals. The output of the first transformer is combined with a mask token to generate initial voxel features of the scene. A second transformer implements a self-attention mechanism to convert the initial voxel features into refined voxel features, which are up-sampled and processed by a lightweight neural network to generate the three-dimensional semantic information, which may be used by, e.g., an autonomous vehicle for various advanced driver assistance system (ADAS) functions.
-
Publication No.: US20240265690A1
Publication date: 2024-08-08
Application No.: US18544840
Filing date: 2023-12-19
Applicant: NVIDIA Corporation
Inventor: Animashree Anandkumar , Linxi Fan , Zhiding Yu , Chaowei Xiao , Shikun Liu
CPC classification number: G06V10/82 , G06V10/811
Abstract: A vision-language model learns skills and domain knowledge via distinct and separate task-specific neural networks, referred to as experts. Each expert is independently optimized for a specific task, facilitating the use of domain-specific data and architectures that are not feasible with a single large neural network trained for multiple tasks. The vision-language model is implemented as an ensemble of pre-trained experts and is more efficiently trained compared with the single large neural network. During training, the vision-language model integrates specialized skills and domain knowledge, rather than trying to simultaneously learn multiple tasks, resulting in effective multi-modal learning.