-
Publication number: US20250078489A1
Publication date: 2025-03-06
Application number: US18542423
Application date: 2023-12-15
Applicant: NVIDIA CORPORATION
Inventor: Bingyin ZHAO, Jose Manuel ALVAREZ LOPEZ, Anima ANANDKUMAR, Shi Yi LAN, Zhiding YU
Abstract: One embodiment of the present invention sets forth a technique for training an image classifier. The technique includes training a first vision transformer model to generate patch labels for corresponding image patches of images, converting the patch labels to token labels, and training a second vision transformer model to classify images based on the token labels.
-
Publication number: US20240013504A1
Publication date: 2024-01-11
Application number: US17977884
Application date: 2022-10-31
Applicant: NVIDIA CORPORATION
Inventor: Zhiding YU, Boyi LI, Chaowei XIAO, De-An HUANG, Weili NIE, Linxi FAN, Anima ANANDKUMAR
IPC: G06V10/26, G06V10/774, G06V10/77, G06V10/80, G06F40/284
CPC classification number: G06V10/26, G06V10/774, G06V10/7715, G06V10/80, G06F40/284
Abstract: One embodiment of a method for training a machine learning model includes receiving a training data set that includes at least one image, text referring to at least one object included in the at least one image, and at least one bounding box annotation associated with the at least one object, and performing, based on the training data set, one or more operations to generate a trained machine learning model to segment images based on text, where the one or more operations to generate the trained machine learning model include minimizing a loss function that comprises at least one of a multiple instance learning loss term or an energy loss term.
-
Publication number: US20240386586A1
Publication date: 2024-11-21
Application number: US18320265
Application date: 2023-05-19
Applicant: NVIDIA Corporation
Inventor: Alperen DEGIRMENCI, Jiwoong CHOI, Zhiding YU, Ke CHEN, Shubhranshu SINGH, Yashar ASGARIEH, Subhashree RADHAKRISHNAN, James SKINNER, Jose Manuel ALVAREZ LOPEZ
Abstract: In various examples, systems and methods are disclosed relating to using neural networks for object detection or instance/semantic segmentation for, without limitation, autonomous or semi-autonomous systems and applications. In some implementations, one or more neural networks receive an image (or other sensor data representation) and a bounding shape corresponding to at least a portion of an object in the image. The bounding shape can include or be labeled with an identifier, class, and/or category of the object. The neural network can determine a mask for the object based at least on processing the image and the bounding shape. The mask can be used for various applications, such as annotating masks for vehicle or machine perception and navigation processes.
-