-
Publication number: US11948078B2
Publication date: 2024-04-02
Application number: US17000048
Application date: 2020-08-21
Applicant: Nvidia Corporation
Inventor: Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz
IPC: G06N3/08, G06F18/214, G06F18/22, G06V10/74, G06V10/82, G06V30/19, G06V30/262
CPC classification number: G06N3/08, G06F18/2148, G06F18/22, G06V10/761, G06V10/82, G06V30/1916, G06V30/19173, G06V30/274
Abstract: The disclosure provides a framework or system for learning visual representation using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
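Illustration (not part of the patent record): the abstract describes training a critic function on image and text embeddings using mutual information, but does not say here which estimator is used. The sketch below assumes one common choice, the InfoNCE lower bound, together with a bilinear critic. The function names, the `weight` matrix, and the synthetic embeddings are all hypothetical and stand in for the outputs of real image and text representation models.

```python
import numpy as np


def critic_scores(image_emb, text_emb, weight):
    """Hypothetical bilinear critic: score[i, j] = image_emb[i] @ weight @ text_emb[j]."""
    return image_emb @ weight @ text_emb.T


def info_nce_bound(scores):
    """InfoNCE lower bound on mutual information for N matched pairs.

    scores[i, j] is the critic value for image i and text j;
    matched image/text pairs sit on the diagonal.
    """
    n = scores.shape[0]
    # Numerically stable log-softmax over each row (one image vs. all texts).
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # The bound is log(N) plus the mean log-probability of the true pairs.
    return float(np.log(n) + np.mean(np.diag(log_softmax)))


# Synthetic stand-ins for embeddings (assumption: 8 pairs, 16 dimensions).
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(8, 16))
text_emb = image_emb + 0.1 * rng.normal(size=(8, 16))
weight = np.eye(16)  # a trained critic would learn this matrix

bound = info_nce_bound(critic_scores(image_emb, text_emb, weight))
print(f"InfoNCE bound: {bound:.3f} (cannot exceed log N = {np.log(8):.3f})")
```

In a training loop under these assumptions, the negative of this bound would serve as the loss, and gradient ascent on the bound would update the critic parameters. The bound is capped at log N, so larger batches allow a tighter estimate.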
-
Publication number: US20210056353A1
Publication date: 2021-02-25
Application number: US17000048
Application date: 2020-08-21
Applicant: Nvidia Corporation
Inventor: Arash Vahdat, Tanmay Gupta, Xiaodong Yang, Jan Kautz
Abstract: The disclosure provides a framework or system for learning visual representation using a large set of image/text pairs. The disclosure provides, for example, a method of visual representation learning, a joint representation learning system, and an artificial intelligence (AI) system that employs one or more of the trained models from the method or system. The AI system can be used, for example, in autonomous or semi-autonomous vehicles. In one example, the method of visual representation learning includes: (1) receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model, and (2) training, employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
-