-
1.
Publication No.: US20240119361A1
Publication Date: 2024-04-11
Application No.: US18348286
Filing Date: 2023-07-06
Applicant: NVIDIA CORPORATION
Inventor: Hongxu YIN, Wonmin BYEON, Jan KAUTZ, Divyam MADAAN, Pavlo MOLCHANOV
IPC: G06N20/00
CPC classification number: G06N20/00
Abstract: One embodiment of a method for training a first machine learning model having a different architecture than a second machine learning model includes receiving a first data set, performing one or more operations to generate a second data set based on the first data set and the second machine learning model, wherein the second data set includes at least one feature associated with one or more tasks that the second machine learning model was previously trained to perform, and performing one or more operations to train the first machine learning model based on the second data set and the second machine learning model.
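The method above amounts to distilling a previously trained teacher model into a student with a different architecture via a teacher-derived second data set. A minimal numpy sketch of that pattern (the fixed random MLP "teacher", the linear "student", and squared-error feature matching are all illustrative assumptions, not details from the application):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "second model" (teacher): a fixed random MLP that maps
# inputs to features for tasks it was previously trained to perform.
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
def teacher(x):
    return np.tanh(x @ W1) @ W2

# First data set: unlabeled inputs.
X = rng.normal(size=(256, 8))

# Second data set: the same inputs paired with teacher features,
# which serve as distillation targets.
T = teacher(X)

# "First model" (student): a single linear layer, i.e. a different
# architecture than the teacher. Trained by plain gradient descent.
Ws = np.zeros((8, 4))
lr = 0.01
for _ in range(500):
    pred = X @ Ws
    grad = X.T @ (pred - T) / len(X)  # mean-squared-error gradient
    Ws -= lr * grad

mse = float(np.mean((X @ Ws - T) ** 2))
```

The student cannot match the nonlinear teacher exactly, but training against the teacher-derived data set drives its error well below the trivial zero-weights baseline.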
-
Publication No.: US20250094819A1
Publication Date: 2025-03-20
Application No.: US18471184
Filing Date: 2023-09-20
Applicant: NVIDIA CORPORATION
Inventor: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ
IPC: G06N3/096, G06N3/0455
Abstract: One embodiment of the present invention sets forth a technique for executing a transformer neural network. The technique includes executing a first attention unit included in the transformer neural network to convert a first input token into a first query, a first key, and a first plurality of values, where each value included in the first plurality of values represents a sub-task associated with the transformer neural network. The technique also includes computing a first plurality of outputs associated with the first input token based on the first query, the first key, and the first plurality of values. The technique further includes performing a task associated with an input corresponding to the first input token based on the first input token and the first plurality of outputs.
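The attention variant described here gives each input token one query and one key but a plurality of values, one per sub-task, so a single set of attention weights yields one output per sub-task. A toy single-head numpy sketch under that reading (all dimensions and projections are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tokens, n_subtasks = 16, 5, 3

X = rng.normal(size=(n_tokens, d))           # input tokens
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
# One value projection per sub-task: each token is converted into a
# query, a key, and a *plurality* of values.
Wv = rng.normal(size=(n_subtasks, d, d))

Q, K = X @ Wq, X @ Wk
V = np.einsum('td,sde->ste', X, Wv)          # (subtask, token, d)

# Shared scaled dot-product attention from the single query/key pair.
scores = Q @ K.T / np.sqrt(d)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)     # softmax over keys

# A plurality of outputs per token, one for each sub-task.
outputs = np.einsum('tu,sue->tse', attn, V)  # (token, subtask, d)
```

Because the query and key are shared, the sub-tasks reuse one attention pattern and differ only in the value space, which is the economy this formulation buys.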
-
Publication No.: US20250103906A1
Publication Date: 2025-03-27
Application No.: US18471196
Filing Date: 2023-09-20
Applicant: NVIDIA CORPORATION
Inventor: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ
IPC: G06N3/0985, G06N3/0895
Abstract: One embodiment of the present invention sets forth a technique for performing meta-learning. The technique includes performing a first set of training iterations to convert a prediction learning network into a first trained prediction learning network based on a first support set of training data and executing a representation learning network and the first trained prediction learning network to generate a first set of supervised training output and a first set of self-supervised training output based on a first query set of training data corresponding to the first support set of training data. The technique also includes performing a first training iteration to convert the representation learning network into a first trained representation learning network based on a first loss associated with the first set of supervised training output and a second loss associated with the first set of self-supervised training output.
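The two-level loop can be sketched in numpy: an inner loop fits the prediction learning network on a support set, then an outer step updates the representation learning network with a combined supervised and self-supervised (here, reconstruction) loss on the query set. This is a hypothetical first-order sketch; the fixed decoder, the task distribution, and the first-order meta-gradient are assumptions, not the patented procedure:

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_h, d_out = 8, 6, 2

Wr = rng.normal(size=(d_in, d_h)) * 0.1   # representation learning network
Wd = rng.normal(size=(d_h, d_in)) * 0.1   # fixed decoder, used only by the
                                          # self-supervised reconstruction loss
W_true = rng.normal(size=(d_in, d_out))   # ground-truth task mapping

def make_task():
    """Draw a support set and a corresponding query set for one task."""
    Xs, Xq = rng.normal(size=(16, d_in)), rng.normal(size=(16, d_in))
    return Xs, Xs @ W_true, Xq, Xq @ W_true

def losses(Wr):
    Xs, Ys, Xq, Yq = make_task()
    # Inner training iterations: fit the prediction learning network
    # (Wp) on the support set with the representation frozen.
    Zs = Xs @ Wr
    Wp = np.zeros((d_h, d_out))
    for _ in range(20):
        Wp -= 0.05 * Zs.T @ (Zs @ Wp - Ys) / len(Xs)
    # Query set: supervised + self-supervised errors and a first-order
    # gradient w.r.t. Wr (the inner-loop dependence of Wp on Wr is
    # ignored, as in first-order MAML; constants folded into the lr).
    Zq = Xq @ Wr
    e_sup, e_ssl = Zq @ Wp - Yq, Zq @ Wd - Xq
    grad = (Xq.T @ (e_sup @ Wp.T) + Xq.T @ (e_ssl @ Wd.T)) / len(Xq)
    return float(np.mean(e_sup ** 2) + np.mean(e_ssl ** 2)), grad

avg = lambda Wr: float(np.mean([losses(Wr)[0] for _ in range(20)]))

loss_before = avg(Wr)
for _ in range(100):
    Wr -= 0.02 * losses(Wr)[1]            # outer update of the representation
loss_after = avg(Wr)
```

The outer loop improves the representation on both objectives at once: the reconstruction term shapes the latent space even where labels are scarce, while the supervised term keeps it useful for the inner-loop head.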
-
Publication No.: US20250095350A1
Publication Date: 2025-03-20
Application No.: US18471209
Filing Date: 2023-09-20
Applicant: NVIDIA CORPORATION
Inventor: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ
IPC: G06V10/82, G06V10/776
Abstract: One embodiment of the present invention sets forth a technique for executing a machine learning model. The technique includes performing a first set of training iterations to convert a prediction learning network into a first trained prediction learning network based on a first support set associated with a first set of classes. The technique also includes executing a first trained representation learning network to convert a first data sample into a first latent representation, where the first trained representation learning network is generated by training a representation learning network using a first query set, a first set of self-supervised losses, and a first set of supervised losses. The technique further includes executing the first trained prediction learning network to convert the first latent representation into a first prediction of a first class that is not included in the first set of classes.
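This is the inference-side counterpart of a meta-learned representation: keep the representation frozen, fit a lightweight prediction head on a support set of previously unseen classes, then classify a query sample. A numpy sketch with an assumed frozen linear representation and a softmax head (both stand-ins, not the patented networks):

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_h = 10, 6

# Stand-in for the first trained representation learning network: a frozen
# linear map (the patent trains it with query sets plus supervised and
# self-supervised losses; that training is assumed already done here).
Wr = rng.normal(size=(d_in, d_h))
rep = lambda x: x @ Wr

# Support set for two classes the representation was NOT trained on:
# each class is a Gaussian cluster around its own prototype.
protos = rng.normal(size=(2, d_in)) * 2.0
Xs = np.concatenate([p + 0.3 * rng.normal(size=(20, d_in)) for p in protos])
ys = np.repeat([0, 1], 20)
Y = np.eye(2)[ys]

# First set of training iterations: fit the prediction learning network
# (a softmax head) on the support set, with the representation frozen.
Wp = np.zeros((d_h, 2))
Z = rep(Xs)
for _ in range(200):
    L = Z @ Wp
    P = np.exp(L - L.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    Wp -= 0.1 * Z.T @ (P - Y) / len(Xs)   # cross-entropy gradient step

# Execute both networks: data sample -> latent representation -> prediction
# of a class that was never seen while training the representation.
xq = protos[1] + 0.3 * rng.normal(size=d_in)
pred = int(np.argmax(rep(xq[None]) @ Wp))
```

Only the small head is trained per episode, which is what makes adapting to novel classes cheap.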
-
Publication No.: US20250094813A1
Publication Date: 2025-03-20
Application No.: US18471204
Filing Date: 2023-09-20
Applicant: NVIDIA CORPORATION
Inventor: Wonmin BYEON, Sudarshan BABU, Shalini DE MELLO, Jan KAUTZ
IPC: G06N3/0895
Abstract: One embodiment of the present invention sets forth a technique for training a transformer neural network. The technique includes inputting a first task token and a first set of samples into the transformer neural network and training the transformer neural network using a first set of losses between predictions generated by the transformer neural network from the first task token and first set of samples as well as a first set of labels. The technique also includes converting the first task token into a second task token that is larger than the first task token, inputting the second task token and a second set of samples into the transformer neural network, and training the transformer neural network using a second set of losses between predictions generated by the transformer neural network from the second task token and the second set of samples as well as a second set of labels.
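The training schedule grows the task token between stages: train with a small token, convert it into a larger one that keeps the learned entries, extend the matching input weights, and continue training on a second set of samples. A toy numpy sketch with a linear model standing in for the transformer (the token sizes, the bias-shifted second task, and the zero-padded expansion scheme are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
d = 6

# Toy stand-in for the transformer: a linear model over [task token ; sample].
# Both the weights and the task token are trained by gradient descent.
def train(token, W, X, Y, steps=300, lr=0.05):
    for _ in range(steps):
        inp = np.concatenate([np.tile(token, (len(X), 1)), X], axis=1)
        err = inp @ W - Y
        W -= lr * inp.T @ err / len(X)
        token -= lr * (err @ W[:len(token)].T).mean(axis=0)
    return token, W

W_true = rng.normal(size=(d, 1))
X1, X2 = rng.normal(size=(64, d)), rng.normal(size=(64, d))
Y1, Y2 = X1 @ W_true, X2 @ W_true + 1.0   # second task adds a bias shift

# Stage 1: first (small) task token and first set of samples.
tok1 = np.zeros(2)
W = rng.normal(size=(2 + d, 1)) * 0.1
tok1, W = train(tok1, W, X1, Y1)

# Stage 2: convert the first token into a larger second token (learned
# entries kept, new entries appended), extend the input weights to match,
# then keep training on the second set of samples.
tok2 = np.concatenate([tok1, np.zeros(2)])
W = np.concatenate([W[:2], np.zeros((2, 1)), W[2:]], axis=0)
tok2, W = train(tok2, W, X2, Y2)

inp2 = np.concatenate([np.tile(tok2, (len(X2), 1)), X2], axis=1)
mse2 = float(np.mean((inp2 @ W - Y2) ** 2))
```

The enlarged token gives the model extra per-task capacity (here it absorbs the second task's bias shift) without discarding what the smaller token already learned.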