-
Publication number: US20230409876A1
Publication date: 2023-12-21
Application number: US17845543
Application date: 2022-06-21
Applicant: NVIDIA Corporation
Inventor: Vibhor Agrawal, Tamar Viclizki, Vadim Gechman
CPC classification number: G06N3/0454, G06N3/08
Abstract: Apparatuses, systems, and techniques to predict a probability of an error in processing units, such as those of a data center. In at least one embodiment, the probability of an error occurring in a processing unit is identified using a machine learning model trained using one or more previously trained machine learning models, in which the machine learning model is smaller than the previously trained machine learning models.
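The abstract above describes training a smaller model from previously trained models. One common way to do that is knowledge distillation, sketched below under loud assumptions: the abstract does not name the method, and `teacher_a`, `teacher_b`, the two-feature input, and every coefficient are hypothetical stand-ins, not details from the filing. The student is a three-parameter logistic model fitted to the averaged teacher probabilities.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical stand-ins for previously trained (normally much larger) models.
def teacher_a(x):
    return sigmoid(3.0 * x[0] + 1.0 * x[1] - 2.0)


def teacher_b(x):
    return sigmoid(2.0 * x[0] + 2.0 * x[1] - 2.5)


def soft_target(x):
    """Average the teachers' error probabilities into one soft label."""
    return 0.5 * (teacher_a(x) + teacher_b(x))


def student_predict(params, x):
    w0, w1, b = params
    return sigmoid(w0 * x[0] + w1 * x[1] + b)


def train_student(samples, epochs=300, lr=0.2):
    """Fit the student to the soft labels with stochastic gradient descent.

    For cross-entropy against a soft label, the gradient with respect to
    the logit is (prediction - soft label).
    """
    params = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x in samples:
            err = student_predict(params, x) - soft_target(x)
            params[0] -= lr * err * x[0]
            params[1] -= lr * err * x[1]
            params[2] -= lr * err
    return params


def make_samples(n=200, seed=0):
    """Synthetic two-feature telemetry in [0, 1), an assumption for the demo."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


if __name__ == "__main__":
    data = make_samples()
    student = train_student(data)
    print(student_predict(student, (0.9, 0.8)), soft_target((0.9, 0.8)))
```

The student never sees ground-truth labels here, only the teachers' outputs, which is the defining trait of this kind of distillation.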
-
Publication number: US20240394130A1
Publication date: 2024-11-28
Application number: US18794219
Application date: 2024-08-05
Applicant: NVIDIA Corporation
Inventor: Tamar Viclizki, Fay Wang, Divyansh Jain, Avighan Majumder, Vadim Gechman, Vibhor Agrawal
Abstract: Apparatuses, systems, and techniques to predict a probability of an error or anomaly in processing units, such as those of a data center. In at least one embodiment, the probability of an error occurring in a processing unit is identified using multiple trained machine learning models, in which the trained machine learning models each outputs, for example, the probability of an error occurring within a different predetermined time period.
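As a rough illustration of one model per predetermined time period, the sketch below labels each telemetry sample by whether an error follows within a given window, then fits one small logistic model per window. The window lengths in `HORIZONS_HOURS`, the single scalar feature, and the plain gradient-descent fit are all assumptions made for the example; the abstract specifies none of them.

```python
import math

# Hypothetical prediction windows, in hours.
HORIZONS_HOURS = (1, 24, 168)


def label_within(t, error_times, horizon):
    """Return 1 if any error occurs in the interval (t, t + horizon], else 0."""
    return int(any(t < e <= t + horizon for e in error_times))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, epochs=500, lr=0.1):
    """Fit a one-feature logistic model with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def train_per_horizon(times, features, error_times):
    """Train one model per window; each sees labels built for its own window."""
    models = {}
    for horizon in HORIZONS_HOURS:
        labels = [label_within(t, error_times, horizon) for t in times]
        models[horizon] = fit_logistic(features, labels)
    return models


def predict_all(models, x):
    """Map each window to that model's estimated error probability."""
    return {h: sigmoid(w * x + b) for h, (w, b) in models.items()}


if __name__ == "__main__":
    times = list(range(0, 200, 10))
    features = [t / 200 for t in times]  # synthetic stand-in for telemetry
    models = train_per_horizon(times, features, [95.0, 180.0])
    print(predict_all(models, 0.5))
```

Keeping the windows as separate models means each one can be retrained or thresholded independently, at the cost of no guarantee that the longer-window probability is at least the shorter-window one.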
-
Publication number: US12055995B2
Publication date: 2024-08-06
Application number: US17683191
Application date: 2022-02-28
Applicant: NVIDIA Corporation
Inventor: Tamar Viclizki, Fay Wang, Divyansh Jain, Avighan Majumder, Vadim Gechman, Vibhor Agrawal
CPC classification number: G06F11/004, G06N20/20, G06F2201/86
Abstract: Apparatuses, systems, and techniques to predict a probability of an error or anomaly in processing units, such as those of a data center. In at least one embodiment, the probability of an error occurring in a processing unit is identified using multiple trained machine learning models, in which the trained machine learning models each outputs, for example, the probability of an error occurring within a different predetermined time period.
-
Publication number: US20230297453A1
Publication date: 2023-09-21
Application number: US17683191
Application date: 2022-02-28
Applicant: NVIDIA Corporation
Inventor: Tamar Viclizki, Fay Wang, Divyansh Jain, Avighan Majumder, Vadim Gechman, Vibhor Agrawal
CPC classification number: G06F11/004, G06N20/20, G06F2201/86
Abstract: Apparatuses, systems, and techniques to predict a probability of an error or anomaly in processing units, such as those of a data center. In at least one embodiment, the probability of an error occurring in a processing unit is identified using multiple trained machine learning models, in which the trained machine learning models each outputs, for example, the probability of an error occurring within a different predetermined time period.
-