-
Publication number: US11863390B1
Publication date: 2024-01-02
Application number: US17888999
Application date: 2022-08-16
Applicant: Nvidia Corporation
Inventor: Miriam Menes, Eitan Zahavi, Gil Bloch, Ahmad Atamli, Meni Orenbach, Mark Hummel, Glenn Dearth
IPC: G06F15/177, H04L41/0873, H04L45/488
CPC classification number: H04L41/0873, H04L45/488
Abstract: Apparatuses, systems, and techniques are presented to configure computing resources to perform various tasks. In at least one embodiment, an approach presented herein can be used to verify whether a network of computing nodes is properly configured based, at least in part, on one or more expected data strings generated by the network of computing nodes.
-
Publication number: US20240406058A1
Publication date: 2024-12-05
Application number: US18629132
Application date: 2024-04-08
Applicant: Nvidia Corporation
Inventor: Elad Alon, Eitan Zahavi, Gaby Diengott, Shie Mannor, Vadim Gechman
IPC: H04L41/0659, H04L41/147, H04L43/06, H04L43/0811
Abstract: A network monitor may execute, or communicate with, one or more stored machine learning models that are trained to predict a failure probability for one or more ports and/or links within a network fabric. Systems and methods may monitor a set of ports and/or links to generate predictions for failure probabilities using a first trained model and low frequency telemetry data. For a subset of ports and/or links with failure probabilities exceeding a first threshold, high speed telemetry data may be used by a second trained model to generate predictions for failure probabilities for the subset of ports. Suspicious ports may then be isolated and undergo various remediation and/or monitoring actions prior to de-isolating the isolated ports.
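The two-stage screening described in this abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function name `two_stage_screen`, the `Telemetry` and `Model` types, both threshold defaults, and the use of a second threshold to decide isolation are all assumptions that do not appear in the abstract.

```python
from typing import Callable, Dict, List

# Hypothetical types: a telemetry record maps counter names to values, and a
# "model" is any callable returning a failure probability in [0, 1].
Telemetry = Dict[str, float]
Model = Callable[[Telemetry], float]


def two_stage_screen(
    ports: List[str],
    low_freq: Dict[str, Telemetry],
    high_speed: Dict[str, Telemetry],
    coarse_model: Model,
    fine_model: Model,
    first_threshold: float = 0.5,    # assumed value
    isolate_threshold: float = 0.8,  # assumed: the abstract names only a first threshold
) -> List[str]:
    """Return the ports flagged as suspicious (candidates for isolation)."""
    # Stage 1: score every port using low-frequency telemetry.
    candidates = [
        p for p in ports if coarse_model(low_freq[p]) > first_threshold
    ]
    # Stage 2: rescore only the candidate subset using high-speed telemetry.
    return [
        p for p in candidates if fine_model(high_speed[p]) > isolate_threshold
    ]
```

Because the second model runs only on the candidate subset, high-speed telemetry needs to be collected only for ports that the first model already considers risky.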
-
Publication number: US12206748B2
Publication date: 2025-01-21
Application number: US17958139
Application date: 2022-09-30
Applicant: NVIDIA Corporation
Inventor: Siddha Ganju, Elad Mentovich, Michael Balint, Eitan Zahavi, Michael Sabotta, Michael Norman, Ryan Wells
Abstract: A method includes receiving, using a processing device, a first condition associated with an operation at a data center, where the operation at the data center pertains to a first location at the data center, the first location corresponding to a first parameter value. The method further includes providing the first condition as an input to a machine learning model. The method also includes performing one or more reinforcement learning techniques using the machine learning model to cause the machine learning model to output an indication of a final location associated with the operation, where the final location corresponds to a final parameter value that is closer to a target than the first parameter value corresponding to the first location at the data center.
-
Publication number: US20240129380A1
Publication date: 2024-04-18
Application number: US17958139
Application date: 2022-09-30
Applicant: NVIDIA Corporation
Inventor: Siddha Ganju, Elad Mentovich, Michael Balint, Eitan Zahavi, Michael Sabotta, Michael Norman, Ryan Wells
CPC classification number: H04L67/60, G06F11/3062, H04L41/16
Abstract: A method includes receiving, using a processing device, a first condition associated with an operation at a data center, where the operation at the data center pertains to a first location at the data center, the first location corresponding to a first parameter value. The method further includes providing the first condition as an input to a machine learning model. The method also includes performing one or more reinforcement learning techniques using the machine learning model to cause the machine learning model to output an indication of a final location associated with the operation, where the final location corresponds to a final parameter value that is closer to a target than the first parameter value corresponding to the first location at the data center.
-