1.
Publication number: US09064216B2
Publication date: 2015-06-23
Application number: US13842909
Filing date: 2013-03-15
Applicant: Juniper Networks, Inc.
Inventor: Rajeshekar Reddy, Harshad Bhaskar Nakil
CPC classification number: G06N99/005, G06F11/008, H04L41/0631, H04L41/147, H04L43/04, H04L43/0852, H04L43/10, H04L45/16, H04L45/38, H04L45/42, H04L45/48, H04L45/586, H04L61/103, H04L69/40
Abstract: In general, techniques are described for automatically identifying likely faulty components in massively distributed complex systems. In some examples, snapshots of component parameters are automatically repeatedly fed to a pre-trained classifier and the classifier indicates whether each received snapshot is likely to belong to a fault and failure class or to a non-fault/failure class. Components whose snapshots indicate a high likelihood of fault or failure are investigated, restarted or taken off line as a pre-emptive measure. The techniques may be applied in a massively distributed complex system such as a data center.
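Illustrative sketch (not taken from the patent): the loop described in the abstract, where component parameter snapshots are scored by a pre-trained classifier and high-scoring components are flagged, might look roughly like the following. The abstract does not specify a model type, feature set, or threshold, so the logistic weights, the parameter names (`cpu_load`, `error_rate`, `latency_ms`), the 0.8 cut-off, and the component names are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

# Hypothetical weights standing in for a pre-trained classifier; the abstract
# does not say which model, features, or threshold are used.
WEIGHTS = {"cpu_load": 2.0, "error_rate": 4.0, "latency_ms": 0.01}
BIAS = -4.0
FAULT_THRESHOLD = 0.8  # assumed cut-off for "high likelihood of fault or failure"


@dataclass
class Snapshot:
    component: str  # hypothetical component name
    params: dict    # parameter name -> sampled value


def fault_probability(snapshot):
    """Logistic score: estimated probability of the fault/failure class."""
    z = BIAS + sum(WEIGHTS[k] * snapshot.params.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def likely_faulty(snapshots, threshold=FAULT_THRESHOLD):
    """Return names of components whose snapshot scores at or above the threshold."""
    return [s.component for s in snapshots if fault_probability(s) >= threshold]


# One round of snapshots; in practice this would be repeated on a schedule.
snapshots = [
    Snapshot("vrouter-1", {"cpu_load": 0.2, "error_rate": 0.0, "latency_ms": 5}),
    Snapshot("vrouter-2", {"cpu_load": 0.95, "error_rate": 0.9, "latency_ms": 400}),
]
flagged = likely_faulty(snapshots)
print(flagged)  # components to investigate, restart, or take offline
```

With these assumed weights only the second component crosses the threshold. A real deployment would replace `fault_probability` with whatever trained model is available and would tune the threshold against the cost of unnecessary restarts.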