Method and apparatus for managing redundant computer-based systems for fault tolerant computing
    1.
    Granted invention patent
    Method and apparatus for managing redundant computer-based systems for fault tolerant computing (status: in force)

    Publication No.: US06178522B1

    Publication date: 2001-01-23

    Application No.: US09140174

    Filing date: 1998-08-25

    IPC classification: H02H3/05

    Abstract: A stand-alone Redundancy Management System (RMS) provides a cost-effective solution for managing redundant computer-based systems in order to achieve ultra-high system reliability, safety, fault tolerance, and mission success rate. The RMS includes a Cross Channel Data Link (CCDL) module and a Fault Tolerant Executive (FTE) module. The CCDL module provides data communication for all channels, while the FTE module performs system functions such as synchronization, data voting, fault and error detection, isolation, and recovery. System fault tolerance is achieved by detecting and masking erroneous data through data voting, and system integrity is ensured by a dynamically reconfigurable architecture that is capable of excluding faulty nodes from the system and re-admitting healthy nodes back into the system.

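The voting and reconfiguration behaviour summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `vote` and `ChannelSet`, the strict-majority rule, and the thresholds `exclude_after` and `readmit_after` are all assumptions introduced here for illustration. The sketch masks a disagreeing channel by majority vote, excludes a channel after repeated disagreement, and re-admits it after repeated agreement.

```python
from collections import Counter
from typing import Dict, Hashable, Optional, Set, Tuple


def vote(values: Dict[str, Hashable]) -> Tuple[Optional[Hashable], Set[str]]:
    """Return (strict-majority value or None, channels that disagreed with it)."""
    if not values:
        return None, set()
    winner, count = Counter(values.values()).most_common(1)[0]
    if count * 2 <= len(values):  # no strict majority, so no voted value
        return None, set()
    return winner, {ch for ch, v in values.items() if v != winner}


class ChannelSet:
    """Hypothetical bookkeeping for excluding and re-admitting channels."""

    def __init__(self, channels, exclude_after=2, readmit_after=3):
        self.active = set(channels)
        self.excluded = set()
        self.exclude_after = exclude_after  # assumed threshold, not from the patent
        self.readmit_after = readmit_after  # assumed threshold, not from the patent
        self.bad = {ch: 0 for ch in channels}   # consecutive disagreements
        self.good = {ch: 0 for ch in channels}  # consecutive agreements

    def step(self, readings: Dict[str, Hashable]) -> Optional[Hashable]:
        # Only channels currently in the operating set take part in the vote.
        voted, _ = vote({c: v for c, v in readings.items() if c in self.active})
        if voted is None:
            return None
        # Every reporting channel, active or excluded, is scored against the result.
        for ch, value in readings.items():
            if value == voted:
                self.good[ch] += 1
                self.bad[ch] = 0
            else:
                self.bad[ch] += 1
                self.good[ch] = 0
            if ch in self.active and self.bad[ch] >= self.exclude_after:
                self.active.discard(ch)
                self.excluded.add(ch)
            elif ch in self.excluded and self.good[ch] >= self.readmit_after:
                self.excluded.discard(ch)
                self.active.add(ch)
        return voted
```

With three channels, a single faulty value is outvoted on every frame, so the consumer of the voted data never sees it; the exclusion and re-admission counters only change which channels are allowed to vote on later frames.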