Network device for supporting multiple redundancy schemes
    2.
    Granted invention patent
    Network device for supporting multiple redundancy schemes (status: in force)

    Publication No.: US06332198B1

    Publication date: 2001-12-18

    Application No.: US09591193

    Filing date: 2000-06-09

    IPC classification: G06F1300

    Abstract: The present invention provides a method and apparatus for supporting multiple redundancy schemes in a single network device. In one network device, various redundancy schemes are supported including 1:1, 1+1, 1:N, no redundancy or a combination of redundancy schemes. In addition, the redundancy scheme or schemes for physical network device cards (i.e., universal port cards) or ports may be different from the redundancy scheme or schemes for forwarding network device cards. For example, a network manager may want to provide 1:1 or 1+1 redundancy for all universal port cards and/or ports but only 1:N redundancy for each N group of forwarding cards. As another example, the network manager may provide certain customers with 1:1 redundancy on both universal port cards (or ports) and forwarding cards to ensure that customer's network availability while providing other customers, with lower availability requirements, with various other redundancy scheme combinations, for example, 1:1, 1+1, 1:N or no redundancy for port cards or ports and 1:N or no redundancy for forwarding cards. The present invention allows customers having different availability/redundancy needs to be serviced by the same network device.
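
    The mix of schemes described in this abstract can be illustrated with a minimal sketch. It assumes the common meaning of the scheme names (1:1 and 1+1 dedicate one backup card to each working card, 1:N shares one backup among a group of N working cards). The plan names, the function names, and the default group size are hypothetical and do not come from the patent.

```python
from enum import Enum
from math import ceil


class Scheme(Enum):
    NONE = "none"
    ONE_TO_ONE = "1:1"
    ONE_PLUS_ONE = "1+1"
    ONE_TO_N = "1:N"


def backups_needed(scheme, primaries, n=1):
    """Backup cards required to protect `primaries` working cards."""
    if scheme is Scheme.NONE:
        return 0
    if scheme in (Scheme.ONE_TO_ONE, Scheme.ONE_PLUS_ONE):
        # Assumption: one dedicated backup per working card.
        return primaries
    # 1:N: one shared backup per group of up to n working cards.
    return ceil(primaries / n)


# Hypothetical per-customer plans: (port card scheme, forwarding card scheme).
PLANS = {
    "high_availability": (Scheme.ONE_TO_ONE, Scheme.ONE_TO_ONE),
    "standard": (Scheme.ONE_PLUS_ONE, Scheme.ONE_TO_N),
    "best_effort": (Scheme.NONE, Scheme.NONE),
}


def plan_backups(plan, port_cards, forwarding_cards, n=4):
    """Return (port card backups, forwarding card backups) for a plan."""
    port_scheme, fwd_scheme = PLANS[plan]
    return (backups_needed(port_scheme, port_cards, n),
            backups_needed(fwd_scheme, forwarding_cards, n))
```

    Under these assumptions, a "standard" customer with 8 port cards and 8 forwarding cards would need 8 backup port cards but only 2 backup forwarding cards, which is the kind of trade-off the abstract describes.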


    Distributed process redundancy
    3.
    Granted invention patent
    Distributed process redundancy (status: in force)

    Publication No.: US06694450B1

    Publication date: 2004-02-17

    Application No.: US09574439

    Filing date: 2000-05-20

    IPC classification: G06F1100

    Abstract: A distributed software redundancy design is disclosed to minimize network outages and other problems associated with component/process failures by spreading software backup (in the so-called “hot state”) across multiple elements. The distributed redundancy architecture of the present invention also permits the location of the hardware backup element to float, that is, if a primary element fails, the functions can be transferred over to the backup element. When the failed primary element is replaced, the replacement hardware can serve as the hardware backup. If one or more of the primary processes on a particular element experiences a software fault, the processor on the line card may terminate and restart the failing process or processes. Once the process or processes are restarted, a copy of the last known dynamic state (i.e., the backup state) can be retrieved from corresponding backup processes executing on a second line card, and an audit process can be initiated to synchronize the retrieved state with the dynamic state of associated other processes.
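
    The restart sequence in this abstract (restart, retrieve the backup state, audit against other processes) can be sketched as follows. This is only an illustration: the class names, the dictionary-based state, and the audit rule (the peers' view wins) are assumptions, not details taken from the patent.

```python
class BackupProcess:
    """Runs on a second line card and mirrors the primary's dynamic state."""

    def __init__(self):
        self._state = {}

    def checkpoint(self, state):
        self._state = dict(state)

    def last_known_state(self):
        return dict(self._state)


class PrimaryProcess:
    def __init__(self, backup):
        self.backup = backup
        self.state = {}

    def update(self, key, value):
        # Every change is mirrored to the backup, keeping it "hot".
        self.state[key] = value
        self.backup.checkpoint(self.state)

    def restart(self, peer_view):
        """Recover after a software fault and audit against peers.

        `peer_view` is what associated processes currently believe.
        Assumed audit rule: entries unknown to peers are dropped and
        the peers' values take precedence.
        """
        recovered = self.backup.last_known_state()
        stale = set(recovered) - set(peer_view)
        missing = set(peer_view) - set(recovered)
        self.state = {k: v for k, v in recovered.items() if k not in stale}
        self.state.update(peer_view)
        return stale, missing
```

    The returned `stale` and `missing` sets show what changed between the last checkpoint and the restart, which is the information an audit step would need to report.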


    Hierarchical fault management in computer systems
    5.
    Granted invention patent
    Hierarchical fault management in computer systems (status: in force)

    Publication No.: US06715097B1

    Publication date: 2004-03-30

    Application No.: US09574436

    Filing date: 2000-05-20

    IPC classification: G06F1100

    Abstract: Computer systems and methods of data processing are disclosed in which hierarchical levels of fault/event management are provided that intelligently monitor hardware and software and proactively take action in accordance with a defined fault policy. A fault policy based on a defined hierarchy ensures that for each particular type of failure the most appropriate action is taken. In one embodiment, a master Software Resiliency Manager (SRM) serves as the top hierarchical level fault/event manager, with one or more slave SRMs serving as the next hierarchical level fault/event manager. The software applications resident on each board can also include sub-processes (e.g., local resiliency managers or LRMs) that serve as the lowest hierarchical level fault/event managers.
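
    One plausible reading of the three-level hierarchy (LRM, slave SRM, master SRM) is that a fault is handled at the lowest level able to deal with it and is otherwise passed upward. The sketch below illustrates that reading only. The escalation rule, the `FaultManager` class, and the fault names are assumptions and are not stated in the abstract.

```python
class FaultManager:
    """A fault/event manager at one level of the hierarchy."""

    def __init__(self, name, handles, parent=None):
        self.name = name
        self.handles = set(handles)  # fault types this level acts on
        self.parent = parent         # next level up, or None at the top

    def report(self, fault):
        """Return the name of the manager that handles `fault`."""
        if fault in self.handles:
            return self.name
        if self.parent is None:
            raise LookupError(f"no manager handles {fault!r}")
        return self.parent.report(fault)


# Hypothetical fault types assigned to each level.
master_srm = FaultManager("master SRM", {"board_failure"})
slave_srm = FaultManager("slave SRM", {"process_crash"}, parent=master_srm)
lrm = FaultManager("LRM", {"thread_stall"}, parent=slave_srm)
```

    With this setup, `lrm.report("thread_stall")` is resolved locally, while `lrm.report("board_failure")` travels through the slave SRM to the master SRM.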


    Hierarchical fault descriptors in computer systems
    7.
    Granted invention patent
    Hierarchical fault descriptors in computer systems (status: in force)

    Publication No.: US06708291B1

    Publication date: 2004-03-16

    Application No.: US09574340

    Filing date: 2000-05-20

    Applicant: Joseph D. Kidder

    Inventor: Joseph D. Kidder

    IPC classification: G06F1100

    Abstract: Computer systems and methods of data processing are disclosed in which hierarchical descriptors define levels of fault/event management to intelligently monitor hardware and software and proactively take action in accordance with a defined fault policy. A fault policy based on a defined hierarchy ensures that for each particular type of failure the most appropriate action is taken. Hierarchical descriptors can be used to provide information specific to each failure or event. The hierarchical descriptors provide granularity with which to report faults, take action based on fault history and apply fault recovery policies. The descriptors can be stored in a master event log file or local event log files through which faults and events may be tracked and displayed to the user and allow for fault detection at a fine granular level and proactive response to events. In addition, the descriptors can be matched with descriptors in a fault policy to determine the recovery action to be taken.
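
    Matching a fault descriptor against descriptors in a fault policy could look like the sketch below. It assumes a descriptor is a path from general to specific, represented as a tuple, and that the most specific matching policy entry wins. The descriptor fields, the policy entries, and the action names are hypothetical.

```python
# Hypothetical policy: descriptor prefix -> recovery action.
POLICY = {
    ("software",): "restart_process",
    ("software", "atm", "memory_leak"): "restart_board",
    ("hardware", "port_card"): "switch_to_backup",
}


def recovery_action(descriptor, policy, default="log_only"):
    """Return the action of the longest policy prefix matching `descriptor`."""
    for length in range(len(descriptor), 0, -1):
        action = policy.get(descriptor[:length])
        if action is not None:
            return action
    return default
```

    A descriptor such as `("software", "atm", "memory_leak")` matches its exact policy entry, while `("software", "mpls", "crash")` falls back to the broader `("software",)` entry. Trying longer prefixes first is what lets the policy stay coarse by default and fine-grained only where needed.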


    Maintaining a local backup for data plane processes
    8.
    Granted invention patent
    Maintaining a local backup for data plane processes (status: in force)

    Publication No.: US06742134B1

    Publication date: 2004-05-25

    Application No.: US09574965

    Filing date: 2000-05-20

    IPC classification: G06F1100

    Abstract: The present invention provides a computer system having a control process and a device driver process that is in communication with the control process, and a local back-up process, independent of both the control process and the device driver process. The local back-up process facilitates recovery of the device driver process. In one aspect of the invention, the computer system is a network device that includes a control plane and a data plane. The control plane includes a control process, and the data plane includes a device driver process. A local back-up process, independent of both the control process and the device driver process, facilitates recovery of the device driver process if the device driver process is terminated.
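
    The relationship among the three processes can be sketched as three independent objects. This is an illustration under assumptions: the class names, the key/value table, and the idea that the driver mirrors each change into the local backup are hypothetical and are not specified by the abstract.

```python
class LocalBackup:
    """Independent of both the control process and the device driver."""

    def __init__(self):
        self.entries = {}

    def record(self, key, value):
        self.entries[key] = value

    def snapshot(self):
        return dict(self.entries)


class DeviceDriver:
    """Data plane process; rebuilds its table from the local backup."""

    def __init__(self, backup):
        self.backup = backup
        self.table = backup.snapshot()  # empty on first start

    def program(self, key, value):
        self.table[key] = value
        self.backup.record(key, value)


class ControlProcess:
    """Control plane process; counts the messages it sends to the driver."""

    def __init__(self, driver):
        self.driver = driver
        self.sent = 0

    def configure(self, key, value):
        self.driver.program(key, value)
        self.sent += 1
```

    If the driver is terminated and a new `DeviceDriver(backup)` is created, its table is restored from the local backup and `ControlProcess.sent` does not change, meaning the control process did not have to resend anything.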
