-
Publication No.: US06654903B1
Publication date: 2003-11-25
Application No.: US09574440
Filing date: 2000-05-20
Applicants: Daniel J. Sullivan, Jr.; Terrence S. Pearson; Barbara A. Fox; Joseph D. Kidder; Umesh Bhatt
Inventors: Daniel J. Sullivan, Jr.; Terrence S. Pearson; Barbara A. Fox; Joseph D. Kidder; Umesh Bhatt
IPC classification: G06F11/00
CPC classification: H04L7/0008, H04L29/06, H04L41/0803, H04L41/22, H04L45/50, H04L69/18
Abstract: The invention provides a method for fault isolation in a computer system, such as a network device. The method calls for providing a plurality of modular processes and forming groups, based on hardware in the computer system, of one or more of the plurality of modular processes. A fault within a group is detected, and recovery from the detected fault is accomplished without affecting processes or hardware in other groups.
-
Publication No.: US06332198B1
Publication date: 2001-12-18
Application No.: US09591193
Filing date: 2000-06-09
Applicants: Corey Simons; Terrence S. Pearson; Chris R. Noel; Joseph D. Kidder; Brian Branscomb; Nicholas A. Langrind; Daniel J. Sullivan; Barbara A. Fox
Inventors: Corey Simons; Terrence S. Pearson; Chris R. Noel; Joseph D. Kidder; Brian Branscomb; Nicholas A. Langrind; Daniel J. Sullivan; Barbara A. Fox
IPC classification: G06F13/00
CPC classification: H04L7/0008, G06F1/14, H04J3/0685, H04L29/06, H04L41/0816, H04L41/082, H04L41/0859, H04L41/0863, H04L41/22, H04L45/50, H04L69/18
Abstract: The present invention provides a method and apparatus for supporting multiple redundancy schemes in a single network device. In one network device, various redundancy schemes are supported, including 1:1, 1+1, 1:N, no redundancy, or a combination of redundancy schemes. In addition, the redundancy scheme or schemes for physical network device cards (i.e., universal port cards) or ports may be different from the redundancy scheme or schemes for forwarding network device cards. For example, a network manager may want to provide 1:1 or 1+1 redundancy for all universal port cards and/or ports but only 1:N redundancy for each group of N forwarding cards. As another example, the network manager may provide certain customers with 1:1 redundancy on both universal port cards (or ports) and forwarding cards to ensure those customers' network availability, while providing other customers that have lower availability requirements with various other redundancy scheme combinations, for example, 1:1, 1+1, 1:N or no redundancy for port cards or ports and 1:N or no redundancy for forwarding cards. The present invention allows customers having different availability/redundancy needs to be serviced by the same network device.
-
Publication No.: US06694450B1
Publication date: 2004-02-17
Application No.: US09574439
Filing date: 2000-05-20
Applicants: Joseph D. Kidder; Nicholas A. Langrind; Daniel J. Sullivan, Jr.; Barbara A. Fox; Richard L. Whitesel
Inventors: Joseph D. Kidder; Nicholas A. Langrind; Daniel J. Sullivan, Jr.; Barbara A. Fox; Richard L. Whitesel
IPC classification: G06F11/00
CPC classification: G06F11/1438, G06F11/1482, G06F11/202, Y10S707/99953, Y10S707/99955
Abstract: A distributed software redundancy design is disclosed to minimize network outages and other problems associated with component/process failures by spreading software backup (in the so-called “hot state”) across multiple elements. The distributed redundancy architecture of the present invention also permits the location of the hardware backup element to float; that is, if a primary element fails, its functions can be transferred to the backup element. When the failed primary element is replaced, the replacement hardware can serve as the hardware backup. If one or more of the primary processes on a particular element experiences a software fault, the processor on the line card may terminate and restart the failing process or processes. Once the process or processes are restarted, a copy of the last known dynamic state (i.e., the backup state) can be retrieved from corresponding backup processes executing on a second line card, and an audit process can be initiated to synchronize the retrieved state with the dynamic state of associated other processes.
-
Publication No.: US06983362B1
Publication date: 2006-01-03
Application No.: US09574352
Filing date: 2000-05-20
IPC classification: G06F15/177
CPC classification: G06F11/0709, G06F11/0793, G06F11/1438, G06F11/20, G06F11/202, H04L41/0213, H04L41/0631, H04L41/0672, H04L41/0893, H04L69/40, Y10S707/99932, Y10S707/99936, Y10S707/99953
Abstract: Computer systems and methods of data processing are disclosed in which fault/event management is carried out in accordance with a configurable fault recovery policy. In addition, computer systems and methods of data processing are disclosed in which hierarchical levels of fault management (or more generally “event” management) are provided in accordance with the configurable fault policy.
-
Publication No.: US06715097B1
Publication date: 2004-03-30
Application No.: US09574436
Filing date: 2000-05-20
IPC classification: G06F11/00
CPC classification: G06F11/0775, G06F11/0709, G06F11/0715, G06F11/0784, G06F11/0793
Abstract: Computer systems and methods of data processing are disclosed in which hierarchical levels of fault/event management are provided that intelligently monitor hardware and software and proactively take action in accordance with a defined fault policy. A fault policy based on a defined hierarchy ensures that for each particular type of failure the most appropriate action is taken. In one embodiment, a master Software Resiliency Manager (SRM) serves as the top hierarchical level fault/event manager, with one or more slave SRMs serving as the next hierarchical level fault/event manager. The software applications resident on each board can also include sub-processes (e.g., local resiliency managers or LRMs) that serve as the lowest hierarchical level fault/event managers.
-
Publication No.: US06880086B2
Publication date: 2005-04-12
Application No.: US09777468
Filing date: 2001-02-05
CPC classification: H04L41/22, G06F1/14, H04J3/0685, H04L7/0008, H04L29/06, H04L41/082, H04L41/0843, H04L41/0856, H04L41/0866, H04L41/0889, H04L45/50, H04L63/102, H04L63/105, H04L63/12, H04L69/18
Abstract: The present invention provides a method and apparatus for facilitating hot upgrades of software components within a telecommunications network device through the use of “signatures” generated by a signature generating program. After installation of a new software release within the network device, only those software components whose signatures do not match the signatures of corresponding and currently executing software components are upgraded. Signatures promote hot upgrades by identifying only those software components that need to be upgraded. Since signatures are automatically generated for each software component as part of putting together a new release, a quick comparison of two signatures provides accurate assurance of whether the software component has changed. Thus, signatures provide a quick, easy way to accurately determine the upgrade status of each software component.
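Illustrative sketch (not taken from the patent): the signature comparison described in this abstract might look roughly like the following. The abstract does not say how signatures are computed, so the SHA-256 digest, the function names, and the component names are all assumptions made for illustration.

```python
import hashlib


def signature(component_image: bytes) -> str:
    """Hypothetical signature: a SHA-256 digest of the component's image.

    The abstract only says a "signature generating program" is used;
    the choice of hash here is an assumption.
    """
    return hashlib.sha256(component_image).hexdigest()


def components_to_upgrade(running: dict[str, str],
                          new_release: dict[str, str]) -> list[str]:
    """Return the components in the new release whose signature does not
    match the signature of the currently executing component."""
    return sorted(
        name
        for name, new_sig in new_release.items()
        if running.get(name) != new_sig  # missing or different => upgrade
    )


# Hypothetical component names and images.
running = {
    "atm_driver": signature(b"atm driver, build 1"),
    "mpls_app": signature(b"mpls app, build 1"),
}
new_release = {
    "atm_driver": signature(b"atm driver, build 1"),  # unchanged
    "mpls_app": signature(b"mpls app, build 2"),      # changed
}

print(components_to_upgrade(running, new_release))
```

Only components with mismatched signatures are selected, so unchanged components keep running during the upgrade.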
-
Publication No.: US06708291B1
Publication date: 2004-03-16
Application No.: US09574340
Filing date: 2000-05-20
Applicant: Joseph D. Kidder
Inventor: Joseph D. Kidder
IPC classification: G06F11/00
CPC classification: G06F11/0775, G06F11/0709, G06F11/0784, G06F11/0793, G06F11/327
Abstract: Computer systems and methods of data processing are disclosed in which hierarchical descriptors define levels of fault/event management to intelligently monitor hardware and software and proactively take action in accordance with a defined fault policy. A fault policy based on a defined hierarchy ensures that for each particular type of failure the most appropriate action is taken. Hierarchical descriptors can be used to provide information specific to each failure or event. The hierarchical descriptors provide granularity with which to report faults, take action based on fault history, and apply fault recovery policies. The descriptors can be stored in a master event log file or local event log files, through which faults and events may be tracked and displayed to the user, allowing fault detection at a fine-grained level and proactive response to events. In addition, the descriptors can be matched with descriptors in a fault policy to determine the recovery action to be taken.
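Illustrative sketch (not taken from the patent): one way to match a hierarchical descriptor against descriptors in a fault policy is a most-specific-prefix lookup. The tuple representation, the policy entries, the action names, and the default action below are all hypothetical; the abstract does not specify a descriptor format or matching rule.

```python
def recovery_action(descriptor: tuple, policy: dict,
                    default: str = "log_only") -> str:
    """Return the action of the most specific policy entry that is a
    prefix of the fault descriptor, or the default if none matches."""
    # Try the full descriptor first, then progressively shorter prefixes.
    for depth in range(len(descriptor), 0, -1):
        action = policy.get(descriptor[:depth])
        if action is not None:
            return action
    return default


# Hypothetical fault policy keyed by descriptor prefixes.
POLICY = {
    ("software",): "restart_process",
    ("software", "device_driver"): "restart_from_local_backup",
    ("hardware", "line_card"): "fail_over_to_backup_card",
}

# A driver fault matches the more specific two-level entry.
print(recovery_action(("software", "device_driver", "null_pointer"), POLICY))
# Any other software fault falls back to the one-level entry.
print(recovery_action(("software", "routing", "timeout"), POLICY))
```

Keying the policy by prefixes lets a coarse entry cover a whole class of faults while finer entries override it for specific cases.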
-
Publication No.: US06742134B1
Publication date: 2004-05-25
Application No.: US09574965
Filing date: 2000-05-20
IPC classification: G06F11/00
CPC classification: G06F11/0793, G06F8/65, G06F11/0709, G06F11/0715, G06F11/0775, G06F11/0784, G06F11/1433, G06F11/1438, G06F11/1441, G06F11/1482, G06F11/20
Abstract: The present invention provides a computer system having a control process, a device driver process that is in communication with the control process, and a local back-up process that is independent of both the control process and the device driver process. The local back-up process facilitates recovery of the device driver process. In one aspect of the invention, the computer system is a network device that includes a control plane and a data plane. The control plane includes a control process, and the data plane includes a device driver process. A local back-up process, independent of both the control process and the device driver process, facilitates recovery of the device driver process if the device driver process is terminated.