Timer-based apparatus and method for fault-tolerant booting of a storage controller
    1.
    Granted invention patent (status: in force)

    Publication No.: US07523350B2

    Publication date: 2009-04-21

    Application No.: US11140106

    Filing date: 2005-05-27

    IPC classification: G06F11/00

    CPC classification: G06F11/1417

    Abstract: A fault-tolerant storage controller having a processor, redundant copies of a stored program, and a timer that automatically runs when the processor is reset is disclosed. Selection logic selects a first copy of the program to boot on the processor. If the timer expires before the first copy successfully boots, the timer resets the processor and re-enables itself to run again. This time, selection logic selects a second copy of the stored program. In one embodiment, the program comprises separate loader and application programs, each having a redundant copy. The loader re-enables the timer when jumping to the first copy of the application code. If the timer expires before the first application copy successfully boots, the timer resets the processor and re-enables itself to run again. This time, the loader selects a second copy of the application program. In one embodiment, the redundant copies are stored in separate FLASH devices; in another, in distinct regions of the same FLASH device.
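
The boot sequence in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the names `ProgramCopy`, `boot_stage`, and `boot_controller`, the tick-based timeout, and the decision to give up once every copy has failed are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProgramCopy:
    name: str
    # Ticks this copy needs to finish booting; None models a copy that hangs.
    boot_ticks: Optional[int]


def boot_stage(copies: List[ProgramCopy], timeout_ticks: int) -> Tuple[Optional[str], int]:
    """Try each copy in order; return (booted copy name or None, timer-driven resets)."""
    resets = 0
    for copy in copies:
        # The timer starts running automatically whenever the processor is reset.
        booted = copy.boot_ticks is not None and copy.boot_ticks <= timeout_ticks
        if booted:
            # A successful boot happens before the timer expires.
            return copy.name, resets
        # Timer expired: it resets the processor and re-enables itself,
        # and the selection logic moves on to the next copy.
        resets += 1
    # Assumption: report failure once every copy has timed out.
    return None, resets


def boot_controller(loaders: List[ProgramCopy], applications: List[ProgramCopy],
                    timeout_ticks: int) -> Tuple[Optional[str], Optional[str], int]:
    """Two-stage boot: loader first, then application, each with redundant copies."""
    loader, loader_resets = boot_stage(loaders, timeout_ticks)
    if loader is None:
        return None, None, loader_resets
    # The loader re-enables the timer when it jumps to the application code.
    # Simplification: re-running the loader after each application reset is not modeled.
    app, app_resets = boot_stage(applications, timeout_ticks)
    return loader, app, loader_resets + app_resets


loaders = [ProgramCopy("loader-A", 2), ProgramCopy("loader-B", 2)]
applications = [ProgramCopy("app-A", None), ProgramCopy("app-B", 4)]
result = boot_controller(loaders, applications, timeout_ticks=5)
print(result)  # ('loader-A', 'app-B', 1)
```

In this run the first application copy hangs, the timer expires once, and the second application copy boots within the timeout.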

    Storage system with automatic redundant code component failure detection, notification, and repair
    2.
    Granted invention patent (status: in force)

    Publication No.: US07711989B2

    Publication date: 2010-05-04

    Application No.: US11279376

    Filing date: 2006-04-11

    IPC classification: G06F11/07

    Abstract: A RAID system includes a non-volatile memory storing a first program and first and second copies of a second program, and a processor executing the first program. The first program detects that the first copy of the second program has failed and repairs the failed first copy in the non-volatile memory using the second copy. The failures may be detected at boot time or during normal operation of the controller. In one embodiment, the failure is detected via a CRC check. In one embodiment, the controller repairs the failed copy by copying the good copy to the location of the failed copy. In one embodiment, the system includes multiple controllers, each having its own processor and non-volatile memory and program that detects and repairs failed program copies. The programs include a loader, an application, FPGA code, CPLD code, and a program for execution by a power supply microcontroller.
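
The detect-and-repair step in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract says only "a CRC check", so the use of CRC-32 via Python's `zlib.crc32`, the separately stored expected CRC, the dictionary standing in for non-volatile memory, and the names `crc_ok` and `repair_copies` are all hypothetical.

```python
import zlib
from typing import Dict


def crc_ok(image: bytes, expected_crc: int) -> bool:
    """Return True if the image's CRC-32 matches the stored value (assumed scheme)."""
    return zlib.crc32(image) == expected_crc


def repair_copies(flash: Dict[str, bytes], expected_crc: int,
                  first: str = "copy1", second: str = "copy2") -> str:
    """Check both redundant copies and overwrite a failed copy with the good one."""
    first_ok = crc_ok(flash[first], expected_crc)
    second_ok = crc_ok(flash[second], expected_crc)
    if first_ok and second_ok:
        return "ok"
    if second_ok:
        # Repair by copying the good copy to the location of the failed copy.
        flash[first] = flash[second]
        return "repaired " + first
    if first_ok:
        flash[second] = flash[first]
        return "repaired " + second
    # Assumption: with no good copy left, the failure can only be reported.
    return "unrecoverable"


good_image = b"application image v1"
stored_crc = zlib.crc32(good_image)

# Hypothetical flash contents: the first copy is corrupted, the second is intact.
flash = {"copy1": b"application imagE v1", "copy2": good_image}
status = repair_copies(flash, stored_crc)
print(status)
```

The same check could run at boot time or periodically during normal operation, which matches the two detection points the abstract mentions.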