Abstract:
A fault-tolerant storage controller is disclosed that has a processor, redundant copies of a stored program, and a timer that starts running automatically when the processor is reset. Selection logic selects a first copy of the program to boot on the processor. If the timer expires before the first copy successfully boots, the timer resets the processor and re-enables itself. This time, the selection logic selects a second copy of the stored program. In one embodiment, the program comprises separate loader and application programs, each having a redundant copy. The loader re-enables the timer when jumping to the first copy of the application code. If the timer expires before the first application copy successfully boots, the timer resets the processor and re-enables itself. This time, the loader selects a second copy of the application program. In one embodiment, the redundant copies are stored in separate FLASH devices; in another, they are stored in distinct regions of the same FLASH device.
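The timer-driven failover described in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the disclosed implementation: the function name `select_and_boot`, the copy names, and the `boots_before_timeout` callback are hypothetical, and a timer expiry is modeled simply as the callback returning `False`. It also assumes that a copy which boots successfully stops the timer, which the abstract implies but does not state.

```python
from typing import Callable, List, Optional, Tuple


def select_and_boot(
    copies: List[str],
    boots_before_timeout: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Try each redundant copy in order until one boots before the timer expires.

    Returns the copy that booted (or None if every copy failed) together
    with the list of copies that were attempted, in order.
    """
    tried: List[str] = []
    for copy in copies:
        # Processor reset: the timer starts running automatically and the
        # selection logic picks the next copy to boot.
        tried.append(copy)
        if boots_before_timeout(copy):
            # Assumption: a successfully booted copy stops the timer.
            return copy, tried
        # Timer expired first: it resets the processor and re-enables
        # itself, so the loop moves on to the next copy.
    return None, tried


# Hypothetical scenario: the first loader copy is corrupt, the second is good.
working = {"loader_B"}
booted, tried = select_and_boot(["loader_A", "loader_B"], lambda c: c in working)
print(booted, tried)
```

The same function could model the second stage, in which the loader re-enables the timer and selects between the two application copies.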
Abstract:
A RAID system includes a non-volatile memory storing a first program and first and second copies of a second program, and a processor executing the first program. The first program detects that the first copy of the second program has failed and repairs the failed first copy in the non-volatile memory using the second copy. Failures may be detected at boot time or during normal operation of the controller. In one embodiment, the failure is detected via a CRC check. In one embodiment, the controller repairs the failed copy by copying the good copy to the location of the failed copy. In one embodiment, the system includes multiple controllers, each having its own processor, non-volatile memory, and program that detects and repairs failed program copies. The programs include a loader, an application, FPGA code, CPLD code, and a program for execution by a power supply microcontroller.
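The CRC-based detection and copy-over repair described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the non-volatile memory is modeled as a Python dictionary, the names `crc_matches` and `check_and_repair` are hypothetical, and the expected CRC is assumed to be known to the checking program (the abstract does not say where it is stored). CRC-32 from the standard `zlib` module stands in for whatever CRC the controller uses.

```python
import zlib
from typing import Dict


def crc_matches(image: bytes, expected_crc: int) -> bool:
    """Return True if the CRC-32 of the image equals the expected value."""
    return (zlib.crc32(image) & 0xFFFFFFFF) == expected_crc


def check_and_repair(
    flash: Dict[str, bytes], first: str, second: str, expected_crc: int
) -> str:
    """Check the first copy and, if it has failed, repair it from the second.

    Returns "ok" if the first copy is intact, "repaired" if it was
    overwritten with the good second copy, and "unrecoverable" if the
    second copy also fails its CRC check.
    """
    if crc_matches(flash[first], expected_crc):
        return "ok"
    if not crc_matches(flash[second], expected_crc):
        return "unrecoverable"
    # Repair: copy the good copy to the location of the failed copy.
    flash[first] = bytes(flash[second])
    return "repaired"


# Hypothetical scenario: one byte of the first application copy is corrupted.
good_image = b"application image"
expected = zlib.crc32(good_image) & 0xFFFFFFFF
flash = {"app_copy_1": b"applicatiXn image", "app_copy_2": good_image}
print(check_and_repair(flash, "app_copy_1", "app_copy_2", expected))
```

The same check could run at boot time or periodically during normal operation, and could be applied to each program type listed (loader, application, FPGA code, and so on).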