摘要:
A computer system has a plurality of processing units (2-1,2-2,2-n) connected via one or more system buses (1-1,1-2). Each processing unit (2-1,2-2,2-n) has three or more processors (20-1,20-2,20-3) on a common support board (PL) and controlled by a common clock unit (1000). The three processors (20-1,20-2,20-3) perform the same operation and a fault in a processor (20-1,20- 2,20-3) is detected by comparison of the operations of the three processors (20-1,20-2,20-3). If one processor (20-1,20-2,20-3) fails, the operation can continue in the other two processors (20-1,20-2,20-3) of the processing unit (2-1,2-2,2-n), at least temporarily, before replacement of the entire processing unit (2-1,2- 2,2-n). Furthermore, the processing unit (2-1,2-2,2-n) may have a plurality of clocks (A,B) within the clock unit (1000), with a switching arrangement so that the processors (20-1,20-2,20-n) normally receive clock pulses from a main clock (A), but receive pulses from an auxiliary clock (B) if the main clock (A) fails. Switching between the main and auxiliary clock (A,B) involves comparison of the pulse duration from the clocks (A,B). Additionally, a plurality of cache memories (220,221) may be connected in common to the processors (20-1,20-2,20-3), so that failure of one cache memory (220,221) permits the processing unit (2-1,2-2,2-n) to continue to operate using the other cache memory (220,221). Coherence of the contents of the cache memories (220,221) may be achieved by direct comparison, and a comparison method can also be used to invalidate data in an internal cache memory (2020- 1,2020-2,2020-3) of a processor (20-1,20-2,20-3) which differs from that in the external cache memory (220,221). Coherence of protocols may also ensure that data in caches (220,221) of the different processor units (2-1,2-2,2-n) are always correct.
摘要:
A computer system has a plurality of processing units (2-1,2-2,2-n) connected via one or more system buses (1-1,1-2). Each processing unit (2-1,2-2,2-n) has three or more processors (20-1,20-2,20-3) on a common support board (PL) and controlled by a common clock unit (1000). The three processors (20-1,20-2,20-3) perform the same operation and a fault in a processor (20-1,20- 2,20-3) is detected by comparison of the operations of the three processors (20-1,20-2,20-3). If one processor (20-1,20-2,20-3) fails, the operation can continue in the other two processors (20-1,20-2,20-3) of the processing unit (2-1,2-2,2-n), at least temporarily, before replacement of the entire processing unit (2-1,2- 2,2-n). Furthermore, the processing unit (2-1,2-2,2-n) may have a plurality of clocks (A,B) within the clock unit (1000), with a switching arrangement so that the processors (20-1,20-2,20-n) normally receive clock pulses from a main clock (A), but receive pulses from an auxiliary clock (B) if the main clock (A) fails. Switching between the main and auxiliary clock (A,B) involves comparison of the pulse duration from the clocks (A,B). Additionally, a plurality of cache memories (220,221) may be connected in common to the processors (20-1,20-2,20-3), so that failure of one cache memory (220,221) permits the processing unit (2-1,2-2,2-n) to continue to operate using the other cache memory (220,221). Coherence of the contents of the cache memories (220,221) may be achieved by direct comparison, and a comparison method can also be used to invalidate data in an internal cache memory (2020- 1,2020-2,2020-3) of a processor (20-1,20-2,20-3) which differs from that in the external cache memory (220,221). Coherence of protocols may also ensure that data in caches (220,221) of the different processor units (2-1,2-2,2-n) are always correct.
摘要:
A computer system has a plurality of processing units (2-1,2-2,2-n) connected via one or more system buses (1-1,1-2). Each processing unit (2-1,2-2,2-n) has three or more processors (20-1,20-2,20-3) on a common support board (PL) and controlled by a common clock unit (1000). The three processors (20-1,20-2,20-3) perform the same operation and a fault in a processor (20-1,20- 2,20-3) is detected by comparison of the operations of the three processors (20-1,20-2,20-3). If one processor (20-1,20-2,20-3) fails, the operation can continue in the other two processors (20-1,20-2,20-3) of the processing unit (2-1,2-2,2-n), at least temporarily, before replacement of the entire processing unit (2-1,2- 2,2-n). Furthermore, the processing unit (2-1,2-2,2-n) may have a plurality of clocks (A,B) within the clock unit (1000), with a switching arrangement so that the processors (20-1,20-2,20-n) normally receive clock pulses from a main clock (A), but receive pulses from an auxiliary clock (B) if the main clock (A) fails. Switching between the main and auxiliary clock (A,B) involves comparison of the pulse duration from the clocks (A,B). Additionally, a plurality of cache memories (220,221) may be connected in common to the processors (20-1,20-2,20-3), so that failure of one cache memory (220,221) permits the processing unit (2-1,2-2,2-n) to continue to operate using the other cache memory (220,221). Coherence of the contents of the cache memories (220,221) may be achieved by direct comparison, and a comparison method can also be used to invalidate data in an internal cache memory (2020- 1,2020-2,2020-3) of a processor (20-1,20-2,20-3) which differs from that in the external cache memory (220,221). Coherence of protocols may also ensure that data in caches (220,221) of the different processor units (2-1,2-2,2-n) are always correct.