Abstract:
Method and system of determining whether a user program has made a system level call and thus whether the user program is uncooperative with fault tolerant operation. Some exemplary embodiments may be a processor-based method comprising providing information from a first processor to a second processor (the information indicating that a user program executed on the first processor has not made a system level call in a predetermined amount of time), and determining by the first processor, using information from the second processor, whether a duplicate copy of the user program substantially simultaneously executed in the second processor has made a system level call in the predetermined amount of time.
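The check described above can be sketched roughly as follows. This is a minimal illustration, not the claimed method: the `SyscallMonitor` class, its counter-based bookkeeping, and the rule that a program is flagged only when neither copy made a call are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class SyscallMonitor:
    """Hypothetical per-processor record of system-call activity."""
    syscall_count: int = 0
    count_at_last_check: int = 0

    def record_syscall(self):
        # Called whenever the user program makes a system level call.
        self.syscall_count += 1

    def made_call_since_last_check(self):
        # True if at least one call was recorded since the previous check.
        made = self.syscall_count != self.count_at_last_check
        self.count_at_last_check = self.syscall_count
        return made


def is_uncooperative(local_made_call, remote_made_call):
    # Assumption: flag the program only if neither copy made a call
    # during the predetermined period.
    return not (local_made_call or remote_made_call)


first, second = SyscallMonitor(), SyscallMonitor()

# Period 1: neither copy makes a system level call.
period1 = is_uncooperative(first.made_call_since_last_check(),
                           second.made_call_since_last_check())

# Period 2: only the duplicate copy on the second processor makes a call.
second.record_syscall()
period2 = is_uncooperative(first.made_call_since_last_check(),
                           second.made_call_since_last_check())
```

In this sketch the exchange between processors is reduced to passing two booleans; a real system would carry that information over some inter-processor channel.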
Abstract:
A method and system of loosely lock-stepped non-deterministic processors. Some exemplary embodiments may be a processor-based method comprising executing fault tolerant copies of a user program, one copy of the user program executed in a first processor performing non-deterministic execution, and a duplicate copy of the user program executing in a second processor performing non-deterministic execution, with the executing in the first processor and second processor not in cycle-by-cycle lock-step.
Abstract:
A method and system of copying a memory area between processor elements for lock-step execution. At least some of the illustrative embodiments may be a method comprising executing duplicate copies of a first program in a first processor of a first multiprocessor computer system and in a first processor of a second multiprocessor computer system (the executing substantially in lock-step), executing a second program in a second processor element of the first multiprocessor computer system (the first and second processors of the first multiprocessor computer system sharing an input/output (I/O) bridge), copying a memory area of the second program executing in the second processor element of the first multiprocessor computer system to a memory of a second processor element in the second multiprocessor computer system while the duplicate copies of the first program are executing in the first processor elements, and then executing duplicate copies of the second program in the second processors in lock-step.
Abstract:
A method and system of copying memory from a source processor to a target processor by duplicating memory writes. At least some of the exemplary embodiments may be a method comprising stopping execution of a user program on a target processor (the target processor coupled to a first memory), continuing to execute a duplicate copy of the user program on a source processor (the source processor coupled to a second memory and generating writes to the second memory), duplicating memory writes of the source processor and duplicating writes by input/output adapters to create a stream of duplicate memory writes, and applying the duplicated memory writes to the first memory.
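The write-duplication idea can be illustrated with a toy model. In this sketch memories are plain Python lists and a write is an `(address, value)` pair; the function names and that representation are assumptions made for illustration. The sketch only mirrors locations that are actually written while the stream is running.

```python
def duplicate_writes(source_memory, writes):
    """Apply each write to the source memory and yield a duplicate of it.

    `writes` stands in for writes generated by the source processor
    and by I/O adapters (hypothetical representation).
    """
    for address, value in writes:
        source_memory[address] = value
        yield (address, value)


def apply_stream(target_memory, stream):
    """Apply the duplicated writes, in order, to the target's memory."""
    for address, value in stream:
        target_memory[address] = value


source = [0, 0, 0, 0]   # memory coupled to the source processor
target = [9, 9, 9, 9]   # memory coupled to the stopped target processor

writes = [(0, 1), (2, 5), (0, 7)]
apply_stream(target, duplicate_writes(source, writes))
```

After the stream is applied, every location written on the source holds the same value on the target, while location 1 and location 3 of the target are untouched because no write to them was duplicated.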
Abstract:
A method and system of implementing a persistent memory. At least some of the illustrative embodiments are a system comprising a first computer slice comprising a memory, a second computer slice comprising a memory (the second computer slice coupled to the first computer slice by way of a communication network at least partially external to each computer slice), and a persistent memory comprising at least a portion of the memory of each computer slice (the portion of the memory of the first computer slice storing a duplicate copy of data stored in the portion of the memory of the second computer slice). The persistent memory is accessible to an application program through the communication network.
Abstract:
A multiprocessor system includes a number of sub-processor systems, each substantially identically constructed, and each comprising a central processing unit (CPU), and at least one I/O device, interconnected by routing apparatus that also interconnects the sub-processor systems. A CPU of any one of the sub-processor systems may communicate, through the routing elements, with any I/O device of the system, or with any CPU of the system. Communication between I/O devices and CPUs is by packetized messages. Interrupts from I/O devices are communicated from the I/O devices to the CPUs (or from one CPU to another CPU) as message packets. CPUs and I/O devices may write to, or read from, memory of a CPU of the system. Memory protection is provided by an access validation method maintained by each CPU in which CPUs and/or I/O devices are provided with a validation to read/write memory of that CPU, without which memory access is denied.
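The access validation idea can be sketched as a per-CPU table consulted on every remote memory access. The table layout below (requester name, half-open address range, set of permitted operations) and the class name `AccessValidationTable` are assumptions for illustration; the abstract does not specify the actual structure.

```python
class AccessValidationTable:
    """Hypothetical table kept by one CPU to guard its own memory."""

    def __init__(self):
        # Each entry: (requester, start, end, permitted operations).
        self.entries = []

    def grant(self, requester, start, end, operations):
        """Give `requester` the listed operations on addresses [start, end)."""
        self.entries.append((requester, start, end, frozenset(operations)))

    def is_allowed(self, requester, address, operation):
        """Deny by default; allow only if some entry validates the access."""
        return any(
            who == requester and start <= address < end and operation in ops
            for who, start, end, ops in self.entries
        )


table = AccessValidationTable()
table.grant("io_device_3", 0x1000, 0x2000, {"read", "write"})
table.grant("cpu_1", 0x1000, 0x1800, {"read"})

checks = [
    table.is_allowed("io_device_3", 0x1800, "write"),  # validated
    table.is_allowed("cpu_1", 0x1400, "write"),        # read-only grant
    table.is_allowed("cpu_1", 0x1900, "read"),         # outside range
    table.is_allowed("io_device_7", 0x1000, "read"),   # no grant at all
]
```

The deny-by-default check reflects the statement that memory access is denied without a validation.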
Abstract:
A processing system includes a number of communicatively interconnected system elements structured to send and receive data in the form of message packets. Message packets sent to a destination with expectation of response are timed, and if no response is received within an allotted time, a barrier transaction message packet is sent to the destination. The destination is required to provide a barrier transaction response to the barrier transaction packet only after it has responded to, or discarded, all prior received message packets requiring response by the destination. When the source of the barrier transaction message packet receives the barrier transaction response it can be assured that the communication path to the destination is in order, and no prior (late) responses will be forthcoming.
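The ordering guarantee behind the barrier transaction can be modeled with a destination that handles packets strictly in arrival order. This is a simplified sketch: the `Destination` class, the tuple packet format, and the `discard` parameter are hypothetical, and timers and the network itself are left out.

```python
from collections import deque


class Destination:
    """Toy destination that handles packets strictly in arrival order."""

    def __init__(self):
        self.pending = deque()

    def receive(self, packet):
        # packet is a (kind, identifier) pair: "request" or "barrier".
        self.pending.append(packet)

    def drain(self, discard=()):
        """Process queued packets in order and return the responses sent."""
        responses = []
        while self.pending:
            kind, ident = self.pending.popleft()
            if kind == "barrier":
                # Reached only after every earlier packet was handled.
                responses.append(("barrier_response", ident))
            elif ident in discard:
                continue  # discarded: no response is ever sent for it
            else:
                responses.append(("response", ident))
        return responses


def late_response_after_barrier(responses, barrier_id):
    """True if an ordinary response follows the barrier response."""
    index = responses.index(("barrier_response", barrier_id))
    return any(kind == "response" for kind, _ in responses[index + 1:])


dest = Destination()
dest.receive(("request", 1))
dest.receive(("request", 2))   # suppose this one timed out at the source
dest.receive(("barrier", 99))  # sent by the source after the timeout

responses = dest.drain(discard={2})
```

Because the barrier response is emitted only after everything queued ahead of it has been answered or discarded, the source can treat any request still unanswered at that point as one that will never be answered.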
Abstract:
A multiprocessor system includes a number of sub-processor systems, each substantially identically constructed, and each comprising a central processing unit (CPU), and at least one I/O device, interconnected by routing apparatus that also interconnects the sub-processor systems. A CPU of any one of the sub-processor systems may communicate, through the routing elements, with any I/O device of the system, or with any CPU of the system. The CPUs are structured to operate in one of two modes: a simplex mode in which the two CPUs operate independently of each other, and a duplex mode in which the CPUs operate in lock-step synchronism to execute each instruction of identical instruction streams at substantially the same time. Communication between I/O devices and CPUs is by packetized messages. Interrupts from I/O devices are communicated from the I/O devices to the CPUs (or from one CPU to another CPU) as message packets. CPUs and I/O devices may write to, or read from, memory of a CPU of the system. Memory protection is provided by an access validation method maintained by each CPU in which CPUs and/or I/O devices are provided with a validation to read/write memory of that CPU, without which memory access is denied.
Abstract:
A method and system of exchanging information between processors. At least some of the illustrative embodiments may be a method comprising exchanging information between a plurality of processors by writing (by a first processor) a first datum to a logic device and then continuing processing of a user program by the first processor, writing (by a second processor) a second datum to the logic device and then continuing processing of a user program by the second processor, and writing (by the logic device) the first and second datum to each of the first and second processors after all the processors have written their respective datum to the logic device.
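The exchange can be sketched as a small rendezvous object: each processor deposits its datum and carries on, and the logic device writes the full set back only once every processor has contributed. The `ExchangeLogic` class and its dictionary-based "write back" are assumptions used to illustrate the sequence, not the actual hardware interface.

```python
class ExchangeLogic:
    """Toy stand-in for the logic device that collects and redistributes data."""

    def __init__(self, processor_ids):
        self.expected = set(processor_ids)
        self.written = {}     # processor id -> datum written so far
        self.delivered = {}   # processor id -> data written back to it

    def write(self, processor_id, datum):
        """Record a datum; the caller does not wait and continues processing.

        Returns True if this write completed the set and triggered delivery.
        """
        self.written[processor_id] = datum
        if set(self.written) != self.expected:
            return False
        # All processors have written: deliver every datum to every processor.
        snapshot = dict(self.written)
        for pid in self.expected:
            self.delivered[pid] = dict(snapshot)
        self.written = {}
        return True


logic = ExchangeLogic(["first", "second"])

first_done = logic.write("first", 0xA1)
delivered_after_first = dict(logic.delivered)   # nothing delivered yet

second_done = logic.write("second", 0xB2)
```

The non-blocking `write` reflects the point that each processor continues processing its user program after writing, rather than waiting for the other processor.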
Abstract:
Adaptive sets of lanes are configured between routers in a system area network. Source nodes determine whether packets may be adaptively routed between the lanes by encoding adaptive control bits in the packet header. The adaptive control bits also facilitate the flushing of all lanes of the adaptive set. Adaptive sets may also be used in uplinks between levels of a fat tree.