Fault resilient/fault tolerant computing
    2.
    Granted invention patent
    Status: Expired

    Publication No.: US5600784A

    Publication date: 1997-02-04

    Application No.: US405193

    Filing date: 1995-03-16

    Abstract: In a first aspect, a method of synchronizing at least two computing elements that each have clocks that operate asynchronously of the clocks of the other computing elements includes selecting one or more signals, designated as meta time signals, from a set of signals produced by the computing elements, monitoring the computing elements to detect the production of a selected signal by one of the computing elements, waiting for the other computing elements to produce a selected signal, transmitting equally valued time updates to each of the computing elements, and updating the clocks of the computing elements based on the time updates. In a second aspect, fault resilient or fault tolerant computers are produced by designating a first processor as a computing element, designating a second processor as a controller, connecting the computing element and the controller to produce a modular pair, and connecting at least two modular pairs to produce a fault resilient or fault tolerant computer. Each computing element of the computer performs all instructions in the same number of cycles as the other computing elements. Computer systems include one or more controllers and at least two computing elements. A system is provided for intercepting I/O operations by the computing elements and transmitting them to the one or more controllers.
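
    The synchronization steps in the first aspect can be illustrated with a minimal sketch. The class names `ComputingElement` and `TimeSynchronizer`, the integer meta time counter, and the fixed increment are assumptions made for illustration only; the abstract does not specify them, and the patented mechanism may differ.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    """Hypothetical model of one computing element with its own clock."""
    name: str
    clock: int = 0          # local time value, updated only by time updates
    waiting: bool = False   # True once a meta time signal has been produced

    def produce_meta_time_signal(self) -> None:
        # The element has reached a selected (meta time) signal and pauses.
        self.waiting = True

    def apply_time_update(self, value: int) -> None:
        # Update the local clock from the received time update and resume.
        self.clock = value
        self.waiting = False


class TimeSynchronizer:
    """Hypothetical controller that issues equally valued time updates."""

    def __init__(self, elements, increment: int = 1):
        self.elements = list(elements)
        self.increment = increment   # assumed fixed step per synchronization
        self.meta_time = 0

    def on_signal(self, element: ComputingElement) -> bool:
        """Record a meta time signal; return True if an update was sent."""
        element.produce_meta_time_signal()
        # Wait until every other element has also produced a selected signal.
        if not all(e.waiting for e in self.elements):
            return False
        # Transmit the same time update to each element.
        self.meta_time += self.increment
        for e in self.elements:
            e.apply_time_update(self.meta_time)
        return True
```

    In this sketch the first element to signal simply waits; only when the last element signals do all clocks advance to the same value.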

    Dual rail processors with error checking on I/O reads
    3.
    Granted invention patent
    Status: Expired

    Publication No.: US5249187A

    Publication date: 1993-09-28

    Application No.: US357613

    Filing date: 1989-05-25

    Abstract: A dual processor data processing system having interprocessor error checking includes a first central processing unit executing a series of instructions. A second central processing unit executes the same series of instructions independently of and in synchronism with the first central processing unit. A first data bus is coupled to the first central processing unit for receiving data to be input to the first central processing unit and a second data bus is coupled to the second central processing unit for receiving data to be input to the second central processing unit. Error checking devices are coupled to the first and second data busses for checking data transmitted over the first and second data busses and for detecting errors on I/O reads prior to delivery of the data to the first and second central processing units. The error checking devices include comparison means for indicating an error when the data on the first and second data busses are unequal. Error isolation devices are responsive to an error detected from the error checking means for analyzing the cause of error while maintaining system synchronization.
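
    The comparison step described in the abstract can be modeled in software as a rough sketch. The patent describes hardware devices; the names `RailMismatchError`, `first_mismatch`, and `checked_io_read` are hypothetical, and representing each data bus as a `bytes` value is an assumption made for illustration.

```python
class RailMismatchError(Exception):
    """Raised when the two rails carry unequal data (hypothetical name)."""

    def __init__(self, offset: int):
        super().__init__(f"rails differ at byte offset {offset}")
        self.offset = offset


def first_mismatch(rail_a: bytes, rail_b: bytes) -> int:
    """Return the first offset where the rails differ, or -1 if equal."""
    for i, (x, y) in enumerate(zip(rail_a, rail_b)):
        if x != y:
            return i
    if len(rail_a) != len(rail_b):
        # One rail is a strict prefix of the other.
        return min(len(rail_a), len(rail_b))
    return -1


def checked_io_read(rail_a: bytes, rail_b: bytes) -> bytes:
    """Compare both rails before the data is delivered to either CPU."""
    offset = first_mismatch(rail_a, rail_b)
    if offset != -1:
        # Signal the error before delivery so the cause can be analyzed.
        raise RailMismatchError(offset)
    return rail_a
```

    The point of the sketch is ordering: the comparison happens before the data reaches either processor, so a mismatch is caught without the two processors diverging.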

    Dynamic Checkpointing Systems and Methods
    5.
    Invention application
    Status: In force

    Publication No.: US20150205671A1

    Publication date: 2015-07-23

    Application No.: US14571383

    Filing date: 2014-12-16

    IPC classification: G06F11/14

    CPC classification: G06F11/1484

    Abstract: A method for determining a delay in a dynamic, event-driven checkpoint interval. In one embodiment, the method includes the steps of determining the number of network bits to be transferred; determining the target bit transfer rate; and calculating the next cycle delay as the number of bits to be transferred divided by the target bit transfer rate. In another aspect, the invention relates to a method for delaying a checkpoint interval. In one embodiment, the method includes the steps of monitoring the transfer of a prior batch of network data and delaying a subsequent checkpoint until the transfer of the prior batch of network data has reached a certain predetermined level of completion. In another embodiment, the predetermined level of completion is 100%.
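
    The delay calculation and the completion check in the abstract reduce to simple arithmetic, sketched below. The function names `next_cycle_delay` and `may_start_checkpoint`, the choice of seconds and bits per second as units, and the handling of an empty batch are assumptions for illustration and are not taken from the application.

```python
def next_cycle_delay(bits_to_transfer: int, target_bits_per_second: float) -> float:
    """Delay in seconds: bits to be transferred divided by the target rate."""
    if target_bits_per_second <= 0:
        raise ValueError("target bit transfer rate must be positive")
    return bits_to_transfer / target_bits_per_second


def may_start_checkpoint(bits_sent: int, bits_total: int,
                         required_fraction: float = 1.0) -> bool:
    """True once the prior batch has reached the predetermined completion level.

    required_fraction=1.0 corresponds to the 100% embodiment.
    """
    if bits_total == 0:
        # Assumption: an empty prior batch never delays the next checkpoint.
        return True
    return bits_sent / bits_total >= required_fraction
```

    For example, under these assumed units, 8,000,000 bits at a target rate of 1 Gbit/s gives a delay of 8 ms before the next cycle.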

    Dual-rail processors with error checking on I/O reads

    Publication No.: US4862465A

    Publication date: 1989-08-29

    Application No.: US093495

    Filing date: 1987-09-04

    Abstract: A dual processor data processing system having interprocessor error checking includes a first central processing unit executing a series of instructions. A second central processing unit executes the same series of instructions independently of and in synchronism with the first central processing unit. A first data bus is coupled to the first central processing unit for receiving data to be input to the first central processing unit and a second data bus is coupled to the second central processing unit for receiving data to be input to the second central processing unit. Error checking devices are coupled to the first and second data busses for checking data transmitted over the first and second data busses and for detecting errors on I/O reads prior to delivery of the data to the first and second central processing units. The error checking devices include comparison means for indicating an error when the data on the first and second data busses are unequal. Error isolation devices are responsive to an error detected from the error checking means for analyzing the cause of error while maintaining system synchronization.