Digital data processing methods and apparatus for fault detection and
fault tolerance
    1.
    Granted invention patent
    Status: expired

    Publication No.: US5630056A

    Publication date: 1997-05-13

    Application No.: US309210

    Filing date: 1994-09-20

    IPC classification: G06F13/00 G06F11/16 G06F11/00

    CPC classification: G06F11/1625 G06F11/1633

    Abstract: A digital data processing device includes a bus for transmitting signals (e.g., data and/or address information) between plural functional units (e.g., a central processing unit and a peripheral controller). A first such unit includes first and second processing sections that concurrently apply to the bus complementary portions of like information signals (e.g., longwords containing data). A fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves. If there is a discrepancy, the fault detector signals a fault, e.g., causing the unit to be taken off-line. By use of a redundant unit, processing can continue for fault-tolerant operation.
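The comparison described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the 32-bit longword width, the high-half/low-half split between the two sections, and the names `drive_bus` and `detect_fault` are hypothetical and do not come from the patent.

```python
# Assumption: a 32-bit longword, with section A driving the high 16 bits
# and section B driving the low 16 bits. The abstract only says
# "complementary portions"; this particular split is illustrative.
MASK_ALL = 0xFFFFFFFF
MASK_HI = 0xFFFF0000
MASK_LO = 0x0000FFFF


def drive_bus(word_a: int, word_b: int) -> int:
    """Combine the portion driven by each section into the word seen on the bus."""
    return (word_a & MASK_HI) | (word_b & MASK_LO)


def detect_fault(bus_word: int, word_a: int, word_b: int) -> bool:
    """Return True if the bus word differs from what either section generated."""
    return bus_word != (word_a & MASK_ALL) or bus_word != (word_b & MASK_ALL)


# Both sections agree, so the bus word matches and no fault is signaled.
good = drive_bus(0x12345678, 0x12345678)
print(detect_fault(good, 0x12345678, 0x12345678))

# Section B computed a different low bit, so the bus word no longer matches section A.
bad = drive_bus(0x12345678, 0x12345679)
print(detect_fault(bad, 0x12345678, 0x12345679))
```

Because each section drives only part of the word, a disagreement in either half leaves the bus word different from at least one section's own copy, which is what the comparison catches.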


    Fault-tolerant computer system employing an improved error-broadcast
mechanism
    2.
    Granted invention patent
    Status: expired

    Publication No.: US5555372A

    Publication date: 1996-09-10

    Application No.: US360414

    Filing date: 1994-12-21

    IPC classification: G06F11/20 G06F11/14 G06F11/30

    CPC classification: G06F11/2007

    Abstract: A bus device (10) that communicates with other bus devices (12, 13) on a communication channel (14) that includes a plurality of duplicated information buses (16, 17) selectively assumes bus-selection states in which it uses information from one or the other of the buses (16, 17). It also monitors the buses (16, 17) for errors in the information that the buses (16, 17) carry, and it broadcasts an error signal over other lines (18) of the communication channel (14) in response to detection of such an error, but only if an error occurs in information on the bus that its current bus-selection state designates. On the other hand, when an error-broadcast signal indicating an error on either bus in the information transmitted by that device (10) appears on the bus, that bus device (10) retransmits the information, regardless of that device's current bus-selection state. Inconsistent operation phasing among bus devices that have assumed different bus-selection states is thereby avoided.
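The two rules in this abstract (when to broadcast an error, and when to retransmit) can be sketched as a small model. The class `BusDevice`, its field `selected_bus`, and both method names are hypothetical; the patent abstract does not define any such interface.

```python
from dataclasses import dataclass


@dataclass
class BusDevice:
    # Assumption: the two duplicated buses are indexed 0 and 1, and the
    # bus-selection state is simply the index of the bus currently in use.
    selected_bus: int

    def should_broadcast_error(self, bus_errors) -> bool:
        """Broadcast only if the error is on the bus this device currently selects."""
        return bool(bus_errors[self.selected_bus])

    def should_retransmit(self, error_broadcast_seen: bool, was_sender: bool) -> bool:
        """Retransmit whenever an error broadcast concerns this device's own
        transmission, regardless of the current bus-selection state."""
        return error_broadcast_seen and was_sender


dev = BusDevice(selected_bus=0)
print(dev.should_broadcast_error((True, False)))   # error on the selected bus
print(dev.should_broadcast_error((False, True)))   # error only on the other bus
print(dev.should_retransmit(error_broadcast_seen=True, was_sender=True))
```

Note that `should_retransmit` deliberately ignores `selected_bus`: per the abstract, this is what keeps devices in different bus-selection states from falling out of phase.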


    Digital data processing methods and apparatus for fault detection and
fault tolerance
    3.
    Granted invention patent
    Status: expired

    Publication No.: US5838900A

    Publication date: 1998-11-17

    Application No.: US759099

    Filing date: 1996-12-03

    IPC classification: G06F13/00 G06F11/16 G06F11/00

    CPC classification: G06F11/1625 G06F11/1633

    Abstract: A digital data processing device includes a bus for transmitting signals (e.g., data and/or address information) between plural functional units (e.g., a central processing unit and a peripheral controller). A first such unit includes first and second processing sections that concurrently apply to the bus complementary portions of like information signals (e.g., longwords containing data). A fault detection element reads the resultant signal from the bus and compares it with at least portions of the corresponding signals originally generated by the processing sections themselves. If there is a discrepancy, the fault detector signals a fault, e.g., causing the unit to be taken off-line. By use of a redundant unit, processing can continue for fault-tolerant operation.


    Digital data processing methods and apparatus for fault isolation
    4.
    Granted invention patent
    Status: expired

    Publication No.: US5838899A

    Publication date: 1998-11-17

    Application No.: US658563

    Filing date: 1996-06-05

    CPC classification: G06F11/1625 G06F11/1633

    Abstract: A fault-isolating digital data processing apparatus includes plural functional units that are interconnected for point-to-point communications by a plurality of buses. The functional units monitor the buses to which they are attached and signal the other units in the event there are bus communication errors. The functional units can simultaneously enter into an error isolation phase, e.g., in response to a bus error signaled by one of the units. During this phase, each unit transmits test data (e.g., predetermined patterns of 0's and 1's) onto at least one of its attached buses. The functional units continue to monitor the buses and to signal bus errors while the test data is being transmitted. In addition to signaling bus errors, the functional units can signal unit-level (or "board") faults when they detect a fault in their own operation. To this end, each unit includes error isolation functionality that signals a fault based on (i) whether that unit signaled a loopback error with respect to its own operation; (ii) whether that unit or another unit signaled a bus error during the error isolation phase; and/or (iii) whether any other functional unit signaled that it was faulty during the error isolation phase.
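The test-data step of the error isolation phase can be illustrated with a toy bus model. Everything here is an assumption made for illustration: the four 32-bit patterns, the stuck-at-zero fault model, and the function names are hypothetical, and the sketch does not attempt to reproduce the patent's rule for combining conditions (i) to (iii).

```python
# Assumed test patterns: all zeros, all ones, and two alternating-bit words.
TEST_PATTERNS = (0x00000000, 0xFFFFFFFF, 0x55555555, 0xAAAAAAAA)
MASK_ALL = 0xFFFFFFFF


def healthy(word: int) -> int:
    """Hypothetical fault-free bus: the word arrives unchanged."""
    return word & MASK_ALL


def stuck_at_zero(bit: int):
    """Return a hypothetical bus model in which one line is stuck at 0."""
    def transfer(word: int) -> int:
        return word & ~(1 << bit) & MASK_ALL
    return transfer


def isolation_phase(transfer) -> list:
    """Drive each test pattern and return the patterns that read back wrong."""
    return [p for p in TEST_PATTERNS if transfer(p) != p]


print(isolation_phase(healthy))
print([hex(p) for p in isolation_phase(stuck_at_zero(0))])
```

Using patterns that set every line to both 0 and 1 at least once means a single stuck line corrupts at least one pattern, which gives the monitoring units something to signal during the phase.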
