Computer system implementing fault detection and isolation using unique identification codes stored in non-volatile memory
    1.
    Granted invention patent
    Computer system implementing fault detection and isolation using unique identification codes stored in non-volatile memory (status: in force)

    Publication (announcement) number: US06463550B1

    Publication (announcement) date: 2002-10-08

    Application number: US09267587

    Filing date: 1999-03-12

    IPC classification: G06F1134

    Abstract: A computer system implementing a fault detection and isolation technique tracks failed physical devices by error codes embedded in various components of the computer system. The computer system comprises one or more CPUs, one or more memory modules, a master control device, such as an I2C master, and a North bridge logic device coupling together the CPUs, memory modules, and master control device. The master control device also connects to the CPUs and memory modules over a serial bus, such as an I2C bus. Each component includes a non-volatile memory coupled to the I2C bus for storing error information. If a component fails, a CPU stores an error code into that component's non-volatile memory via the I2C bus. During initialization, the CPU creates a logical resource map which includes a list of logical addresses of all available (i.e., fully functional) devices. The logical resource map is provided to the computer's operating system, which isolates failed devices by only permitting access to those logical devices listed as available. The computer may include a non-volatile memory device coupled to the CPU for storing a failed device log which includes a list of ID codes corresponding to failed physical devices. After a device is determined to be non-functional, one of the CPUs stores that device's unique ID code in the failed device log. During system initialization, the information in the failed device log is compared to the error information stored in the components to create the logical resource map.
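The initialization step described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the `Component` class, the `NO_ERROR` value, the ID strings, and in particular the rule that a device is excluded when either the failed device log or its own stored error code flags it. The abstract only says that the two sources are compared.

```python
from dataclasses import dataclass

NO_ERROR = 0x00  # hypothetical value meaning "no fault recorded"


@dataclass
class Component:
    """A CPU or memory module carrying its own small non-volatile memory."""
    device_id: str               # unique ID code (format is an assumption)
    logical_address: int         # address the operating system would use
    error_code: int = NO_ERROR   # stands in for the on-component non-volatile memory


def record_failure(component, failed_device_log, error_code):
    """Store an error code on the component and log its ID code once."""
    component.error_code = error_code
    if component.device_id not in failed_device_log:
        failed_device_log.append(component.device_id)


def build_logical_resource_map(components, failed_device_log):
    """List logical addresses of devices that neither source marks as failed.

    Assumed policy: a device is treated as failed if its ID is in the log
    OR its own non-volatile memory holds a non-zero error code.
    """
    failed_ids = set(failed_device_log)
    return [
        c.logical_address
        for c in components
        if c.error_code == NO_ERROR and c.device_id not in failed_ids
    ]
```

Under this assumed policy, a module that recorded an error while installed in another machine would still be excluded here, because the error code travels with the module even though the local log has no entry for it.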


    Computer system implementing fault detection and isolation using unique identification codes stored in non-volatile memory
    2.
    Granted invention patent
    Computer system implementing fault detection and isolation using unique identification codes stored in non-volatile memory (status: no longer in force)

    Publication (announcement) number: US06496945B2

    Publication (announcement) date: 2002-12-17

    Application number: US09090123

    Filing date: 1998-06-04

    IPC classification: G06F1100

    Abstract: A computer system is disclosed that implements a fault detection and isolation technique, tracking failed physical devices by identification (ID) codes embedded in each component of the computer for which fault detection and isolation of failed devices is supported. The computer system comprises one or more CPUs, one or more memory modules, a master control device, such as an I2C master, and a North bridge logic device coupling together the CPUs, memory modules, and master control device. The master control device also connects to the CPUs and memory modules over a serial bus, such as an I2C bus. Each CPU and memory module includes an ID code that uniquely identifies and distinguishes that device from all other devices in the computer system. The computer also includes a non-volatile memory device coupled to the CPU for storing a failed device log which includes a list of ID codes corresponding to failed physical devices. After a device is determined to be non-functional, one of the CPUs stores that device's unique ID code in the failed device log. Using the list of physical devices from the failed device log, the CPU creates a logical resource map which includes a list of logical addresses of all available (i.e., fully functional) devices. The logical resource map is provided to the computer's operating system, which isolates failed devices by only permitting access to those logical devices listed as available in the logical resource map.
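As a rough sketch of the ID-based isolation this abstract describes, the functions below build a logical resource map from the failed device log and gate access on it. The function names, the ID strings, and the choice to number available devices consecutively from 0 in slot order are all assumptions made for illustration. The abstract does not say how logical addresses are assigned.

```python
def assign_logical_addresses(installed_ids, failed_device_log):
    """Map consecutive logical addresses to ID codes of devices not in the log.

    installed_ids: ID codes read from the installed devices, in slot order.
    failed_device_log: ID codes previously recorded as failed.
    """
    failed = set(failed_device_log)
    available = [dev_id for dev_id in installed_ids if dev_id not in failed]
    return dict(enumerate(available))


def os_may_access(logical_resource_map, logical_address):
    """The operating system only touches addresses listed in the map."""
    return logical_address in logical_resource_map


# Hypothetical ID codes; the failed module stays excluded even after it
# is moved to a different slot, because it is tracked by ID, not by slot.
log = ["MEM-0002"]
before_move = assign_logical_addresses(["CPU-0001", "MEM-0002", "MEM-0003"], log)
after_move = assign_logical_addresses(["MEM-0002", "CPU-0001", "MEM-0003"], log)
```

Keying the log on the device's own ID code rather than its slot position is what lets the exclusion follow the physical part, which appears to be the point of embedding a unique code in each component.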
