Diagnosing hardware faults in a data storage system
    1.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08407527B1

    Publication date: 2013-03-26

    Application No.: US12494467

    Filing date: 2009-06-30

    IPC classification: G06F11/00

    CPC classification: G06F11/076 G06F11/0727

    Abstract: Hardware faults in data storage systems are diagnosed. User I/O errors are received. Disk drive port error counters, primary port error counters, and expansion port error counters are read. A user I/O error threshold is modified based on the error counter readings. Depending on the type of errors counted, the user I/O error threshold may be increased or decreased. Once a first quantity of user I/O errors exceeds the modified user I/O error threshold, a faulty component is identified.
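
    The threshold adjustment described in the abstract can be sketched roughly as follows. The abstract does not say which error type moves the threshold in which direction, so the policy here is purely an assumption for illustration: errors at a disk drive port lower the threshold, while errors at a primary or expansion port raise it. The names `CounterReadings`, `adjust_threshold`, `should_identify_fault`, `step`, and `floor` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class CounterReadings:
    """Error counts read from one enclosure (hypothetical structure)."""
    drive_port: int
    primary_port: int
    expansion_port: int


def adjust_threshold(base: int, readings: CounterReadings,
                     step: int = 2, floor: int = 1) -> int:
    """Return a modified user I/O error threshold.

    Assumed policy (not stated in the abstract): drive-port errors
    decrease the threshold, primary/expansion-port errors increase it.
    """
    threshold = base
    if readings.drive_port > 0:
        threshold -= step
    if readings.primary_port > 0 or readings.expansion_port > 0:
        threshold += step
    return max(floor, threshold)


def should_identify_fault(user_io_errors: int, base: int,
                          readings: CounterReadings) -> bool:
    """True once the user I/O error count exceeds the modified threshold."""
    return user_io_errors > adjust_threshold(base, readings)
```

    With a base threshold of 10, nine user I/O errors would not trigger fault identification on their own, but under this assumed policy they would once the drive port counter is non-zero, because the threshold drops to 8.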


    Method for automatically diagnosing hardware faults in a data storage system
    2.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07779306B1

    Publication date: 2010-08-17

    Application No.: US11690147

    Filing date: 2007-03-23

    IPC classification: G06F11/00

    Abstract: A method for automatically diagnosing faults in a data storage system. The system includes a plurality of enclosures, each having: a primary port; an expansion port; a plurality of disk drives; and a link control card coupled to the primary port, the expansion port, and the plurality of disk drives. The link control card includes a cut-through switch having: disk drive port error counters for counting errors at the ports of the plurality of disk drives; a primary port error counter for counting cumulative errors at the primary port; and an expansion port error counter for counting cumulative errors at the expansion port. The primary ports and expansion ports are serially interconnected to a storage processor through a fiber channel loop. The method sequentially reads the counters in each one of the enclosures to determine whether the errors counted in any one of such counters exceed a predetermined threshold over a predetermined period of time.
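
    The sequential counter check described in the abstract can be sketched as below. Because the counters are cumulative, this sketch assumes the "predetermined period of time" is handled by taking one snapshot at the start of the window and one at the end, then comparing the differences against the threshold. The snapshot layout (enclosure number mapped to counter name mapped to count) and the name `counters_over_threshold` are hypothetical, not taken from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: enclosure number -> counter name -> cumulative count
Snapshot = Dict[int, Dict[str, int]]


def counters_over_threshold(start: Snapshot, end: Snapshot,
                            threshold: int) -> List[Tuple[int, str, int]]:
    """Walk the enclosures in order and flag counters whose growth
    between the two snapshots exceeds the threshold."""
    flagged = []
    for enclosure in sorted(end):
        for name, value in end[enclosure].items():
            # A counter missing from the first snapshot is treated as 0.
            delta = value - start.get(enclosure, {}).get(name, 0)
            if delta > threshold:
                flagged.append((enclosure, name, delta))
    return flagged
```

    For example, if the primary port counter of enclosure 1 grows from 5 to 40 during the window and the threshold is 20, the function reports `(1, "primary", 35)`. Mapping a flagged counter to a specific faulty component is outside the scope of this sketch.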
