-
Publication No.: US09645857B2
Publication date: 2017-05-09
Application No.: US12641001
Filing date: 2009-12-17
CPC classification: G06F9/5061, G06F9/22, G06F9/44
Abstract: In accordance with at least some embodiments, a system includes a plurality of partitions, each partition having its own operating system (OS) and workload. The system also includes a plurality of resources assignable to the plurality of partitions. The system also includes management logic coupled to the plurality of partitions and the plurality of resources. The management logic is configured to set priority rules for each of the plurality of partitions based on user input. The management logic performs automated resource fault management for the resources assigned to the plurality of partitions based on the priority rules.
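The abstract does not say what a priority rule looks like, so the following is only a sketch of one plausible reading: each partition carries a user-assigned numeric priority, and when a resource fails the management logic first tries a spare, then borrows from the lowest-priority partition. The names `Partition`, `handle_resource_fault`, and the "larger number is more important" convention are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    name: str
    priority: int  # hypothetical convention: larger number = more important
    resources: set[str] = field(default_factory=set)


def handle_resource_fault(partitions, spares, failed):
    """Replace a failed resource according to partition priority.

    Returns the name of the replacement resource, or None if the owner
    has to keep running without one.
    """
    owner = next((p for p in partitions if failed in p.resources), None)
    if owner is None:
        return None  # the failed resource was not assigned to any partition
    owner.resources.discard(failed)

    # Prefer an unassigned spare, so no other partition is disturbed.
    if spares:
        replacement = spares.pop()
        owner.resources.add(replacement)
        return replacement

    # Otherwise take a resource from the lowest-priority partition, but only
    # if that partition ranks strictly below the owner.
    donors = [p for p in partitions if p is not owner and p.resources]
    if not donors:
        return None
    donor = min(donors, key=lambda p: p.priority)
    if donor.priority >= owner.priority:
        return None
    replacement = sorted(donor.resources)[0]
    donor.resources.remove(replacement)
    owner.resources.add(replacement)
    return replacement
```

With a high-priority partition and a low-priority one and no spares, a fault in the high-priority partition moves a resource over from the low-priority one, while a fault in the low-priority partition leaves the other untouched.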
-
Publication No.: US20110154097A1
Publication date: 2011-06-23
Application No.: US12641072
Filing date: 2009-12-17
CPC classification: G06F11/079, G06F11/0727, G06F11/0748, G06F11/1428, G06F11/202
Abstract: A system and method for fault management in a computer-based system are disclosed herein. A system includes a plurality of field replaceable units (“FRUs”) and fault management logic. The fault management logic is configured to collect error information from a plurality of components of the system. The logic stores, for each component identified as a possible cause of a detected fault, a record assigning one of two different component failure probability indications. The logic identifies a single one of the plurality of FRUs that has failed based on the stored probability indications.
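As a rough illustration of the last two sentences of the abstract, the sketch below keeps one record per suspected component with one of two likelihood levels, then reduces the records to a single FRU. The level names `HIGH` and `LOW`, the component-to-FRU map, and the ranking rule (most high-likelihood records first, then most low-likelihood records) are assumptions; the abstract only says that two different indications exist and that a single FRU is identified from them.

```python
from collections import defaultdict
from enum import Enum


class Likelihood(Enum):
    LOW = "low"    # hypothetical names for the two probability indications
    HIGH = "high"


def identify_failed_fru(records, component_to_fru):
    """Pick a single FRU from (component, Likelihood) records.

    Returns None when there are no records.
    """
    tally = defaultdict(lambda: [0, 0])  # FRU -> [high count, low count]
    for component, likelihood in records:
        fru = component_to_fru[component]
        tally[fru][0 if likelihood is Likelihood.HIGH else 1] += 1
    if not tally:
        return None
    # Rank by high count, then low count; break ties by FRU name so the
    # result is deterministic.
    return min(tally, key=lambda fru: (-tally[fru][0], -tally[fru][1], fru))
```

Under this rule a single high-likelihood record outweighs any number of low-likelihood records against another FRU, which is one design choice among several that would fit the abstract.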
-
Publication No.: US08108724B2
Publication date: 2012-01-31
Application No.: US12641072
Filing date: 2009-12-17
IPC classification: G06F11/00
CPC classification: G06F11/079, G06F11/0727, G06F11/0748, G06F11/1428, G06F11/202
Abstract: A system and method for fault management in a computer-based system are disclosed herein. A system includes a plurality of field replaceable units (“FRUs”) and fault management logic. The fault management logic is configured to collect error information from a plurality of components of the system. The logic stores, for each component identified as a possible cause of a detected fault, a record assigning one of two different component failure probability indications. The logic identifies a single one of the plurality of FRUs that has failed based on the stored probability indications.
-
Publication No.: US20110154349A1
Publication date: 2011-06-23
Application No.: US12641001
Filing date: 2009-12-17
IPC classification: G06F9/50
CPC classification: G06F9/5061, G06F9/22, G06F9/44
Abstract: In accordance with at least some embodiments, a system includes a plurality of partitions, each partition having its own operating system (OS) and workload. The system also includes a plurality of resources assignable to the plurality of partitions. The system also includes management logic coupled to the plurality of partitions and the plurality of resources. The management logic is configured to set priority rules for each of the plurality of partitions based on user input. The management logic performs automated resource fault management for the resources assigned to the plurality of partitions based on the priority rules.
-
Publication No.: US08161324B2
Publication date: 2012-04-17
Application No.: US12641091
Filing date: 2009-12-17
Applicant: Howard Calkin, Andrew C. Walton
Inventor: Howard Calkin, Andrew C. Walton
IPC classification: G06F11/00
CPC classification: G06F11/0751, G06F11/0727
Abstract: A system and method for recording fault information in an electronic system are disclosed herein. A system includes fault analysis logic and a plurality of field replaceable units (“FRUs”). The fault analysis logic is configured to analyze system error information, and identify at least one of the FRUs in the system to be a possible cause of a detected fault based on the analysis. Each FRU includes writeable non-volatile storage including storage locations reserved to store information including a result of the analysis. The result of the analysis indicates a reason that the FRU storing the information was determined, by the fault analysis logic, to be a possible cause of the fault.
-
Publication No.: US08122290B2
Publication date: 2012-02-21
Application No.: US12641103
Filing date: 2009-12-17
IPC classification: G06F11/00
CPC classification: G06F11/0766, G06F11/0712, G06F11/0724, G06F11/079
Abstract: A system for error log consolidation is disclosed herein. A server computer includes a plurality of system processors and error log consolidation logic. The system processors are configurable to form isolated execution partitions. The error log consolidation logic is configured to, based on detection of a fault in the server, retrieve error logs from the system processors, and to consolidate the retrieved logs with server computer information not available to the system processors to generate a consolidated error log. The consolidated error log includes a comprehensive set of server information relevant to identifying a cause of the detected fault.
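The consolidation step described in this abstract can be pictured with a small sketch. It assumes each processor log is a list of dictionaries carrying a `timestamp` key, and that the information not visible to the processors arrives as a plain dictionary; both shapes, and the function name `consolidate_error_logs`, are illustrative assumptions rather than anything the patent specifies.

```python
def consolidate_error_logs(processor_logs, platform_info):
    """Merge per-processor error logs with platform-level data into one record.

    processor_logs: dict mapping a processor id to its list of log entries.
    platform_info:  dict of data the processors cannot see (hypothetical
                    examples: fan state, power rail readings, partition map).
    """
    entries = []
    for cpu_id, log in sorted(processor_logs.items()):
        for entry in log:
            # Tag each entry with its source so nothing is lost in the merge.
            entries.append({"source": cpu_id, **entry})
    # Order by time so the sequence of events across processors is readable.
    entries.sort(key=lambda e: e["timestamp"])
    return {"platform": dict(platform_info), "entries": entries}
```

Tagging every entry with its source processor keeps the partition-level origin visible after the logs have been interleaved by time.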
-
Publication No.: US08713350B2
Publication date: 2014-04-29
Application No.: US12633648
Filing date: 2009-12-08
IPC classification: G06F11/00
CPC classification: G06F11/0793, G06F11/0712, G06F11/0766
Abstract: A method of managing errors in a data processing system may involve at least one computer system. Each computer system may include a processor that executes an operating system, firmware, and system memory storing instructions for the operating system. A firmware error handler resident in the firmware may identify an error occurring in the computer system. The firmware error handler may determine whether the operating system is required to take an action in response to the error. If the operating system is not required to take an action in response to the error, the firmware error handler may create an error log, accessible to the operating system, that is appropriate to cause the operating system to take no action.
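The decision described in this abstract could look roughly like the sketch below. Real firmware would not be written in Python, and the severity labels, the `ErrorLogEntry` fields, and the two input flags are all hypothetical; the only part taken from the abstract is that the handler decides whether the operating system must act and, if not, writes a log entry that leads the operating system to take no action.

```python
from dataclasses import dataclass

# Hypothetical severity labels; a real platform would use its own encoding.
CORRECTED, RECOVERABLE, FATAL = "corrected", "recoverable", "fatal"


@dataclass
class ErrorLogEntry:
    error_id: str
    severity: str
    os_action_required: bool


def firmware_handle_error(error_id, fixed_by_firmware, uncontained):
    """Build the log entry the OS will read for one hardware error."""
    if uncontained:
        # The error may have spread; the OS has to respond.
        return ErrorLogEntry(error_id, FATAL, True)
    if fixed_by_firmware:
        # Firmware already dealt with it: log it so the OS takes no action.
        return ErrorLogEntry(error_id, CORRECTED, False)
    return ErrorLogEntry(error_id, RECOVERABLE, True)
```

An operating system reading such an entry would only need to check `os_action_required` before deciding whether to run its own recovery path.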
-
Publication No.: US08839032B2
Publication date: 2014-09-16
Application No.: US13258392
Filing date: 2009-12-08
IPC classification: G06F11/07
CPC classification: G06F11/0784, G06F11/0712, G06F11/079, G06F11/0793
Abstract: A method of managing errors in a data processing system (10) may involve at least one computer system (14). Each computer system (14) may include a plurality of hardware components (18), including a processor (20) for executing a respective operating system and a memory (22) for storing instructions for the respective operating system (24), and firmware (28) including a firmware error handler (30). For each computer system (14), the firmware error handler (30) may identify an error occurring in one of the hardware components (18). Each respective firmware error handler (30) may communicate error information about the identified error to an error manager (32) external to the computer system (14). The error manager (32) may compile the error information communicated from each respective firmware error handler (30).
-
Publication No.: US08151147B2
Publication date: 2012-04-03
Application No.: US12640971
Filing date: 2009-12-17
IPC classification: G06F11/00
CPC classification: G06F11/0793, G06F11/0709
Abstract: In accordance with at least some embodiments, a system comprises a plurality of partitions, each partition having its own error handler. The system further comprises a plurality of resources assignable to the plurality of partitions. The system further comprises management logic coupled to the plurality of partitions and the plurality of resources. The management logic comprises an error management tool that synchronizes operation of the error handlers in response to an error.
-
Publication No.: US20110154128A1
Publication date: 2011-06-23
Application No.: US12640971
Filing date: 2009-12-17
IPC classification: G06F11/07
CPC classification: G06F11/0793, G06F11/0709
Abstract: In accordance with at least some embodiments, a system comprises a plurality of partitions, each partition having its own error handler. The system further comprises a plurality of resources assignable to the plurality of partitions. The system further comprises management logic coupled to the plurality of partitions and the plurality of resources. The management logic comprises an error management tool that synchronizes operation of the error handlers in response to an error.