-
Publication No.: US11360842B2
Publication date: 2022-06-14
Application No.: US17187111
Filing date: 2021-02-26
Inventor: Gang Song
Abstract: In a fault processing method, when it is determined that a computer has crashed, a baseboard management controller in the computer can send a read request message to a processor in the computer, where the read request message requests reading of first error data recorded by the processor, receive a read response message returned by the processor, and obtain, according to the read response message, the first error data recorded by the processor.
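
The request/response exchange described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class names, the register name `MC0_STATUS`, and the in-process method call standing in for the physical link between the baseboard management controller and the processor are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ReadRequest:
    register: str  # hypothetical name of the error register to read


@dataclass(frozen=True)
class ReadResponse:
    register: str
    value: Optional[int]  # None when the processor has no such register


class Processor:
    """Simulated processor that keeps recorded error data in named registers."""

    def __init__(self, error_registers: Dict[str, int]) -> None:
        self._error_registers = dict(error_registers)

    def handle(self, request: ReadRequest) -> ReadResponse:
        # Answer a read request with the recorded value, if any.
        value = self._error_registers.get(request.register)
        return ReadResponse(request.register, value)


class BaseboardManagementController:
    """Simulated BMC that reads error data without involving an operating system."""

    def __init__(self, processor: Processor) -> None:
        self._processor = processor

    def collect_error_data(self, crashed: bool, registers: Iterable[str]) -> Dict[str, int]:
        # Error data is only collected once a crash has been determined.
        if not crashed:
            return {}
        collected: Dict[str, int] = {}
        for name in registers:
            response = self._processor.handle(ReadRequest(name))
            if response.value is not None:
                collected[response.register] = response.value
        return collected


cpu = Processor({"MC0_STATUS": 0xBEEF})
bmc = BaseboardManagementController(cpu)
error_data = bmc.collect_error_data(crashed=True, registers=["MC0_STATUS", "MISSING"])
```

Here `error_data` holds only the register the simulated processor actually recorded; a real controller would issue these reads over a hardware sideband interface rather than a method call.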
-
Publication No.: US20190332453A1
Publication date: 2019-10-31
Application No.: US16509218
Filing date: 2019-07-11
Inventor: Gang Song
IPC classification: G06F11/07
Abstract: A fault processing method, a related apparatus, and a computer are provided. When it is determined that a computer has crashed, a baseboard management controller in the computer can send a read request message to a processor in the computer, where the read request message requests reading of error data recorded by the processor, receive a read response message returned by the processor, and obtain, according to the read response message, the error data recorded by the processor. In these embodiments, the baseboard management controller acquires the error data after the computer crashes without using an operating system. This resolves the prior-art problem that error data in a computer cannot be acquired after a severe uncorrectable error causes a system crash.
-
Publication No.: US10353763B2
Publication date: 2019-07-16
Application No.: US15385701
Filing date: 2016-12-20
Inventor: Gang Song
Abstract: A fault processing method, a related apparatus, and a computer are provided. When it is determined that a computer has crashed, a baseboard management controller in the computer can send a read request message to a processor in the computer, where the read request message requests reading of first error data recorded by the processor, receive a read response message returned by the processor, and obtain, according to the read response message, the first error data recorded by the processor. In these embodiments, the baseboard management controller acquires the error data after the computer crashes without using an operating system. This resolves the prior-art problem that error data in a computer cannot be acquired after a severe uncorrectable error causes a system crash.
-
Publication No.: US11119874B2
Publication date: 2021-09-14
Application No.: US16748274
Filing date: 2020-01-21
Inventors: Gang Song, Chengguo Ding, Fei Zhang
IPC classification: G06F11/22
Abstract: A memory fault detection method includes: receiving a first interrupt signal sent when a count value of a first leaky bucket counter of a server reaches a first threshold; disabling an interrupt switch of the first leaky bucket counter; enabling the interrupt switch of the first leaky bucket counter after the interrupt switch has been disabled for a preset time and the count value of the first leaky bucket counter has been reset to zero; receiving a second interrupt signal sent when a count value of a second leaky bucket counter reaches a second threshold; and, if the second leaky bucket counter and the first leaky bucket counter are the same leaky bucket counter and the second rank and the first rank are the same rank, determining that a hardware fault has occurred in the first rank.
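
The interrupt sequence in this abstract can be sketched as a small state machine. The following is a simplified illustration only: the class names, the `quiet_period` parameter, and the explicit `now` timestamps are hypothetical, the leak rate of the bucket is reduced to a manual `leak` call, and the rank is passed in directly because the abstract does not say how it is identified.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class LeakyBucketCounter:
    counter_id: int
    threshold: int
    count: int = 0
    interrupt_enabled: bool = True

    def record_error(self) -> bool:
        """Count one error; return True if an interrupt signal is sent."""
        self.count += 1
        return self.interrupt_enabled and self.count >= self.threshold

    def leak(self, amount: int = 1) -> None:
        """Drain the bucket over time (never below zero)."""
        self.count = max(0, self.count - amount)


class MemoryFaultDetector:
    def __init__(self, quiet_period: float) -> None:
        self.quiet_period = quiet_period  # the "preset time" with interrupts off
        # counter_id -> (rank reported by the earlier interrupt, time it arrived)
        self._pending: Dict[int, Tuple[int, float]] = {}

    def on_interrupt(self, counter: LeakyBucketCounter, rank: int, now: float) -> bool:
        """Handle an interrupt; return True if a hardware fault is determined."""
        previous = self._pending.get(counter.counter_id)
        if previous is not None and previous[0] == rank:
            return True  # same counter and same rank as the first interrupt
        counter.interrupt_enabled = False
        self._pending[counter.counter_id] = (rank, now)
        return False

    def tick(self, counter: LeakyBucketCounter, now: float) -> None:
        """Reset the count and re-enable the interrupt after the preset time."""
        previous = self._pending.get(counter.counter_id)
        if (previous is not None and not counter.interrupt_enabled
                and now - previous[1] >= self.quiet_period):
            counter.count = 0
            counter.interrupt_enabled = True


counter = LeakyBucketCounter(counter_id=0, threshold=3)
detector = MemoryFaultDetector(quiet_period=10.0)

first_burst = [counter.record_error() for _ in range(3)]
first_result = detector.on_interrupt(counter, rank=1, now=0.0)
detector.tick(counter, now=10.0)
second_burst = [counter.record_error() for _ in range(3)]
second_result = detector.on_interrupt(counter, rank=1, now=11.0)
```

In this run the first interrupt only disables the interrupt switch, while the second interrupt from the same counter and the same rank is treated as a hardware fault in that rank.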
-
Publication No.: US20200159635A1
Publication date: 2020-05-21
Application No.: US16748274
Filing date: 2020-01-21
Inventors: Gang Song, Chengguo Ding, Fei Zhang
IPC classification: G06F11/22
Abstract: A memory fault detection method includes: receiving a first interrupt signal sent when a count value of a first leaky bucket counter of a server reaches a first threshold; disabling an interrupt switch of the first leaky bucket counter; enabling the interrupt switch of the first leaky bucket counter after the interrupt switch has been disabled for a preset time and the count value of the first leaky bucket counter has been reset to zero; receiving a second interrupt signal sent when a count value of a second leaky bucket counter reaches a second threshold; and, if the second leaky bucket counter and the first leaky bucket counter are the same leaky bucket counter and the second rank and the first rank are the same rank, determining that a hardware fault has occurred in the first rank.
-
Publication No.: US20170102985A1
Publication date: 2017-04-13
Application No.: US15385701
Filing date: 2016-12-20
Inventor: Gang Song
IPC classification: G06F11/07
CPC classification: G06F11/079, G06F11/0706, G06F11/0751, G06F11/0772, G06F11/0778, G06F11/0793
Abstract: A fault processing method, a related apparatus, and a computer are provided. When it is determined that a computer has crashed, a baseboard management controller in the computer can send a read request message to a processor in the computer, where the read request message requests reading of first error data recorded by the processor, receive a read response message returned by the processor, and obtain, according to the read response message, the first error data recorded by the processor. In these embodiments, the baseboard management controller acquires the error data after the computer crashes without using an operating system. This resolves the prior-art problem that error data in a computer cannot be acquired after a severe uncorrectable error causes a system crash.
-
Publication No.: US20210182136A1
Publication date: 2021-06-17
Application No.: US17187111
Filing date: 2021-02-26
Inventor: Gang Song
IPC classification: G06F11/07
Abstract: In a fault processing method, when it is determined that a computer has crashed, a baseboard management controller in the computer can send a read request message to a processor in the computer, where the read request message requests reading of first error data recorded by the processor, receive a read response message returned by the processor, and obtain, according to the read response message, the first error data recorded by the processor.
-
Publication No.: US10430260B2
Publication date: 2019-10-01
Application No.: US15709824
Filing date: 2017-09-20
Inventor: Gang Song
IPC classification: G06F11/00, G06F11/07, G06F9/48, G06F13/24, G06F9/4401
Abstract: A troubleshooting method implemented by a processor device is provided. The method comprises: determining, according to collected information about correctable errors, that a correctable error storm has occurred; disabling a system management interrupt (SMI) of the generation modules of the correctable errors in a correctable error set, wherein the correctable error set comprises the correctable errors related to the correctable error storm; sending SMI-disabled notification information to a baseboard management controller (BMC); receiving enable-SMI notification information that the BMC sends after a predetermined time has elapsed since it received the SMI-disabled notification information; and enabling the disabled SMI of the generation modules of the correctable errors according to the enable-SMI notification information.
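
The handshake between the processor device and the BMC described in this abstract can be sketched as follows. This is a minimal simulation under stated assumptions: the class names, the per-module count threshold used to decide that a storm has occurred, the module names `dimm0` and `dimm1`, and the explicit `now` timestamps are all hypothetical; the abstract does not specify how a storm is detected.

```python
from collections import Counter
from typing import Iterable, Optional, Set


class Bmc:
    """Simulated BMC that answers an SMI-disabled notification after a delay."""

    def __init__(self, delay: float) -> None:
        self.delay = delay  # the "predetermined time"
        self._disabled_at: Optional[float] = None

    def notify_smi_disabled(self, now: float) -> None:
        self._disabled_at = now

    def poll(self, now: float) -> bool:
        """Return True once, when the enable-SMI notification is due."""
        if self._disabled_at is not None and now - self._disabled_at >= self.delay:
            self._disabled_at = None
            return True
        return False


class ProcessorDevice:
    def __init__(self, bmc: Bmc, storm_threshold: int) -> None:
        self.bmc = bmc
        self.storm_threshold = storm_threshold  # assumed storm criterion
        self.smi_disabled: Set[str] = set()

    def process_errors(self, sources: Iterable[str], now: float) -> Set[str]:
        """Detect a storm, disable SMI for the storming modules, notify the BMC."""
        counts = Counter(sources)
        storm = {module for module, n in counts.items() if n >= self.storm_threshold}
        if storm:
            self.smi_disabled |= storm
            self.bmc.notify_smi_disabled(now)
        return storm

    def on_enable_smi_notification(self) -> None:
        """Re-enable SMI for every module that was disabled."""
        self.smi_disabled.clear()


bmc = Bmc(delay=60.0)
device = ProcessorDevice(bmc, storm_threshold=3)
storm = device.process_errors(["dimm0", "dimm0", "dimm0", "dimm1"], now=0.0)
disabled_during_storm = set(device.smi_disabled)
too_early = bmc.poll(now=30.0)
due = bmc.poll(now=60.0)
if due:
    device.on_enable_smi_notification()
```

Only the module that crossed the assumed threshold has its SMI disabled, and the interrupt stays off until the simulated BMC reports that the predetermined time has elapsed.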