-
Publication No.: US20220326860A1
Publication date: 2022-10-13
Application No.: US17850455
Filing date: 2022-06-27
Applicant: Intel Corporation
Inventor: Jun LI, Subhankar PANDA, Gaurav PORWAL, Feiting WANYAN
IPC: G06F3/06
Abstract: A dedicated bank-based error counter is provided for a respective bank of a Dynamic Random Access Memory (DRAM). The dedicated bank-based error counter for the bank is stored in memory resources. A Basic Input/Output System (BIOS) System Management Interrupt (SMI) handler triggers Adaptive Double Device Data Correction (ADDDC) bank sparing if the error count for the respective bank equals or exceeds a per bank ADDDC threshold.
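
The per-bank counting and threshold check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `BankErrorCounters`, the tuple used to identify a bank, and the threshold value are all assumptions made for illustration, and the return value merely stands in for the point at which a BIOS SMI handler would trigger ADDDC bank sparing.

```python
from collections import defaultdict


class BankErrorCounters:
    """Hypothetical per-bank error counters (illustrative names only)."""

    def __init__(self, threshold):
        # Assumed per-bank ADDDC threshold; the abstract gives no value.
        self.threshold = threshold
        self.counts = defaultdict(int)  # one dedicated counter per bank
        self.spared = set()             # banks for which sparing was triggered

    def record_error(self, bank):
        """Count one error for `bank`.

        Returns True only on the call where the count first equals or
        exceeds the threshold, i.e. where sparing would be triggered.
        """
        if bank in self.spared:
            return False
        self.counts[bank] += 1
        if self.counts[bank] >= self.threshold:
            self.spared.add(bank)
            return True
        return False


# Usage: banks are identified here by an assumed (channel, rank, bank) tuple.
counters = BankErrorCounters(threshold=3)
bank_a = (0, 0, 5)
results = [counters.record_error(bank_a) for _ in range(3)]
```

Keeping one counter per bank means errors spread across many banks do not add up to a single trigger; only a bank whose own count reaches the threshold is spared.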
-
Publication No.: US20220229714A1
Publication date: 2022-07-21
Application No.: US17516584
Filing date: 2021-11-01
Applicant: Intel Corporation
Inventor: Gaurav PORWAL, Subhankar PANDA, John G. HOLM
IPC: G06F11/07
Abstract: Upon occurrence of multiple errors in a central processing unit (CPU) package, data indicating the errors is stored in machine check (MC) banks. A timestamp corresponding to each error is stored, the timestamp indicating a time of occurrence for each error. A machine check exception (MCE) handler is generated to address the errors based on the timestamps. The timestamps can be stored in the MC banks or in a utility box (U-Box). The MCE handler can then address the errors based on order of occurrence, for example by determining that the first error in time causes the remaining errors. The MCE handler can isolate hardware/software associated with the first error to recover from a failure. The MCE handler can report only the first error to the operating system (OS) or other error management software/hardware. The U-Box may also convert the timestamps into real time to support user debugging.
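
The ordering step in this abstract, handling errors by time of occurrence and treating the earliest one as the likely root cause, can be sketched as follows. This is an illustration under assumptions, not the patented handler: the `McBankRecord` type, its fields, and the sample values are hypothetical, and the timestamp is treated as an opaque monotonically increasing counter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class McBankRecord:
    """Hypothetical snapshot of one machine check bank entry."""
    bank: int       # MC bank index (illustrative)
    status: int     # raw status value (illustrative, not decoded here)
    timestamp: int  # assumed monotonic counter value at time of error


def order_by_occurrence(records):
    """Return the records sorted from earliest to latest timestamp."""
    return sorted(records, key=lambda r: r.timestamp)


def first_error(records):
    """Return the earliest record, or None if there are no records."""
    ordered = order_by_occurrence(records)
    return ordered[0] if ordered else None


# Usage: three errors logged in different banks, out of order.
logged = [
    McBankRecord(bank=4, status=0xB2, timestamp=1050),
    McBankRecord(bank=1, status=0xA1, timestamp=1002),
    McBankRecord(bank=9, status=0xC3, timestamp=1017),
]
root = first_error(logged)
```

A handler following the abstract could then report only `root` to the OS and treat the later records as consequences of it.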
-
Publication No.: US20240211344A1
Publication date: 2024-06-27
Application No.: US18025868
Filing date: 2020-09-26
Applicant: Intel Corporation
Inventor: Kuljit S. BAINS, Kjersten E. CRISS, Rajat AGARWAL, Omar AVELAR SUAREZ, Subhankar PANDA, Theodros YIGZAW, Rebecca Z. LOOP, John G. HOLM, Gaurav PORWAL
CPC classification number: G06F11/106, G11C29/02
Abstract: A memory subsystem with error checking and scrubbing (ECS) logic on-device on the memory can adapt the rate of ECS operations in response to detection of errors in the memory when the memory device is in automatic ECS mode. The ECS logic can include an indication of rows of memory that have been offlined by the host. The ECS logic can skip the offlined rows in ECS operation counts. The ECS logic can include requests or hints by the host to have ECS operations performed. An internal address generator of the ECS logic can select between generated addresses and the hints. The system can allow a memory controller to detect multibit errors (MBEs) related to a specific address of the associated memory. When the detected MBEs indicate a pattern of errors, the memory controller triggers a row hammer response for the specific address.
-
Publication No.: US20230205626A1
Publication date: 2023-06-29
Application No.: US18116785
Filing date: 2023-03-02
Applicant: Intel Corporation
CPC classification number: G06F11/1064, G06F11/0787, G06F11/076
Abstract: Multilevel memory error management techniques can improve system performance, availability, and reliability by preventing future accesses to faulty near memory locations. According to examples described herein, multilevel memory error management techniques enable proactively offlining far memory locations mapped to a faulty near memory location before additional faults are encountered, and/or maintaining a faulty near memory location list to enable bypassing the faulty near memory location to prevent future errors.
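
The two techniques named in this abstract, keeping a faulty near memory location list and proactively offlining the far memory locations mapped to a faulty near location, can be sketched as below. The abstract does not say how far memory maps onto near memory, so the sketch assumes a simple direct-mapped arrangement (far block modulo the number of near blocks); that mapping, the class name `NearMemoryFaultTracker`, and the block counts are assumptions for illustration only.

```python
class NearMemoryFaultTracker:
    """Hypothetical tracker; assumes a direct-mapped near memory."""

    def __init__(self, num_near_blocks, num_far_blocks):
        self.num_near_blocks = num_near_blocks
        self.num_far_blocks = num_far_blocks
        self.faulty_near = set()   # faulty near memory location list
        self.offlined_far = set()  # far blocks proactively offlined

    def near_index(self, far_block):
        # Assumed mapping: far block -> near block by modulo.
        return far_block % self.num_near_blocks

    def report_near_fault(self, near_index):
        """Record a faulty near block and offline every far block mapped to it."""
        if not 0 <= near_index < self.num_near_blocks:
            raise ValueError("near_index out of range")
        self.faulty_near.add(near_index)
        mapped = list(range(near_index, self.num_far_blocks, self.num_near_blocks))
        self.offlined_far.update(mapped)
        return mapped

    def should_bypass(self, far_block):
        """True if an access to `far_block` would land on a faulty near block."""
        return self.near_index(far_block) in self.faulty_near


# Usage: 4 near blocks caching 12 far blocks; near block 1 is found faulty.
tracker = NearMemoryFaultTracker(num_near_blocks=4, num_far_blocks=12)
offlined = tracker.report_near_fault(1)
```

In a real system either technique might be used alone; the sketch applies both only to show how each relates to the same faulty location.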
-
Publication No.: US20200301773A1
Publication date: 2020-09-24
Application No.: US16866485
Filing date: 2020-05-04
Applicant: Intel Corporation
Inventor: Gaurav PORWAL, Subhankar PANDA, John G. HOLM
IPC: G06F11/07
Abstract: Upon occurrence of multiple errors in a central processing unit (CPU) package, data indicating the errors is stored in machine check (MC) banks. A timestamp corresponding to each error is stored, the timestamp indicating a time of occurrence for each error. A machine check exception (MCE) handler is generated to address the errors based on the timestamps. The timestamps can be stored in the MC banks or in a utility box (U-Box). The MCE handler can then address the errors based on order of occurrence, for example by determining that the first error in time causes the remaining errors. The MCE handler can isolate hardware/software associated with the first error to recover from a failure. The MCE handler can report only the first error to the operating system (OS) or other error management software/hardware. The U-Box may also convert the timestamps into real time to support user debugging.