-
Publication No.: US20220229714A1
Publication date: 2022-07-21
Application No.: US17516584
Application date: 2021-11-01
Applicant: Intel Corporation
Inventor: Gaurav PORWAL, Subhankar PANDA, John G. HOLM
IPC: G06F11/07
Abstract: Upon occurrence of multiple errors in a central processing unit (CPU) package, data indicating the errors is stored in machine check (MC) banks. A timestamp corresponding to each error is stored, the timestamp indicating a time of occurrence for each error. A machine check exception (MCE) handler is generated to address the errors based on the timestamps. The timestamps can be stored in the MC banks or in a utility box (U-box). The MCE handler can then address the errors based on order of occurrence, for example by determining that the first error in time caused the remaining errors. The MCE handler can isolate hardware/software associated with the first error to recover from a failure. The MCE handler can report only the first error to the operating system (OS) or other error management software/hardware. The U-box may also convert the timestamps into real time to support user debugging.
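The ordering step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the record type `MachineCheckRecord`, its fields, and the sample values are hypothetical names chosen for illustration, and the timestamp is assumed to be a monotonically increasing counter value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineCheckRecord:
    """Hypothetical view of one logged error (not a real register layout)."""
    bank: int       # assumed MC bank index
    status: int     # assumed raw status value
    timestamp: int  # assumed counter value captured when the error occurred


def order_by_occurrence(records):
    """Return the records sorted oldest first; ties fall back to bank index."""
    return sorted(records, key=lambda r: (r.timestamp, r.bank))


def first_error(records):
    """Return the earliest record, treated here as the probable root cause."""
    ordered = order_by_occurrence(records)
    return ordered[0] if ordered else None


# Made-up sample data: three banks logged errors at different times.
records = [
    MachineCheckRecord(bank=4, status=0xB2, timestamp=1050),
    MachineCheckRecord(bank=1, status=0x90, timestamp=1007),
    MachineCheckRecord(bank=7, status=0xA1, timestamp=1031),
]
root = first_error(records)
```

Reporting only `root` to the OS corresponds to the "report only the first error" option in the abstract; the later records would be treated as consequences of it.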
-
Publication No.: US20240211344A1
Publication date: 2024-06-27
Application No.: US18025868
Application date: 2020-09-26
Applicant: Intel Corporation
Inventor: Kuljit S. BAINS, Kjersten E. CRISS, Rajat AGARWAL, Omar AVELAR SUAREZ, Subhankar PANDA, Theodros YIGZAW, Rebecca Z. LOOP, John G. HOLM, Gaurav PORWAL
CPC classification number: G06F11/106, G11C29/02
Abstract: A memory subsystem with error checking and scrubbing (ECS) logic on-device on the memory can adapt the rate of ECS operations in response to detection of errors in the memory when the memory device is in automatic ECS mode. The ECS logic can include an indication of rows of memory that have been offlined by the host. The ECS logic can skip the offlined rows in ECS operation counts. The ECS logic can include requests or hints by the host to have ECS operations performed. An internal address generator of the ECS logic can select between generated addresses and the hints. The system can allow a memory controller to detect multibit errors (MBEs) related to a specific address of the associated memory. When the detected MBEs indicate a pattern of errors, the memory controller triggers a row hammer response for the specific address.
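As a rough illustration of two points in this abstract (skipping offlined rows and choosing between internally generated addresses and host hints), the following Python sketch models an address generator in software. The class `EcsAddressGenerator`, its methods, and the policy that hints always take priority are assumptions made for the example; the abstract does not specify them.

```python
from collections import deque


class EcsAddressGenerator:
    """Toy model of an ECS row selector (hypothetical, not device logic)."""

    def __init__(self, num_rows, offlined=()):
        self.num_rows = num_rows
        self.offlined = set(offlined)  # rows the host has taken offline
        self.hints = deque()           # rows the host asked to have scrubbed
        self.next_row = 0              # position of the internal counter
        self.scrubbed_count = 0        # counts only rows actually scrubbed

    def add_hint(self, row):
        """Queue a host hint, ignoring invalid or offlined rows."""
        if 0 <= row < self.num_rows and row not in self.offlined:
            self.hints.append(row)

    def _next_generated(self):
        """Advance the internal counter past any offlined rows."""
        for _ in range(self.num_rows):
            row = self.next_row
            self.next_row = (self.next_row + 1) % self.num_rows
            if row not in self.offlined:
                return row
        return None  # every row is offlined

    def next_address(self):
        """Pick the next row: a pending hint first, else a generated row."""
        row = self.hints.popleft() if self.hints else self._next_generated()
        if row is not None:
            self.scrubbed_count += 1
        return row


gen = EcsAddressGenerator(num_rows=4, offlined={1})
gen.add_hint(3)
sequence = [gen.next_address() for _ in range(4)]
```

Because offlined rows never reach `next_address`, they do not contribute to `scrubbed_count`, which mirrors the statement that offlined rows are skipped in ECS operation counts.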
-
Publication No.: US20240061741A1
Publication date: 2024-02-22
Application No.: US18268956
Application date: 2020-12-26
Applicant: Intel Corporation
Inventor: Rajat AGARWAL, Hsing-Min CHEN, Wei P. CHEN, Wei WU, Jing LING, Kuljit S. BAINS, Kjersten E. CRISS, Deep K. BUCH, Theodros YIGZAW, John G. HOLM, Andrew M. RUDOFF, Vaibhav SINGH, Sreenivas MANDAVA
IPC: G06F11/10
CPC classification number: G06F11/10
Abstract: A memory subsystem includes memory devices with space dynamically allocated for improvement of reliability, availability, and serviceability (RAS) in the system. Error checking and correction (ECC) logic detects an error in all or a portion of a memory device. In response to error detection, the system can dynamically perform one or more of: allocate active memory device space for sparing to spare a failed memory segment; write a poison pattern into a failed cacheline to mark it as failed; perform permanent fault detection (PFD) and adjust application of ECC based on PFD detection; or, spare only a portion of a device and leave another portion active, including adjusting ECC based on the spared portion. The error detection can be based on bits of an ECC device, and error correction based on those bits and additional bits stored on the data devices.
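One of the listed responses, writing a poison pattern into a failed cacheline, can be sketched in Python as follows. This is only a software analogy under stated assumptions: the 64-byte line size, the `POISON_PATTERN` value, and the names `CachelineStore` and `PoisonedLineError` are hypothetical, and a real design would need a way to distinguish poison from ordinary data that happens to match the pattern.

```python
CACHELINE_BYTES = 64  # assumed line size
# Hypothetical marker value; the abstract does not define the pattern.
POISON_PATTERN = b"\xde\xad\xbe\xef" * (CACHELINE_BYTES // 4)


class PoisonedLineError(Exception):
    """Raised when a read hits a line that was marked as failed."""


class CachelineStore:
    """Toy memory made of fixed-size cachelines."""

    def __init__(self, num_lines):
        self.lines = [bytes(CACHELINE_BYTES) for _ in range(num_lines)]

    def write(self, index, data):
        if len(data) != CACHELINE_BYTES:
            raise ValueError("data must be exactly one cacheline long")
        self.lines[index] = bytes(data)

    def mark_failed(self, index):
        """Overwrite the line with the poison pattern."""
        self.lines[index] = POISON_PATTERN

    def read(self, index):
        """Return the line, or raise if it carries the poison pattern."""
        data = self.lines[index]
        if data == POISON_PATTERN:
            raise PoisonedLineError(f"cacheline {index} is marked as failed")
        return data


store = CachelineStore(num_lines=2)
store.mark_failed(1)
```

A consumer that reads line 1 gets an explicit error instead of silently corrupted data, which is the purpose of marking the line.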
-
Publication No.: US20220350500A1
Publication date: 2022-11-03
Application No.: US17855688
Application date: 2022-06-30
Applicant: Intel Corporation
Inventor: Wei P. CHEN, Theodros YIGZAW, Sarathy JAYAKUMAR, Anthony LUCK, Deep K. BUCH, Rajat AGARWAL, Kuljit S. BAINS, John G. HOLM, Brent CHARTRAND, Keith KLAYMAN
IPC: G06F3/06
Abstract: An apparatus is described. The apparatus includes a processor. The processor includes a memory controller to read and write from a memory. The memory controller includes error correction coding (ECC) circuitry to correct errors in data read from the memory. The processor includes register space to track read data error information. The processor includes an embedded controller. The processor includes local memory coupled to the embedded controller. The embedded controller is to read the read data error information and store the read data error information in the local memory.
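The data flow in this abstract (error information captured in register space, then copied by an embedded controller into its local memory) might look like the Python sketch below. All names (`ErrorRegisters`, `EmbeddedController`, `poll`) and the bounded-log behavior are assumptions for illustration; the abstract does not describe how the controller is triggered or how local memory is managed.

```python
from collections import deque


class ErrorRegisters:
    """Hypothetical register space that tracks read-data error information."""

    def __init__(self):
        self._pending = []

    def record(self, address, corrected_bits):
        """Called by the (simulated) ECC circuitry when it corrects a read."""
        self._pending.append(
            {"address": address, "corrected_bits": corrected_bits}
        )

    def drain(self):
        """Return all pending entries and clear the registers."""
        entries, self._pending = self._pending, []
        return entries


class EmbeddedController:
    """Copies error information from the registers into local memory."""

    def __init__(self, registers, capacity=1024):
        self.registers = registers
        # Assumed policy: a bounded log that drops the oldest entries.
        self.local_memory = deque(maxlen=capacity)

    def poll(self):
        """Read the registers once and store what was found."""
        entries = self.registers.drain()
        self.local_memory.extend(entries)
        return len(entries)


regs = ErrorRegisters()
ctrl = EmbeddedController(regs, capacity=2)
regs.record(address=0x1000, corrected_bits=1)
regs.record(address=0x2040, corrected_bits=1)
copied = ctrl.poll()
```

Keeping the history in the controller's local memory means the register space only has to hold the most recent events between polls.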
-
Publication No.: US20200301773A1
Publication date: 2020-09-24
Application No.: US16866485
Application date: 2020-05-04
Applicant: Intel Corporation
Inventor: Gaurav PORWAL, Subhankar PANDA, John G. HOLM
IPC: G06F11/07
Abstract: Upon occurrence of multiple errors in a central processing unit (CPU) package, data indicating the errors is stored in machine check (MC) banks. A timestamp corresponding to each error is stored, the timestamp indicating a time of occurrence for each error. A machine check exception (MCE) handler is generated to address the errors based on the timestamps. The timestamps can be stored in the MC banks or in a utility box (U-box). The MCE handler can then address the errors based on order of occurrence, for example by determining that the first error in time caused the remaining errors. The MCE handler can isolate hardware/software associated with the first error to recover from a failure. The MCE handler can report only the first error to the operating system (OS) or other error management software/hardware. The U-box may also convert the timestamps into real time to support user debugging.
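The final sentence of this abstract, converting stored timestamps into real time, can be sketched in Python under two assumptions that the abstract does not state: the timestamp is a counter that ticks at a known fixed frequency, and one reference pair of (counter value, wall-clock time) is available. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ticks_to_datetime(ticks, ref_ticks, ref_time, tick_hz):
    """Convert a counter value to wall-clock time using a reference pair.

    ticks     -- counter value stored with the error (assumed format)
    ref_ticks -- counter value sampled at the reference moment
    ref_time  -- wall-clock time at the reference moment
    tick_hz   -- assumed counter frequency in ticks per second
    """
    delta_seconds = (ticks - ref_ticks) / tick_hz
    return ref_time + timedelta(seconds=delta_seconds)


# Made-up values: a 1 MHz counter sampled at noon UTC.
ref_time = datetime(2021, 11, 1, 12, 0, 0, tzinfo=timezone.utc)
when = ticks_to_datetime(
    ticks=2_500_000,
    ref_ticks=1_000_000,
    ref_time=ref_time,
    tick_hz=1_000_000,
)
```

With these sample values the error lands 1.5 seconds after the reference moment, which is the kind of human-readable result a user would want while debugging.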