摘要:
One embodiment provides an error detection method wherein single-bit errors in a memory module are detected and identified as being a random error or a repeat error. Each identified random error and each identified repeat error occurring in a time interval is counted. An alert is generated in response to a number of identified random errors reaching a random-error threshold or a number of identified repeat errors reaching a repeat-error threshold during the predefined interval. The repeat-error threshold is set lower than the random-error threshold. A hashing process may be applied to the memory address of each detected error to map the location of the error in the memory system to a corresponding location in an electronic table.
摘要:
One embodiment provides an error detection method wherein single-bit errors in a memory module are detected and identified as being a random error or a repeat error. Each identified random error and each identified repeat error occurring in a time interval is counted. An alert is generated in response to a number of identified random errors reaching a random-error threshold or a number of identified repeat errors reaching a repeat-error threshold during the predefined interval. The repeat-error threshold is set lower than the random-error threshold. A hashing process may be applied to the memory address of each detected error to map the location of the error in the memory system to a corresponding location in an electronic table.
摘要:
One embodiment provides a method for scalable predictive failure analysis. Embodiments of the method may include gathering memory information for memory on a user computer system having at least one processor. Further, the method includes selecting one or more memory-related parameters. Further still, the method includes calculating based on the gathering and the selecting, a single bit error value for the scalable predictive failure analysis through calculations for each of the one or more memory-related parameters that utilize the memory information. Yet further, the method includes setting, based on the calculating, the single bit error value for the user computer system.