摘要:
A semiconductor memory circuit enabling replacement of defective memory elements and associated circuitry with non-defective spare elements of the RAM and associated circuitry, is scanned to enable replacement of a defective RAM element prior to repair of the RAM. A set of set/reset latches are coupled to receive the signal from the memory elements, and a multiplexer control circuit coupled to receive a shift signal and a ram_inhibit signal from a multiplexer to provide specific input signals to the multiplexer components. When a scan operation begins an active clock signal sets a set/reset latch to ram_inhibit mode and this blocks the memory elements from influencing the state of memory output latches, whereby when an memory operation begins, an active clocking signal will reset the set/reset latch into system mode to cause the multiplexers pass appropriate signals from the memory elements to the output latches, and the spare memory element is activated to replace a defective memory element.
摘要:
A semiconductor memory circuit enabling replacement of defective memory elements and associated circuitry with non-defective spare elements of the RAM and associated circuitry, is scanned to enable replacement of a defective RAM element prior to repair of the RAM. A set of set/reset latches are coupled to receive the signal from the memory elements, and a multiplexer control circuit coupled to receive a shift signal and a ram_inhibit signal from a multiplexer to provide specific input signals to the multiplexer components. When a scan operation begins an active clock signal sets a set/reset latch to ram_inhibit mode and this blocks the memory elements from influencing the state of memory output latches, whereby when an memory operation begins, an active clocking signal will reset the set/reset latch into system mode to cause the multiplexers pass appropriate signals from the memory elements to the output latches, and the spare memory element is activated to replace a defective memory element.
摘要:
Embodiments relate to a method including detecting a first error when reading a first cache line, recording a first address of the first error, detecting a second error when reading a second cache line and recording a second address of the second error. Embodiments also include comparing the first and second bitline address, comparing the first and second wordline address, activating a bitline delete mode based on matching first and second bitline addresses and not matching the first and second wordline addresses, detecting a third error when reading a third cache line, recording a third bitline address of the third error, comparing the second bitline address to a third bitline address and deleting a location corresponding to the third cache line from available cache locations based on the activated bitline delete mode and the third bitline address matching the second bitline address.
摘要:
Dynamic graduated memory device protection in redundant array of independent memory (RAIM) systems that include a plurality of memory devices is provided. A first severity level of a first failing memory device in the plurality of memory devices is determined. The first failing memory device is associated with an identifier used to communicate a location of the first failing memory device to an error correction code (ECC). A second severity level of a second failing memory device in the plurality of memory devices is determined. It is determined that the second severity level is higher than the first severity level. The identifier from the first failing memory device is removed based on determining that the second severity level is higher than the first severity level. The identifier is applied to the second failing memory device based on determining that the second severity level is higher than the first severity level.
摘要:
A computer-implemented method for verifying a RAIM/ECC design using a hierarchical injection scheme that includes selecting marks for generating an error mask, selecting a fixed bit mask based on the selected marks, determining whether to inject errors into at least one of a marked channel and at least one marked chip of a channel; and randomly injecting errors into the at least one of the marked channel and the at least one marked chip when determined.
摘要:
Error correction and detection in a redundant memory system that includes a memory controller; a plurality of memory channels in communication with the memory controller, the memory channels including a plurality of memory devices; a cyclical redundancy code (CRC) mechanism for detecting that one of the memory channels has failed, and for marking the memory channel as a failing memory channel; and an error correction code (ECC) mechanism. The ECC is configured for ignoring the marked memory channel and for detecting and correcting additional memory device failures on memory devices located on one or more of the other memory channels, thereby allowing the memory system to continue to run unimpaired in the presence of the memory channel failure.
摘要:
Providing homogeneous recovery in a redundant memory system that includes a memory controller, a plurality of memory channels in communication with the memory controller, an error detection code mechanism configured for detecting a failing memory channel, and an error recovery mechanism. The error recovery mechanism is configured for receiving notification of the failing memory channel, for blocking off new operations from starting on the memory channels, for completing any pending operations on the memory channels, for performing a recovery operation on the memory channels and for starting the new operations on at least a first subset of the memory channels. The memory system is capable of operating with the first subset of the memory channels.
摘要:
Providing heterogeneous recovery in a redundant memory system that includes a memory controller, a plurality of memory channels in communication with the memory controller, an error detection code mechanism configured for detecting a failing memory channel, and an error recovery mechanism. The error recovery mechanism is configured for receiving notification of the failing memory channel, for performing a recovery operation on the failing memory channel while other memory channels are performing normal system operations, for bringing the recovered channel back into operational mode with the other memory channels for store operations, for continuing to mark the recovered channel to guard against stale data, for removing any stale data after the recovery operation is complete, and for removing the mark on the recovered channel to allow the normal system operations with all of the memory channels, the removing in response to the removing any stale data being complete.
摘要:
A new multi nodal computer system comprising a number of nodes on which chips of different types reside. The new multi nodal computer system is characterized in that there is one clock chip per node, each clock chip controlling only the chips residing on that node said chips being appropriate for sending a check stop request to the associated clock chip in case of a malfunction. A new check stop handling method is characterized in that depending on the source of the check stop request the clock chip that received the check stop request initiates a system check stop, a node check up, or a chip check stop.
摘要:
A system for processing errors in a processor comprising, an error counter, a pass counter, and a processing portion operative to determine whether a first error is active, increment an error counter responsive to determining that the first error is active, increment the pass counter responsive to determining that all errors have been checked, and clear the error counter responsive to determining that the pass counter is greater than or equal to a pass count threshold value.