Abstract:
Memory subsystem error management enables dynamically changing lockstep partnerships. A memory subsystem has a lockstep partnership relationship between a first memory portion and a second memory portion to spread error correction over the pair of memory resources. The lockstep partnership can be preconfigured. In response to detecting a hard error in the lockstep partnership, the memory subsystem can cancel or reverse the lockstep partnership between the first memory portion and the second memory portion and create or set a new lockstep partnership. The detected error can be a second hard error in the lockstep partnership. The memory subsystem can create new lockstep partnerships between the first memory portion and a third memory portion as lockstep partners and between the second memory portion and a fourth memory portion as lockstep partners. The memory subsystem can also be configured to change the granularity of the lockstep partnership when changing partnerships.
Abstract:
An apparatus and method are described for performing forward and reverse memory sparing operations. For example, one embodiment of a processor comprises memory sparing logic to perform a first forward memory sparing operation at a first level of granularity in response to detecting a memory failure; the memory sparing logic to perform a reverse memory sparing operation in response to a determination of an improved sparing state having a second level of granularity; and the memory sparing logic to responsively perform a second forward memory sparing operation at the second level of granularity.
Abstract:
A method performed by a memory controller is described. The method includes, during boot up, issuing a command to a memory to cause the memory to zero out its content. The method also includes bypassing a descrambler when reading from a location in the memory that has not had its zeroed out content written over the scrambled data. The method also includes processing read data with the descrambler when reading from a location in the memory that has had its zeroed out content written over with scrambled data.
Abstract:
Techniques and mechanisms to provide selective access to data error information by a memory controller. In an embodiment, a memory device stores a first value representing a baseline number of data errors determined prior to operation of the memory device with the memory controller. Error detection logic of the memory device determines a current count of data errors, and calculates a second value representing a difference between the count of data errors and the baseline number of data errors. The memory device provides the second value to the memory controller, which is unable to identify that the second value is a relative error count. In another embodiment, the memory controller is restricted from retrieving the baseline number of data errors.
Abstract:
Error correction in a memory subsystem includes a memory device generating internal check bits after performing internal error detection and correction, and providing the internal check bits to the memory controller. The memory device performs internal error detection to detect errors in read data in response to a read request from the memory controller. The memory device selectively performs internal error correction if an error is detected in the read data. The memory device generates check bits indicating an error vector for the read data after performing internal error detection and correction, and provides the check bits with the read data to the memory controller in response to the read request. The memory controller can apply the check bits for error correction external to the memory device.
Abstract:
Provided are an apparatus and method to store data from a cache line at locations having errors in a sparing directory. In response to a write operation having write data for locations in one of the cache lines, the write data for a location in the cache line having an error is written to an entry in a sparing directory including an address of the cache line.
Abstract:
Embodiments of the present disclosure describe methods, apparatus, and system configurations for providing rank-specific cyclic redundancy checks in memory systems.
Abstract:
Provided are an apparatus and method to store data from a cache line at locations having errors in a sparing directory. In response to a write operation having write data for locations in one of the cache lines, the write data for a location in the cache line having an error is written to an entry in a sparing directory including an address of the cache line.
Abstract:
Memory subsystem error management enables dynamically changing lockstep partnerships. A memory subsystem has a lockstep partnership relationship between a first memory portion and a second memory portion to spread error correction over the pair of memory resources. The lockstep partnership can be preconfigured. In response to detecting a hard error in the lockstep partnership, the memory subsystem can cancel or reverse the lockstep partnership between the first memory portion and the second memory portion and create or set a new lockstep partnership. The detected error can be a second hard error in the lockstep partnership. The memory subsystem can create new lockstep partnerships between the first memory portion and a third memory portion as lockstep partners and between the second memory portion and a fourth memory portion as lockstep partners. The memory subsystem can also be configured to change the granularity of the lockstep partnership when changing partnerships.
Abstract:
Error correction in a memory subsystem includes a memory device generating internal check bits after performing internal error detection and correction, and providing the internal check bits to the memory controller. The memory device performs internal error detection to detect errors in read data in response to a read request from the memory controller. The memory device selectively performs internal error correction if an error is detected in the read data. The memory device generates check bits indicating an error vector for the read data after performing internal error detection and correction, and provides the check bits with the read data to the memory controller in response to the read request. The memory controller can apply the check bits for error correction external to the memory device.