摘要:
A method and apparatus for continued operation of a memory module, including a first and second memory device, when one of memory devices has failed. The method includes receiving a write operation request to write a data word, having first and second sections, by a first memory module. The memory module may have a first memory device and a second memory device, for respectively storing the first and second sections of the data word. A determination if one of the first and second memory devices is inoperable is made. If one of the first and second memory devices is inoperable, a write operation is performed by writing the first and second sections of the data word to the operable one of the first and second memory devices.
摘要:
A method and apparatus for continued operation of a memory module, including a first and second memory device, when one of memory devices has failed. The method includes receiving a write operation request to write a data word, having first and second sections, by a first memory module. The memory module may have a first memory device and a second memory device, for respectively storing the first and second sections of the data word. A determination if one of the first and second memory devices is inoperable is made. If one of the first and second memory devices is inoperable, a write operation is performed by writing the first and second sections of the data word to the operable one of the first and second memory devices.
摘要:
A method, system and computer program product implement memory performance management and enhanced memory reliability of a computer system accounting for system thermal conditions. When a primary memory temperature reaches an initial temperature threshold, reads are suspended to the primary memory and reads are provided to a mirrored memory in a mirrored memory pair, and writes are provided to both the primary memory and the mirrored memory. If the primary memory temperature reaches a second temperature threshold, write operations to the primary memory are also stopped and the primary memory is turned off with DRAM power saving modes such as self timed refresh (STR), and the reads and writes are limited to the mirrored memory in the mirrored memory pair. When the primary memory temperature decreases to below the initial temperature threshold, coherency is recovered by writing a coherent copy from the mirrored memory to the primary memory.
摘要:
A method, system and computer program product implement memory performance management and enhanced memory reliability of a computer system accounting for system thermal conditions. When a primary memory temperature reaches an initial temperature threshold, reads are suspended to the primary memory and reads are provided to a mirrored memory in a mirrored memory pair, and writes are provided to both the primary memory and the mirrored memory. If the primary memory temperature reaches a second temperature threshold, write operations to the primary memory are also stopped and the primary memory is turned off with DRAM power saving modes such as self timed refresh (STR), and the reads and writes are limited to the mirrored memory in the mirrored memory pair. When the primary memory temperature decreases to below the initial temperature threshold, coherency is recovered by writing a coherent copy from the mirrored memory to the primary memory.
摘要:
According to one embodiment of the present invention, a method for operating a three dimensional (“3D”) memory device includes detecting, by a memory controller, a first error on the 3D memory device and detecting, by the memory controller, a second error in a first chip in a first rank of the 3D memory device, wherein the first chip has an associated first chip select. The method also includes powering up a second chip in a second rank, sending a command from the memory controller to the 3D memory device to replace the first chip in the first chip select with the second chip and correcting the first error using an error control code.
摘要:
According to one embodiment of the present invention, a method for bank sparing in a 3D memory device that includes detecting, by a memory controller, a first error in the 3D memory device and detecting a second error in a first element in a first rank of the 3D memory device, wherein the first element in the first rank has an associated first chip select. The method also includes sending a command to the 3D memory device to set mode registers in a master logic portion of the 3D memory device that enable a second element to receive communications directed to the first element and wherein the second element is in a second rank of the 3D memory device, wherein the first element and second element are each either a bank or a bank group that comprise a plurality of chips.
摘要:
According to one embodiment of the present invention, a method for operating a three dimensional (“3D”) memory device includes detecting, by a memory controller, a first error on the 3D memory device and detecting, by the memory controller, a second error in a first chip in a first rank of the 3D memory device, wherein the first chip has an associated first chip select. The method also includes powering up a second chip in a second rank, sending a command from the memory controller to the 3D memory device to replace the first chip in the first chip select with the second chip and correcting the first error using an error control code.
摘要:
According to one embodiment of the present invention, a method for bank sparing in a 3D memory device that includes detecting, by a memory controller, a first error in the 3D memory device and detecting a second error in a first element in a first rank of the 3D memory device, wherein the first element in the first rank has an associated first chip select. The method also includes sending a command to the 3D memory device to set mode registers in a master logic portion of the 3D memory device that enable a second element to receive communications directed to the first element and wherein the second element is in a second rank of the 3D memory device, wherein the first element and second element are each either a bank or a bank group that comprise a plurality of chips.
摘要:
A method, system and computer program product are provided for implementing command timing adjustments to alleviate Dynamic Random Access Memory (DRAM) failures in a computer system. A predefined DRAM failure is detected. Responsive to the detected failure, a set of timers is adjusted for controlling predetermined timings used to access the DRAM. Responsive to the failure being resolved by the adjusted set of timers, checking for a predetermined level of performance is performed.
摘要:
A method, system and computer program product are provided for implementing hardware assisted Dynamic Random Access Memory (DRAM) repair in a computer system that supports ECC. A data register providing DRAM repair is selectively provided in one of the Dynamic Random Access Memory (DRAM), a memory controller, or a memory buffer coupled between the DRAM and the memory controller. The data register is configured to map to any address. Responsive to the configured address being detected, the reads to or the writes from the configured address are routed to the data register.