Abstract:
A method for operating a memory system includes receiving thermal data indicating a temperature at addresses in a memory array, and a write request associated with data. An address of the write request is decoded. It is determined whether a temperature at the address of the write request is above a threshold temperature. The data is sent to a short latency write queue responsive to determining that the temperature is not above the threshold temperature.
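
The routing decision described above can be illustrated with a minimal Python sketch. The region-based address decoding, the 85 degree threshold, the `thermal_map` dictionary, and the long latency queue for hot addresses are all assumptions made for illustration; the abstract only states that data goes to the short latency queue when the temperature is not above the threshold.

```python
from collections import deque

THRESHOLD_C = 85.0  # hypothetical threshold temperature, not from the abstract


def decode_address(address, region_size=1024):
    """Map a flat address to a region index (hypothetical decoding scheme)."""
    return address // region_size


def route_write(address, data, thermal_map, short_queue, long_queue,
                threshold=THRESHOLD_C):
    """Queue a write based on the temperature reported for its address.

    thermal_map maps a region index to its last reported temperature.
    The long_queue branch is an assumed fallback for hot addresses.
    """
    region = decode_address(address)
    temperature = thermal_map.get(region, 0.0)
    if temperature > threshold:
        long_queue.append((address, data))
        return "long"
    short_queue.append((address, data))
    return "short"
```

A caller would pass the most recent thermal data together with two `deque` instances and inspect the returned label to see where the write landed.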
Abstract:
A three-dimensional (3D) integrated circuit (IC) device can include a first die having a first supply line and a second die having a second supply line, a power header, and a voltage selection logic. The power header can be connected to the first die and the second die and configured to generate a first voltage on a first voltage line and a second voltage on a second voltage line. The voltage selection logic can be connected to the first supply line and the second supply line and configured to select between the first voltage line and the second voltage line for each of the first supply line and the second supply line.
Abstract:
A memory controller may receive a plurality of thermal profiles from a plurality of three-dimensional (3D)-stacked memory chips, where the plurality of thermal profiles include thermal profile data for the memory chips, where the thermal profile data includes a memory chip usage data and a location data for each of the memory chips, and where the memory chips include a first memory chip and a second memory chip. The memory controller may generate a first predicted memory chip usage data and location data by analyzing the usage data and location data of the thermal profile data. A second predicted memory chip usage data and location data may be generated. Based on the predicted memory chip usage data and location data, fractional memory chip read propensity data may be generated. The memory controller may distribute, according to the first fractional memory chip read propensity distribution, memory chip read operations.
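
As a rough illustration of the final distribution step only, the sketch below splits a batch of reads between chips in proportion to fractional read propensities. The function name, the dictionary representation, and the rule for assigning leftover reads are assumptions; how the propensities are derived from the predicted usage and location data is not specified in the abstract and is not modeled here.

```python
def distribute_reads(num_reads, propensity):
    """Split num_reads across chips in proportion to their read propensity.

    propensity maps a chip id to a fraction; the fractions are assumed
    to sum to 1. Leftover reads after rounding down go to the chips
    with the highest propensity (an illustrative choice).
    """
    counts = {chip: int(num_reads * frac) for chip, frac in propensity.items()}
    remainder = num_reads - sum(counts.values())
    by_propensity = sorted(propensity, key=propensity.get, reverse=True)
    for chip in by_propensity[:remainder]:
        counts[chip] += 1
    return counts
```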
Abstract:
According to one aspect, a method for performance optimization of read functions in a memory system includes receiving, at the memory system, a read request including a logical address of a target data. The memory system includes a primary memory and a backup memory that mirrors the primary memory. The method also includes searching a fault monitor table for an entry corresponding to the received logical address. The fault monitor table includes a plurality of entries that indicate physical locations of identified memory failure events in the primary memory and the backup memory. Based on locating an entry corresponding to the received logical address, the method further includes selecting one of the primary memory and the backup memory for retrieving the target data. The selection is based on contents of the fault monitor table.
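
The lookup and selection steps above might look like the following Python sketch. The table layout (a dictionary from logical address to the set of memories with a recorded failure) and the default of reading from the primary memory when no entry exists are simplifying assumptions; the abstract describes entries that hold physical failure locations.

```python
def select_memory(logical_address, fault_table):
    """Choose which mirror to read from for a logical address.

    fault_table maps a logical address to a set containing "primary",
    "backup", or both, naming the memories with a recorded failure
    at that address (hypothetical representation).
    """
    failed = fault_table.get(logical_address)
    if failed is None or "primary" not in failed:
        # Assumed default: prefer the primary memory when it is healthy.
        return "primary"
    if "backup" not in failed:
        return "backup"
    raise LookupError(f"both mirrors have recorded failures at {logical_address:#x}")
```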
Abstract:
A method, system and computer program product are provided for implementing thermal air flow control management of a computer system. A temperature profile of the server system is identified. One or more dual in-line memory modules (DIMMs) are used to pivot on an axis to direct air flow to cool identified hot spots based upon the temperature profile of the server system.
Abstract:
A method, system and computer program product are provided for implementing enhanced reliability of memory subsystems utilizing a dual port Dynamic Random Access Memory (DRAM) configuration. The DRAM configuration includes a first buffer and a second buffer, each buffer including a validity counter. The validity counter for a receiving buffer is incremented as each respective data row from a transferring buffer is validated through Error Correction Code (ECC), Reliability, Availability, and Serviceability (RAS) logic and transferred to the receiving buffer, while the validity counter for the transferring buffer is decremented. Data are read from or written to either the first buffer or the second buffer based upon a respective count value of the validity counters.
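
The counter bookkeeping described above can be sketched in Python as follows. The `Buffer` class, the `is_valid` callback standing in for the ECC and RAS check, and the rule of directing accesses to the buffer with the higher count are assumptions for illustration, not details taken from the abstract.

```python
class Buffer:
    """One of the two buffers, with its validity counter."""

    def __init__(self, name, rows=None):
        self.name = name
        self.rows = dict(rows or {})
        self.validity = len(self.rows)  # assumed: starts at the number of rows held


def transfer_row(src, dst, row_id, is_valid):
    """Move one row from src to dst if it passes validation.

    is_valid is a hypothetical stand-in for the ECC/RAS logic.
    """
    data = src.rows[row_id]
    if not is_valid(data):
        return False
    dst.rows[row_id] = data
    dst.validity += 1  # receiving buffer gains a validated row
    del src.rows[row_id]
    src.validity -= 1  # transferring buffer gives one up
    return True


def active_buffer(first, second):
    """Pick the buffer to access; the higher count wins (assumed policy)."""
    return first if first.validity >= second.validity else second
```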
Abstract:
A method for mirroring in three-dimensional-stacked memory includes receiving a plurality of thermal profiles from a plurality of memory chips. The method also includes ranking the plurality of memory chips in a first ranked list of memory chips as a function of the plurality of thermal profiles and forming a first group of memory chips from the plurality of memory chips based on the first ranked list of memory chips. The method also includes forming a second group of memory chips from the plurality of memory chips distinct from the first group of memory chips based on the first ranked list of memory chips. The method also includes pairing a first memory chip from the first group of memory chips and a second memory chip from the second group of memory chips, and mirroring the pairing of memory chips.
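
A minimal Python sketch of the ranking, grouping, and pairing steps follows. It assumes each thermal profile reduces to a single temperature value per chip, that the two groups are the hotter and cooler halves of the ranked list, and that the hottest chip is paired with the coolest; none of these specifics are stated in the abstract.

```python
def pair_for_mirroring(thermal_profiles):
    """Rank chips by temperature and pair hot chips with cool ones.

    thermal_profiles maps a chip id to a temperature (a simplification
    of a full thermal profile). Returns a list of (hot, cool) pairs.
    """
    ranked = sorted(thermal_profiles, key=thermal_profiles.get, reverse=True)
    half = len(ranked) // 2
    hot_group = ranked[:half]           # first group: hotter chips
    cool_group = ranked[half:half * 2]  # second group: cooler chips
    # Pair the hottest chip with the coolest, the next hottest with the
    # next coolest, and so on (an assumed pairing rule).
    return list(zip(hot_group, reversed(cool_group)))
```

With an odd number of chips, the coolest chip is left unpaired in this sketch.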
Abstract:
A method for testing a stacked memory device having a plurality of memory chips connected to and arranged on top of a logic chip for a connection defect is disclosed. The method may include testing a memory chip by writing a data value into a first location in the memory chip, reading a data value from the first location, detecting a first bit error and recording a bit number of the first bit error. The method may also include testing the memory chip by writing a data value into a second location in the memory chip, reading a data value from the second location in the memory chip, detecting a second bit error and recording a bit number of the second bit error. The method may also include replacing a connection common to the first and second bit errors with a spare connection.
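
The write, read back, and compare procedure above can be sketched in Python. The 8-bit word width, the XOR comparison, the `lane_map` dictionary from bit number to connection, and the list of spare connections are hypothetical; the abstract does not say how bit errors are detected or how spares are tracked.

```python
def failing_bits(written, read_back, width=8):
    """Return the set of bit numbers that differ between two words."""
    diff = written ^ read_back
    return {bit for bit in range(width) if (diff >> bit) & 1}


def common_failing_bits(tests, width=8):
    """Find bit numbers that fail in every failing test.

    tests is a list of (written, read_back) pairs taken from
    different locations in the same memory chip.
    """
    error_sets = [failing_bits(w, r, width) for w, r in tests]
    error_sets = [s for s in error_sets if s]  # ignore tests without errors
    return set.intersection(*error_sets) if error_sets else set()


def replace_with_spare(lane_map, bad_bit, spares):
    """Remap the connection for bad_bit to the next spare connection."""
    if not spares:
        raise RuntimeError("no spare connections left")
    lane_map[bad_bit] = spares.pop(0)
    return lane_map
```

A bit number that fails at two different locations points to the shared connection rather than to a single bad cell, which is why the intersection is taken before a spare is used.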
Abstract:
By arranging dies in a stack such that failed cores are aligned with adjacent good cores, fast connections between good cores and cache of failed cores can be implemented. Cache can be allocated according to a priority assigned to each good core, by latency between a requesting core and available cache, and/or by load on a core.
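
One way to picture the allocation policy is the Python sketch below, which serves good cores in priority order and gives each the lowest latency cache still available from a failed core. The data structures, the one cache per core limit, and the omission of load based allocation are simplifications assumed for illustration.

```python
def allocate_cache(good_cores, spare_caches, latency):
    """Assign caches of failed cores to good cores.

    good_cores maps a core id to its priority (higher is served first).
    spare_caches lists cache ids belonging to failed cores.
    latency maps (core id, cache id) to an access latency.
    All three structures are hypothetical.
    """
    allocation = {}
    free = set(spare_caches)
    for core in sorted(good_cores, key=good_cores.get, reverse=True):
        if not free:
            break
        # Lowest latency wins; the cache id breaks ties deterministically.
        best = min(free, key=lambda cache: (latency[(core, cache)], cache))
        allocation[core] = best
        free.remove(best)
    return allocation
```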