摘要:
In-band firmware executes instructions which cause commands to be sent on a coherency fabric. Fabric snoop logic monitors the coherency fabric for command packets that target a resource in one of the support chips attached via an FSI link. Conversion logic converts the information from the fabric packet into an FSI protocol. An FSI command is transmitted via the FSI transmit link to an FSI slave of the intended support chip. An FSI receive link receives response data from the FSI slave of the intended support chip. Conversion logic converts the information from the support chip received via the FSI receive link into the fabric protocol. Response packet generation logic generates the fabric response packet and returns it on the coherency fabric. An identical FSI link between a support processor and support chips allows direct access to the same resources on the support chips by out-of-band firmware.
摘要:
A method and apparatus are provided for a support interface for memory-mapped resources. A support processor sends a sequence of commands over and FSI interface to a memory-mapped support interface on a processor chip. The memory-mapped support interface updates memory, memory-mapped registers or memory-mapped resources. The interface uses fabric packet generation logic to generate a single command packet in a protocol for the coherency fabric which consists of an address, command and/or data. Fabric commands are converted to FSI protocol and forwarded to attached support chips to access the memory-mapped resource, and responses from the support chips are converted back to fabric response packets. Fabric snoop logic monitors the coherency fabric and decodes responses for packets previously sent by fabric packet generation logic. The fabric snoop logic updates status register and/or writes response data to a read data register. The system also reports any errors that are encountered.
摘要:
A computer implemented method, a data processing system, and a computer usable program code for automatically identifying multiple combinations of operational and non-operational components with a single part number. A non-volatile storage is provided on a part, wherein the part includes a plurality of sub-components. Unavailable sub-components in the plurality of sub-components are identified based on a series of testing to form identified unavailable sub-components. Information of the identified unavailable sub-components is stored into the non-volatile storage.
摘要:
A system and method of recovering from errors in a data processing system. The data processing system includes one or more processor cores coupled to one or more memory controllers. The one or more memory controllers include at least a first memory interface coupled to a first memory and at least a second memory interface coupled to a second memory. In response to determining an error has been detected in the first memory, access to the first memory via the first memory interface is inhibited. Also, the first memory interface is locally restarted without restarting the second memory interface.
摘要:
A method, apparatus, and computer program product are disclosed in a data processing system for providing a virtualized time base in a logically partitioned data processing system. A time base is determined for each one of multiple processor cores. The time base is used to indicate a current time to one of the processor cores for which the time base is determined. The time bases are synchronized together for the processor cores such that each one of the processor cores includes its own copy of a synchronized time base. For one of the processor cores, a virtualized time base is generated that is different from the synchronized time base but that remains synchronized with at least a portion of the synchronized time base. The processor core utilizes the virtualized time base instead of the synchronized time base for indicating the current time to the processor core. The synchronized time bases and the portion of the virtualized time base remaining in synchronization together.
摘要:
A method and system is presented for correcting a data error in a primary Dynamic Random Access Memory (DRAM) in a Dual In-line Memory Module (DIMM). Each DRAM has a left half (for storing bits 0:3) and a right half (for storing bits 4:7). A determination is made as to whether the data error was in the left or right half of the primary DRAM. The half of the primary DRAM in which the error occurred is removed from service. All subsequent reads and writes for data originally stored in the primary DRAM's defective half are made to a half of a spare DRAM in the DIMM, while the DRAM's non-defective half continues to be used for subsequently storing data.
摘要:
Mechanism for accurately measuring useful capacity of a processor allocated to each thread in a simultaneously multi-threading data processing system. Instructions dispatched from multiple threads are executed by the processor on a same clock cycle. A determination is made whether Time Base (TB) register bit (60) is changing. A dispatch charge value is determined for each thread, and added to the Processor Utilization Resource Register for each thread when TB bit (60) changes.
摘要:
A method in a data processing system for avoiding a microprocessor's design defects and recovering a microprocessor from failing due to design defects, the method comprised of the following steps: The method detects and reports of events which warn of an error. Then the method locks a current checkpointed state and prevents instructions not checkpointed from checkpointing. After that, the method releases checkpointed state stores to a L2 cache, and drops stores not checkpointed. Next, the method blocks interrupts until recovery is completed. Then the method disables the power savings states throughout the processor. After that, the method disables an instruction fetch and an instruction dispatch. Next, the method sends a hardware reset signal. Then the method restores selected registers from the current checkpointed state. Next, the method fetches instructions from restored instruction addresses. Then the method resumes a normal execution after a programmable number of instructions.
摘要:
Redundant time-of-day (TOD) oscillators are aligned, within a master oscillator path, to local logic oscillator and used to create independent step-sync signals. A step checker validates and provides selection signals to identify which of the TOD oscillators operates according to a criterion. Independent step-sync signals are transmitted to several sibling chips. Local step and sync signals are delayed to arrive at TOD register nearly synchronous with TOD registers in sibling chips. A slave oscillator path may be used to select time signals generated in a sibling chip, whereby the master oscillator path is deselected. A primary control register set may be used to configure which among several chips is a master chip using the master oscillator path. All remaining chips are slave chips. All segments of the topology are redundant. One of multiple possible alternate topologies is defined in a secondary control register set. Commands and TOD values are passed on the fabric at predefined time increment boundaries to establish, restore, or maintain synchronization across all chips.
摘要:
A method for creating precise exceptions including checkpointing an exception causing instruction. The checkpointing results in a current checkpointed state. The current checkpointed state is locked. It is determined if any of a plurality of registers require restoration to the current checkpointed state. One or more of the registers are restored to the current checkpointed state in response to the results of the determining indicating that the one or more registers require the restoring. The execution unit is restarted at the exception handler or the next sequential instruction dependent on whether traps are enabled for the exception.