摘要:
A new method and apparatus have been taught for storing error information used for debugging as generated by the initial and subsequent error occurrences. In this invention, a register with several bit ranges is used to store error information. The first bit-range is allocated to the initial error information. If the total number of the errors exceeds the capacity of the register, the last error is kept in a last bit-range. This way, precious initial error information (as well as the last error information) will be available for debugging.
摘要:
A system, method, and computer program product for handling shared cache lines to allow forward progress among processors in a multi-processor environment is provided. A counter and a threshold are provided a processor of the multi-processor environment, such that the counter is incremented for every exclusive cross interrogate (XI) reject that is followed by an instruction completion, and reset on an exclusive XI acknowledgement. If the XI reject counter reaches a preset threshold value, the processor's pipeline is drained by blocking instruction issue and prefetching attempts, creating a window for an exclusive XI from another processor to be honored, after which normal instruction processing is resumed. Configuring the preset threshold value as a programmable value allows for fine-tuning of system performance.
摘要:
There is provided a method, system and computer program product for generating trace data related to a data processing system event. The method includes: receiving an instruction relating to the system event from a location in the system; determining a minimum number of trace segment records required to record instruction information; and creating a trace segment table including the number of trace segment records, the number of trace segment records including at least one instruction record.
摘要:
A system and method for avoiding deadlocks when performing storage updates in a multi-processor environment. The system includes a processor having a local cache, a store queue having a temporary buffer with capability to reject exclusive cross-interrogates (XI) while an interrogated cache line is owned exclusive and is to be stored, and a mechanism for performing a method. The method includes setting the processor into a slow mode. A current instruction that includes a data store having one or more target lines is received. The current instruction is executed, with the executing including storing results associated with the data store into the temporary buffer. The store queue is prevented from rejecting an exclusive XI corresponding to the target lines of the current instruction. Each target line is acquired with a status of exclusive ownership, and the contents from the temporary buffer are written to each target line after instruction completion.
摘要:
A system, method, and computer program product for enhancing timeliness of cache memory prefetching in a processing system are provided. The system includes a stride pattern detector to detect a stride pattern for a stride size in an amount of bytes as a difference between successive cache accesses. The system also includes a confidence counter. The system further includes eager prefetching control logic for performing a method when the stride size is less than a cache line size. The method includes adjusting the confidence counter in response to the stride pattern detector detecting the stride pattern, comparing the confidence counter to a confidence threshold, and requesting a cache prefetch in response to the confidence counter reaching the confidence threshold. The system may also include selection logic to select between the eager prefetching control logic and standard stride prefetching control logic.
摘要:
A system, method, and computer program product for handling shared cache lines to allow forward progress among processors in a multi-processor environment is provided. A counter and a threshold are provided a processor of the multi-processor environment, such that the counter is incremented for every exclusive cross interrogate (XI) reject that is followed by an instruction completion, and reset on an exclusive XI acknowledgement. If the XI reject counter reaches a preset threshold value, the processor's pipeline is drained by blocking instruction issue and prefetching attempts, creating a window for an exclusive XI from another processor to be honored, after which normal instruction processing is resumed. Configuring the preset threshold value as a programmable value allows for fine-tuning of system performance.
摘要:
A system, method and computer program product for sampling computer system performance data are provided. The system includes a sample buffer to store instrumentation data while capturing trace data in a trace array, where the instrumentation data enables measurement of computer system performance. The system further includes a sample interrupt generator to assert a sample interrupt indicating that the instrumentation data is available to read. The sample interrupt is asserted in response to storing the instrumentation data in the sample buffer.
摘要:
To support a new processor control bit, the Real Space Control (RSC) bit, in a processor system with an existing translation lookaside buffer, an existing control bit, the Private Space bit, in the translation lookaside buffer is redefined as an Ignore Common segment bit to create new non-overlapping translation lookaside buffer entries.
摘要:
A computer processor has an I-unit (instruction unit) and instruction decoder, an E-unit (execution unit), a Buffer Control Element (BCE) containing a unified two-way interleaved L1 cache and providing write control to said two-way interleaved L1 cache. The processor has Double Word wide execution dataflow. An instruction decoder receiving instruction data from a unified cache before decoding causes, for stores, I-unit logic to initiate a request ahead of execution to tell the buffer control element that stores will be made from the E-unit, and E-unit logic sends a store request to initiate a store after decoding corresponding instruction data which indicates what address in the cache the DoubleWord data is to be stored to. In the process, E-unit logic calculates, from source and destination address information address ranges information in an instruction, whether a corresponding multi-Double Word store with same byte data will result from the data patterns, and, when a multi-Double Word store could result, it enables the E-unit to request the writing of an incoming Double Word on the computer's data bus for both Double Word L1 cache interleaves using the same address for both to effectively write two consecutively addressed DoubleWords for the same cycle to achieve a Quad Word store in a cycle.
摘要:
A pipelined processor includes circuitry adapted for store forwarding, including: for each store request, and while a write to one of a cache and a memory is pending; obtaining the most recent value for at least one block of data; merging store data from the store request with the block of data thus updating the block of data and forming a new most recent value and an updated complete block of data; and buffering the updated block of data into a store data queue; for each additional store request, where the additional store request requires at least one updated block of data: determining if store forwarding is appropriate for the additional store request on a block-by-block basis; if store forwarding is appropriate, selecting an appropriate block of data from the store data queue on a block-by-block basis; and forwarding the selected block of data to the additional store request.