Abstract:
A method, system, and apparatus for instruction tracing with out-of-order speculative processors. With the present invention, information corresponding to the state of an instruction cache and a data cache is stored in a trace storage device along with information corresponding to instructions fetched by the processor. When a cache load is necessary, updated cache information is stored in the trace storage device. Thereby, the state of the cache at all times during fetching of instructions may be known from the information stored in the trace storage device. Additionally, the particular instructions fetched are known from the fetched-instruction information stored in the trace storage device. Hence, the instruction stream may be reconstructed from the information stored in the trace storage device.
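The reconstruction idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify a record format, so the tuple layout, the `LINE_SIZE` constant, and the function name `reconstruct_stream` below are all hypothetical; the sketch only assumes that the trace holds cache-load records (a line index plus its contents) interleaved with fetch records (an instruction address).

```python
# Hypothetical number of instructions per cache line; not taken from the abstract.
LINE_SIZE = 4

def reconstruct_stream(trace):
    """Replay trace records to rebuild the fetched instruction stream.

    Each record is either
      ("load", line_index, [instr, ...])  -- updated cache line contents
      ("fetch", address)                  -- an instruction fetch
    Both shapes are assumptions made for illustration.
    """
    cache = {}    # line index -> list of instructions currently in that line
    stream = []   # reconstructed instruction stream, in fetch order
    for record in trace:
        if record[0] == "load":
            _, line, contents = record
            cache[line] = contents          # cache state changes on every load
        elif record[0] == "fetch":
            _, address = record
            line, offset = divmod(address, LINE_SIZE)
            stream.append(cache[line][offset])
    return stream

trace = [
    ("load", 0, ["a", "b", "c", "d"]),
    ("fetch", 1),
    ("fetch", 2),
    ("load", 0, ["w", "x", "y", "z"]),  # line 0 is replaced by a later load
    ("fetch", 1),
]
print(reconstruct_stream(trace))
```

Because every cache load is recorded, the cache contents at the time of each fetch are known, which is what lets the same address resolve to different instructions before and after the second load.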
Abstract:
The problem identified above is addressed in large part by a microprocessor as disclosed herein. The microprocessor includes a dispatch unit configured to receive a set of instructions from an instruction cache and to forward the set of instructions to an issue queue when the instructions are ready for execution. The dispatch unit may include sampling logic that is configured to select one of the instructions for performance monitoring from the set of instructions. The microprocessor further includes a performance monitor unit enabled to monitor performance characteristics of the selected instruction as it executes. The sampling logic may identify the instruction selected for monitoring as the instruction occupying an eligible position within the set of instructions. The eligible position from which the monitored instruction is selected may vary with each subsequent set of instructions. The sampling logic may include a selection mask that contains an asserted bit that identifies the position within the set of instructions from which the selected instruction is chosen. The selection mask may include a single bit for each position in the set of instructions and may be implemented as a shift register that periodically rotates the eligible position. The rotation of the eligible bit position may occur every clock cycle, every dispatch cycle, or at some other suitable synchronous or asynchronous interval. The selection mask may contain multiple asserted bits and may include a filter circuit that generates a selection vector based on the selection mask, where the selection vector includes only a single asserted bit.
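The rotating selection mask and the filter circuit described in this abstract can be modeled in software as a rough sketch. The function names are hypothetical, and the rule of keeping the lowest asserted bit is an assumption: the abstract only says the selection vector has a single asserted bit, not which bit is kept.

```python
from typing import Optional

def rotate_mask(mask: int, width: int) -> int:
    """Rotate a width-bit selection mask left by one position (shift-register model)."""
    top_bit = (mask >> (width - 1)) & 1
    return ((mask << 1) & ((1 << width) - 1)) | top_bit

def filter_to_selection_vector(mask: int) -> int:
    """Reduce a mask with several asserted bits to a vector with one asserted bit.

    Keeping the lowest asserted bit is an assumed priority rule.
    """
    return mask & -mask

def eligible_position(mask: int) -> Optional[int]:
    """Return the index of the selected position, or None if no bit is asserted."""
    vector = filter_to_selection_vector(mask)
    return vector.bit_length() - 1 if vector else None

# A 4-wide dispatch group with the eligible position rotating each dispatch cycle.
mask = 0b0001
for cycle in range(5):
    print(cycle, format(mask, "04b"), eligible_position(mask))
    mask = rotate_mask(mask, 4)
```

Rotating the eligible position spreads the sampled instruction across all slots of the dispatch group instead of always monitoring the instruction in the same slot.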
Abstract:
The present invention discloses a system and method for implementing instruction tracing in a computer system, and in particular a computer system with a tightly coupled, shared-processor central processing unit (CPU). Each of the processors is a general-purpose processor that has been modified by design to allow an instruction to execute and simultaneously to be stored and forwarded to shared memory operable as a trace buffer. Since each processor is general purpose, the trace routine necessary for tracing can be one of the routines or programs that can be written and executed on either of the processors. One of the processors can run, collect, and analyze the executed and stored instructions of the other processor. Since the processors can be on a single chip, the shared memory bus that writes and reads the executed instructions can operate at high speed. Also, since the trace function is part of the multiprocessor architecture, its speed of operation will scale with the speed of the processors without modification.
Abstract:
A method and system for debugging the execution of an instruction within an instruction pipeline is provided. A processor in a data processing system contains instruction pipeline units. An instruction may be tagged, and in response to an instruction pipeline unit completing its processing of the tagged instruction, a stage completion signal is asserted. An execution monitor external to the pipelined processor monitors the stage completion signals during the execution of the tagged instruction. The execution monitor may be a logic analyzer that displays the stage completion signals in real time on a display device of the execution monitor. An instruction to be tagged may be selected based upon an instruction selection rule, such as the address of the instruction.
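The tagging scheme in this abstract can be sketched as a toy software model. Everything here is an assumption made for illustration: the stage names, the simple in-order pipeline in which one instruction enters per cycle, and the function name `stage_completion_events`. The abstract itself describes hardware signals observed by an external monitor, not software.

```python
# Hypothetical stage names; the abstract only refers to "instruction pipeline units".
STAGES = ["fetch", "decode", "execute", "writeback"]

def stage_completion_events(addresses, selection_rule):
    """Model stage completion signals for tagged instructions.

    addresses:      instruction addresses in program order; one instruction
                    enters the pipeline per cycle (assumed in-order pipeline).
    selection_rule: predicate on the address deciding whether to tag.
    Returns a list of (cycle, stage, address) tuples, one per asserted signal.
    """
    events = []
    for entry_cycle, address in enumerate(addresses):
        if not selection_rule(address):
            continue  # untagged instructions assert no completion signals
        for depth, stage in enumerate(STAGES):
            events.append((entry_cycle + depth, stage, address))
    return events

# Tag by address, the selection rule the abstract gives as an example.
events = stage_completion_events([0x100, 0x104, 0x108], lambda a: a == 0x104)
for cycle, stage, address in events:
    print(f"cycle {cycle}: {stage} complete for {address:#x}")
```

An external monitor would see only the events for the tagged instruction, which is what makes it possible to follow a single instruction through the pipeline.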
Abstract:
A data processing system includes a processor having a first level cache and a prefetch engine. Coupled to the processor are a second level cache, a third level cache, and a system memory. Prefetching of cache lines is performed into each of the first, second, and third level caches by the prefetch engine. Prefetch requests from the prefetch engine to the second and third level caches are sent over a private prefetch request bus, which is separate from the bus system that transfers data from the various cache levels to the processor. A software instruction is used to accelerate the prefetch process by overriding the normal functionality of the hardware prefetch engine. The instruction also limits the amount of data to be prefetched.