Abstract:
A test system or simulator includes an enhanced IC test application sampling software program that executes test application software on a semiconductor die IC design model. The enhanced test application sampling software may include trace, simulation point, CPI error, clustering, instruction budgeting, and other programs. The enhanced test application sampling software generates basic block vectors (BBVs) and fly-by vectors (FBVs) from instruction trace analysis of test application software workloads. The enhanced test application sampling software uses microarchitecture-dependent information to generate the FBVs in order to select representative instruction intervals from the test application software. The enhanced test application sampling software generates a reduced representative test application software program from the BBV and FBV data using a global instruction budgeting analysis method. Designers use the test system with the enhanced test application sampling software to evaluate IC design models by running the representative test application software program.
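The abstract does not define how a BBV is built, so the following is only a minimal sketch under the common SimPoint-style convention: each fixed-size instruction interval gets a vector of per-basic-block instruction counts, normalized to sum to 1. The function name `basic_block_vectors`, the trace format, and the interval size are assumptions made for illustration; FBVs are not sketched because the abstract gives no definition to work from.

```python
from collections import Counter

def basic_block_vectors(trace, interval_size):
    """Build one normalized vector per instruction interval.

    trace: iterable of (block_id, n_instructions) pairs in execution order
           (a hypothetical trace format, not taken from the abstract).
    interval_size: number of instructions per interval.
    """
    vectors = []
    counts = Counter()
    executed = 0
    for block_id, n_instr in trace:
        # Weight each basic block by the instructions it contributes.
        counts[block_id] += n_instr
        executed += n_instr
        if executed >= interval_size:
            vectors.append({b: c / executed for b, c in counts.items()})
            counts = Counter()
            executed = 0
    if executed:
        # Keep the final, possibly shorter, interval.
        vectors.append({b: c / executed for b, c in counts.items()})
    return vectors

trace = [("A", 4), ("B", 6), ("A", 4), ("C", 6)]
bbvs = basic_block_vectors(trace, interval_size=10)
```

Intervals whose vectors are similar can then be clustered, and one interval per cluster kept as representative.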
Abstract:
A subset of a workload, which includes a total set of dynamic instructions, is identified to use as a trace. Processor unit hardware executes the entire workload in real-time using a particular dataset. The processor unit hardware includes at least one microprocessor and at least one cache. The real-time execution of the workload is monitored to obtain information about how the processor unit hardware executes the workload when the workload is executed using the particular dataset to form actual performance information. Multiple different subsets of the workload are generated. The execution of each one of the subsets by the processor unit hardware is compared with the actual performance information. A result of the comparison is used to select one of the plurality of different subsets that most closely represents the execution of the entire workload using the particular dataset to use as a trace.
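The selection step can be sketched as picking the candidate subset whose measured metrics deviate least from the full-workload measurement. The metric names (`cpi`, `l1_miss_rate`), the sample numbers, and the mean-relative-error score are all assumptions for illustration; the abstract does not say which performance information is compared or how.

```python
def relative_error(measured, actual):
    """Mean relative deviation of a subset's metrics from the full run."""
    return sum(abs(measured[k] - actual[k]) / actual[k] for k in actual) / len(actual)

def select_trace(candidates, actual):
    """Return the name of the subset closest to the actual performance."""
    return min(candidates, key=lambda name: relative_error(candidates[name], actual))

# Hypothetical measurements of the entire workload on real hardware.
actual = {"cpi": 1.20, "l1_miss_rate": 0.050}

# Hypothetical measurements of three candidate subsets.
candidates = {
    "subset_a": {"cpi": 1.50, "l1_miss_rate": 0.040},
    "subset_b": {"cpi": 1.25, "l1_miss_rate": 0.052},
    "subset_c": {"cpi": 0.90, "l1_miss_rate": 0.070},
}

best = select_trace(candidates, actual)
```

Here `subset_b` wins because both of its metrics lie within about 4% of the full-workload values.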
Abstract:
A method, apparatus, and computer-usable program code in a computer system are disclosed for identifying a subset of a workload, which includes a total set of dynamic instructions, to use as a trace. Processor unit hardware executes the entire workload in real-time using a particular dataset. The processor unit hardware includes at least one microprocessor and at least one cache. The real-time execution of the workload is monitored to obtain information about how the processor unit hardware executes the workload when the workload is executed using the particular dataset to form actual performance information. Multiple different subsets of the workload are generated. The execution of each one of the subsets by the processor unit hardware is compared with the actual performance information. A result of the comparison is used to select one of the plurality of different subsets that most closely represents the execution of the entire workload using the particular dataset to use as a trace.
Abstract:
An information handling system includes a processor that executes multiple instructions or instruction threads within a software application program. The information handling system includes operating system software that manages processor system hardware and software in a multi-tasking environment. In one embodiment, the operating system manages instruction completion stall analysis software to determine the cause or causes of instruction stalls. In another embodiment, the stall analysis software cooperates with the operating system software to store instruction completion stall event data on a per instruction basis while the application program executes. The operating system software may cooperate with the stall analysis software to store instruction completion stall data in memory for later manipulation by system users or other software.
Abstract:
A method, computer program product, and data processing system for collecting metrics regarding completion stalls in an out-of-order superscalar processor with branch prediction are disclosed. A preferred embodiment of the present invention selectively samples particular instructions (or classes of instructions). Each selected instruction, as it passes through the processor datapath, is marked (tagged) for monitoring by a performance monitoring unit. The progress of marked instructions is monitored by the performance monitoring unit, and various stall counters are triggered by the progress of the marked instructions and the instruction groups they form a part of. The stall counters count cycles to give an indication of when certain delays associated with particular instructions occur and how serious the delays are.
Abstract:
A computer implemented method, apparatus, and computer program product for monitoring execution of instructions in an instruction pipeline are provided. The process identifies a number of stall cycles for a group of instructions to complete execution. The process retrieves a deterministic execution latency pattern corresponding to the group of instructions. The process compares the number of stall cycles to the deterministic execution latency pattern. The process identifies an instruction in the group of instructions as a dependent instruction in response to a determination that the instruction completed a deterministic number of cycles after an antecedent instruction completed.
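The comparison against a deterministic latency pattern can be sketched as follows. The latency table, the instruction kinds, and the choice to treat the immediately preceding instruction as the antecedent are assumptions for illustration only; the abstract does not specify any of them.

```python
# Hypothetical fixed execution latencies, in cycles, per instruction kind.
DETERMINISTIC_LATENCY = {"fadd": 6, "fmul": 6, "load_l1_hit": 3}

def find_dependent(group, latency=DETERMINISTIC_LATENCY):
    """Flag instructions that look dependent on their predecessor.

    group: list of (name, kind, completion_cycle) tuples in program order.
    An instruction is flagged when it completed exactly its deterministic
    latency after the preceding instruction completed.
    """
    dependents = []
    for (_, _, prev_done), (name, kind, done) in zip(group, group[1:]):
        expected = latency.get(kind)
        if expected is not None and done - prev_done == expected:
            dependents.append(name)
    return dependents

group = [
    ("i0", "load_l1_hit", 100),
    ("i1", "fadd", 106),  # 6 cycles after i0: matches the fadd latency
    ("i2", "fmul", 107),  # 1 cycle after i1: does not match
]
dependents = find_dependent(group)
```

Only `i1` is flagged, since its completion trails `i0` by exactly the assumed `fadd` latency.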
Abstract:
A performance monitor including a saturating counter provides a relative measure of event frequency without requiring a minimum polling rate or periodic reset to avoid or account for counter overflow. The saturating counter is incremented upon detection of an event and decremented if an event is not detected within a predetermined period. The detection period may be programmable and may be determined by a real-time clock, processor cycles, or instruction cycles. Multiple event types may be selected for detection and input to a single counter, or alternatively multiple event counters may be provided for various event types. The saturating counter may additionally be periodically reset in a selected operating mode, in combination with the decrementing action performed on the counter.
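The increment and decrement behavior can be modeled with a small software sketch. The class name, the default limits, and the reading of "not detected within a predetermined period" as "`period` consecutive ticks without an event" are assumptions; the abstract describes a hardware counter and does not fix these details.

```python
class SaturatingEventCounter:
    """Software model of a saturating up/down event counter."""

    def __init__(self, max_value=15, period=4):
        self.max_value = max_value  # upper saturation limit
        self.period = period        # idle ticks before one decrement
        self.value = 0
        self._idle = 0              # ticks since the last event or decrement

    def tick(self, event):
        """Advance one period unit; `event` says whether an event was seen."""
        if event:
            # Count up, but never past the saturation limit.
            self.value = min(self.value + 1, self.max_value)
            self._idle = 0
        else:
            self._idle += 1
            if self._idle >= self.period:
                # A full period passed with no event: count down, floor at 0.
                self.value = max(self.value - 1, 0)
                self._idle = 0
        return self.value

counter = SaturatingEventCounter(max_value=3, period=2)
for _ in range(5):
    counter.tick(True)
busy = counter.value        # saturated at the limit

for _ in range(2):
    counter.tick(False)
after_idle = counter.value  # one full idle period has elapsed
```

Because the value is clamped at both ends, it can be read at any polling rate and still indicates whether events have been frequent or rare recently.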