摘要:
A method estimates statistics of properties of transactions processed by a memory sub-system of a computer system. The method randomly selects memory transactions processed by the memory sub-system. States of the system are recorded as samples while the selected transaction are processed by the memory sub-system. The recorded states from a subset of the selected transactions are statistically analyzed to estimate statistics of the memory transactions.
摘要:
A method for sampling the performance of a computer system is provided. The computer system includes a plurality of functional units. The method selects transactions to be processed by a particular functional unit of the computer system. State information is stored while the selected transactions are processed by the functional unit. The state information is analyzed to guide optimization.
摘要:
A method is provided for scheduling execution of a plurality of threads executed in a multithreaded processor. Resource utilizations of each of the plurality of threads are measured while the plurality of threads are concurrently executing in the multithreaded processor. Each of the plurality of threads is scheduled according to the measured resource utilizations using a thread scheduler.
摘要:
An apparatus for sampling states of a computer system having a hierarchical memory arranged at a plurality of levels, the hierarchical memory storing data at addresses. The apparatus includes a selector for selecting memory transactions based on first state and transaction information. The memory transactions are to be processed by the hierarchical memory. A trigger activates the selector based on second state and transaction information. A sampler stores states of the computer system that are identified with the selected instructions while processing the selected memory transactions in the hierarchical memory.
摘要:
An apparatus is provided for determining an average number of instructions entering a stage of a processor pipeline of a computer system during a clock cycle of a processor clock. The number of instructions entering a particular stage of the pipeline are stored in a queue during each of a predetermined number (N) of clock cycles. The total number of instructions processed over the last P clock cycles is computed, where P is less than or equal to N. The total number of instructions processed is divided by the last P processor cycles to yield the instantaneous average number of instructions processed for each processor cycle. This average number of instructions processed is communicated to software.
摘要:
In a method for scheduling instructions executed in a computer system including a processor and a memory subsystem, pipeline latencies and resource utilization are measured by sampling hardware while the instructions are executing. The instructions are then scheduled according to the measured latencies and resource utilizations using an instruction scheduler.
摘要:
In a computer system, an apparatus is configured to collect performance data of a computer system including a plurality of processors for concurrently executing instructions of a program. A plurality of performance counters are coupled to each processor. The performance counters store performance data generated by each processor while executing the instructions. An interrupt handler executes on each processors, the interrupt handler samples the performance data of the processor in response to interrupts. A first memory includes a hash table associated with each interrupt handler, the hash table stores the performance data sampled by the interrupt handler executing on the processor. A second memory includes an overflow buffer, the overflow buffer stores the performance data while portions of the hash tables are active or full. A third memory includes a user buffer, and means are provided for periodically flushing the performance data from the hash tables and the overflow to the user buffer.
摘要:
An apparatus is provided for sampling instructions in a processor pipeline of a system. The pipeline has a plurality of processing stages. The apparatus includes a fetch unit for fetching instructions into a first stage of the pipeline. Certain randomly selected instructions are identified, and state information of the system is sampled while a particular selected instruction is in any stage of the pipeline. Software is informed when the particular selected instruction leaves the pipeline so that the software can read any of the sampled state information.
摘要:
An apparatus is provided for sampling values of operands of instructions in a processor pipeline of a system, the pipeline having a plurality of processing stages. Instructions are fetched into a first stage of the pipeline. Any one of the fetched instructions are identified as a particular selected instruction. Values of results computed during the processing of the particular selected instruction are recorded in a sampling record along with state information identifying the particular selected instruction. Software is informed whenever the particular selected instruction leaves the pipeline to read the recorded values and state information.
摘要:
A method is provided for estimating statistics of properties of instructions processed in a pipeline of a computer system, the pipeline having a plurality of processing stages. Instructions are fetched into a first stage of the pipeline. Some of the fetched instructions are randomly selected. State information of the system is recorded in a profile record as samples while the selected instruction are processed by the pipeline. The recorded state information is communiucated to software. The software statistically analyzes the recorded state information from a subset of the selected instructions to estimate the statistics of the instructions.