Abstract:
A requesting critical wait time of a given resource may be determined. The requesting critical wait time is the time spent by one or more resources waiting for the given resource, wherein at least one of the resources waiting for the given resource can proceed if access to the given resource is granted. A requested-by critical wait time for a resource is also determined, the requested-by critical wait time being the time spent by the resource waiting solely for the given resource, wherein if the resource were granted access to the given resource, it could proceed without further waiting.
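A minimal sketch of how the two wait-time notions above might be computed from recorded wait intervals; the event representation and the `sole_wait` and `can_proceed` flags are illustrative assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass

@dataclass
class WaitInterval:
    waiter: str          # resource that is waiting
    target: str          # resource being waited for
    duration: float      # seconds spent waiting
    sole_wait: bool      # True if the waiter waits only for `target`
    can_proceed: bool    # True if the waiter could proceed once granted access

def requesting_critical_wait_time(intervals, given):
    """Time spent by resources waiting for `given`, counting only waiters
    that could proceed if access to `given` were granted."""
    return sum(iv.duration for iv in intervals
               if iv.target == given and iv.can_proceed)

def requested_by_critical_wait_time(intervals, resource, given):
    """Time `resource` spends waiting solely for `given`."""
    return sum(iv.duration for iv in intervals
               if iv.waiter == resource and iv.target == given and iv.sole_wait)

# Example: two threads waiting on a resource "L"
waits = [
    WaitInterval("T1", "L", 0.8, sole_wait=True,  can_proceed=True),
    WaitInterval("T2", "L", 0.3, sole_wait=False, can_proceed=False),
]
print(requesting_critical_wait_time(waits, "L"))          # 0.8
print(requested_by_critical_wait_time(waits, "T1", "L"))  # 0.8
```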
Abstract:
A method for modeling the performance of a memory address translation mechanism (MATM) comprises: a) receiving an execution profile that contains a memory address reference stream of an application, a set of page size mappings, and events about the application's data allocations and de-allocations; b) translating each memory reference in the input reference stream into a reference to the corresponding data object, by consulting the allocation and de-allocation events, to produce a data object reference stream; c) translating each data object reference into a corresponding page reference by consulting the page size mapping and by modeling the allocation and de-allocation events in accordance with the mapping, to produce a page reference stream and the number of pages of each page size needed by the respective mapping; d) using the page reference stream to produce a stream of reuse distance values; e) determining, for each reference in the reuse distance value stream, whether it results in a hit or a miss in the MATM, to provide the number of hits and misses for each MATM; f) providing the hit and miss values to a cost model to estimate the number of miss cycles; and g) ranking the mappings by their miss cycle values such that the mapping with the lowest number of miss cycles has the highest rank.
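The steps above form a pipeline. The sketch below follows that pipeline under simplifying assumptions: a fully associative, LRU-managed MATM whose hit/miss decision is a reuse-distance threshold, and a fixed cycle cost per miss. The function names, the `matm_entries` parameter, and the `cycles_per_miss` constant are illustrative assumptions, not the disclosed model.

```python
def to_object_refs(addr_stream, allocations):
    """Step b: map each address to the data object allocated over it.
    `allocations` is a list of (object_id, start, size) live ranges."""
    objs = []
    for addr in addr_stream:
        for oid, start, size in allocations:
            if start <= addr < start + size:
                objs.append(oid)
                break
    return objs

def to_page_refs(obj_refs, object_page):
    """Step c: map object references to page references under one page-size
    mapping; `object_page` assigns each object a page number for this mapping."""
    return [object_page[o] for o in obj_refs]

def reuse_distances(page_refs):
    """Step d: distinct pages touched since the previous access to each page."""
    last_seen, dists = {}, []
    for i, p in enumerate(page_refs):
        if p in last_seen:
            dists.append(len(set(page_refs[last_seen[p] + 1:i])))
        else:
            dists.append(float("inf"))    # first touch: cold miss
        last_seen[p] = i
    return dists

def hits_and_misses(dists, matm_entries):
    """Step e: a reference hits if its reuse distance fits in the MATM (LRU assumption)."""
    hits = sum(1 for d in dists if d < matm_entries)
    return hits, len(dists) - hits

def miss_cycles(misses, cycles_per_miss=30):
    """Step f: simple linear cost model."""
    return misses * cycles_per_miss

def rank_mappings(results):
    """Step g: sort so the mapping with the fewest miss cycles ranks highest."""
    return sorted(results, key=lambda r: r["miss_cycles"])
```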
Abstract:
A method for vertical integrated performance and environment monitoring includes steps, or acts, of: defining one or more events to provide a unified specification; registering one or more events to be detected; detecting an occurrence of at least one of the registered events; generating a monitoring entry each time one of the registered events is detected; and entering each of the monitoring entries generated into a single logical entity.
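A small sketch of the define/register/detect/log flow described above; the event-specification fields, the predicate-based detection, and the single in-memory log list are illustrative assumptions.

```python
import time

class Monitor:
    """Unified event specification: each event has a name and a predicate
    that decides whether an observed sample matches the event."""
    def __init__(self):
        self.registered = {}   # name -> predicate
        self.log = []          # single logical entity holding all monitoring entries

    def define_and_register(self, name, predicate):
        self.registered[name] = predicate

    def observe(self, sample):
        """Check a sample against every registered event; append one
        monitoring entry per detected occurrence."""
        for name, predicate in self.registered.items():
            if predicate(sample):
                self.log.append({"event": name, "time": time.time(), "data": sample})

m = Monitor()
m.define_and_register("high_cpu", lambda s: s.get("cpu", 0) > 0.9)
m.define_and_register("page_fault_burst", lambda s: s.get("faults", 0) > 1000)
m.observe({"cpu": 0.95, "faults": 12})
print(m.log)
```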
Abstract:
A system and method for mapping application tasks to processors in a computing environment takes into account the hardware communication topology of a machine and an application communication pattern. The hardware communication topology (HCT) is defined according to hardware parameters affecting communication between two tasks, such as connectivity, bandwidth and latency, and the application communication pattern (ACP) is defined as the number and size, in bytes, of the messages communicated between the different pairs of communicating tasks. By collecting information on the messages exchanged by the tasks that communicate, the communication pattern of the application may be determined. By combining the HCT and the ACP, a cost model for a given mapping can be determined. Any algorithm computing a mapping can use the HCT, ACP, and cost model; thus the combination of an HCT, an ACP, and a cost model allows an automatically optimized mapping of tasks to processing elements to be achieved.
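A hedged sketch of combining an HCT and an ACP into a cost for a candidate mapping: here the HCT is represented as per-processor-pair latency and bandwidth, the ACP as bytes exchanged per task pair, and the cost is the summed transfer time. The data layout and the cost formula are illustrative assumptions, not the disclosed model.

```python
def mapping_cost(acp, hct, mapping):
    """acp: {(task_a, task_b): bytes_exchanged}
    hct: {(proc_x, proc_y): (latency_s, bandwidth_bytes_per_s)}
    mapping: {task: proc}
    Cost = sum over communicating task pairs of latency + bytes / bandwidth
    between the processors the two tasks are placed on."""
    total = 0.0
    for (a, b), nbytes in acp.items():
        pa, pb = mapping[a], mapping[b]
        if pa == pb:
            continue                      # assume intra-processor traffic is free
        lat, bw = hct.get((pa, pb)) or hct[(pb, pa)]
        total += lat + nbytes / bw
    return total

# Tiny example: two mappings of three tasks onto two processors
acp = {("t0", "t1"): 1_000_000, ("t1", "t2"): 10_000}
hct = {("p0", "p1"): (1e-5, 1e9)}
print(mapping_cost(acp, hct, {"t0": "p0", "t1": "p0", "t2": "p1"}))  # heavy pair co-located
print(mapping_cost(acp, hct, {"t0": "p0", "t1": "p1", "t2": "p1"}))  # heavy pair split
```

Any mapping algorithm (greedy, graph partitioning, simulated annealing) could then minimize `mapping_cost` over candidate placements.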
Abstract:
Adaptive profiling for performance analysis of a computer system controls one or more agents to monitor a plurality of events occurring in a target computer system based on an adaptive logic. Collected data may be filtered and analyzed to determine one or more contributor events that contribute to the performance of the target computer system. One or more patterns are observed or detected in said collected data, and the behavior of said one or more agents is adjusted based on said detected one or more patterns. The adaptive logic may be further reconfigured based on said detected one or more patterns.
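A minimal sketch of adaptive-logic control of profiling agents: the agent interface, the frequency-based pattern detection, and the adjustment rule (halving an agent's sampling interval when its event repeats) are illustrative assumptions.

```python
from collections import Counter

class Agent:
    def __init__(self, event, sampling_interval_ms=100):
        self.event = event
        self.sampling_interval_ms = sampling_interval_ms

class AdaptiveProfiler:
    def __init__(self, agents, repeat_threshold=3):
        self.agents = {a.event: a for a in agents}
        self.repeat_threshold = repeat_threshold   # adaptive-logic parameter
        self.collected = []

    def collect(self, samples):
        # Filter: keep only samples for events we have an agent for.
        self.collected += [s for s in samples if s["event"] in self.agents]

    def analyze_and_adapt(self):
        """Find contributor events (most frequent in the filtered data) and
        adjust agent behavior when a repeating pattern is detected."""
        counts = Counter(s["event"] for s in self.collected)
        for event, n in counts.items():
            if n >= self.repeat_threshold:                 # detected pattern
                agent = self.agents[event]
                agent.sampling_interval_ms = max(10, agent.sampling_interval_ms // 2)
        return counts.most_common()

p = AdaptiveProfiler([Agent("cache_miss"), Agent("ctx_switch")])
p.collect([{"event": "cache_miss"}] * 4 + [{"event": "ctx_switch"}])
print(p.analyze_and_adapt())
print(p.agents["cache_miss"].sampling_interval_ms)  # interval halved: samples more often
```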
Abstract:
Interactive iterative program parallelization based on dynamic feedback, in one aspect, may identify a ranked list of one or more candidate pieces of code, each with one or more source refactorings that can be applied to parallelize the code, apply at least one of the one or more refactorings to create revised code, and determine performance data associated with the revised code. The performance data may be used to make decisions on identifying the next possible ranked list of refactorings.
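A sketch of the feedback loop: rank candidate refactorings, apply the top one, measure the result, and fold the measurement back into the next ranking. The candidate representation, the benefit estimates, and the stopping rule are illustrative assumptions.

```python
def iterative_parallelize(candidates, apply_refactoring, measure, steps=3):
    """candidates: list of (code_region, refactoring, estimated_benefit)
    apply_refactoring(region, refactoring) -> revised code
    measure(code) -> runtime in seconds (the dynamic feedback)"""
    history = []
    for _ in range(steps):
        if not candidates:
            break
        # Rank by current benefit estimate, highest first.
        candidates.sort(key=lambda c: c[2], reverse=True)
        region, refactoring, _ = candidates.pop(0)
        revised = apply_refactoring(region, refactoring)
        runtime = measure(revised)
        history.append((region, refactoring, runtime))
        # Feedback: re-estimate remaining candidates in light of the measurement,
        # e.g. discount further refactorings of the region just changed.
        candidates = [(r, f, est * (0.5 if r == region else 1.0))
                      for r, f, est in candidates]
    return history
```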
Abstract:
Disclosed are a method and system for predicting future values of a target metric associated with a task executed on a computer system. The method comprises the steps of, over a given period of time, measuring at least one defined metric, transforming that measurement into a value for a predictor source metric, and using the value for the predictor source metric to obtain a predicted future value for said target metric. The preferred embodiment of this invention provides a flexible performance multi-predictor to solve the problem of providing accurate future behavior predictions for adaptive reconfiguration systems. The multi-predictor makes predictions about future workload characteristics by periodically reading available hardware counters. Also disclosed is a method and system for periodically reconfiguring an adaptive computer system by rescheduling tasks based on future behavior predictions.
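A minimal multi-predictor sketch under stated assumptions: hardware counters are read periodically, each raw measurement is transformed into a predictor source metric (here, a rate), and an exponentially weighted moving average supplies the predicted future value used to derive the target metric. The names and the EWMA choice are illustrative, not the disclosed predictor.

```python
class MultiPredictor:
    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.predictions = {}     # source metric -> current predicted value

    def update(self, counters, interval_s):
        """counters: {name: raw hardware-counter delta over the interval}.
        Transform each raw measurement into a rate (predictor source metric),
        then fold it into an EWMA prediction of its future value."""
        for name, delta in counters.items():
            rate = delta / interval_s
            prev = self.predictions.get(name, rate)
            self.predictions[name] = self.alpha * rate + (1 - self.alpha) * prev

    def predict(self, transform):
        """Predicted future value of a target metric derived from the
        predicted source metrics, e.g. instructions per cycle."""
        return transform(self.predictions)

mp = MultiPredictor()
mp.update({"instructions": 2_000_000, "cycles": 1_000_000}, interval_s=0.01)
ipc = mp.predict(lambda p: p["instructions"] / p["cycles"])
print(ipc)   # ~2.0; a rescheduler could use this predicted value to place the task
```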
Abstract:
In a system and method for linking and unlinking code fragments stored in a code cache, a memory area is associated with a branch in a first code fragment that branches outside the cache. If the branch can be set to branch to a location in a second code fragment stored in the cache, branch reconstruction information is stored in the memory area associated with the branch, and the branch instruction is updated to branch to the location in the second code fragment, thereby linking the first code fragment to the second code fragment. If it is determined that the previously linked branch should be unlinked, the first and second code fragments at that branch are unlinked by reading the information stored in the associated memory area at the time of linking, and using that information to reset the branch to its state prior to the linking.
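A sketch of the link/unlink bookkeeping in the spirit of the abstract: each exit branch carries a small side memory area in which its pre-link state is saved at link time, so the branch can later be reset. The data layout and names are illustrative assumptions, not the disclosed representation.

```python
class Branch:
    def __init__(self, target):
        self.target = target           # current branch target ("exit" or a fragment id)
        self.reconstruction = None     # memory area: state saved at link time

def link(branch, fragment_id):
    """Redirect an exit branch to a second fragment already in the cache,
    saving enough information to undo the change later."""
    branch.reconstruction = {"original_target": branch.target}
    branch.target = fragment_id

def unlink(branch):
    """Restore the branch to its pre-link state using the saved information."""
    if branch.reconstruction is not None:
        branch.target = branch.reconstruction["original_target"]
        branch.reconstruction = None

b = Branch(target="exit_to_runtime")
link(b, fragment_id="frag_42")
print(b.target)      # frag_42: the two fragments are now linked
unlink(b)
print(b.target)      # exit_to_runtime: the branch is back to its state prior to linking
```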