Abstract:
Systems and methods for dynamic granularity control of parallelized work in a heterogeneous multi-processor portable computing device (PCD) are provided. During operation, a first parallelized portion of an application executing on the PCD is identified, the first parallelized portion comprising a plurality of threads for parallel execution on the PCD. Performance information is obtained about a plurality of processors of the PCD, each of the plurality of processors corresponding to one of the plurality of threads. A number M of workload partition granularities for the plurality of threads is determined, and a total execution cost for each of the M workload partition granularities is determined. An optimal granularity, being the one of the M workload partition granularities with the lowest total execution cost, is determined, and the first parallelized portion is partitioned into a plurality of workloads having the optimal granularity.
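The selection step described above can be illustrated with a minimal sketch. The abstract does not define the cost model, so everything below is an assumption made for illustration: the names `total_cost` and `pick_granularity` are hypothetical, per-processor performance information is reduced to a single per-item cost, each chunk carries a fixed scheduling overhead, chunks are assigned greedily to whichever processor would finish them earliest, and the total execution cost is taken to be the finish time of the slowest processor.

```python
from typing import Dict, Sequence, Tuple


def total_cost(n_items: int, granularity: int,
               per_item_cost: Sequence[float],
               per_chunk_overhead: float) -> float:
    """Estimated cost of running n_items split into chunks of `granularity`.

    Assumed model (not from the source): greedy chunk assignment, cost is
    the finish time of the slowest processor.
    """
    # Full-size chunks, plus one smaller trailing chunk if needed.
    sizes = [granularity] * (n_items // granularity)
    if n_items % granularity:
        sizes.append(n_items % granularity)

    finish = [0.0] * len(per_item_cost)  # running finish time per processor
    for size in sizes:
        # Pick the processor that would complete this chunk earliest.
        best = min(
            range(len(finish)),
            key=lambda i: finish[i] + per_chunk_overhead + size * per_item_cost[i],
        )
        finish[best] += per_chunk_overhead + size * per_item_cost[best]
    return max(finish)


def pick_granularity(n_items: int, candidates: Sequence[int],
                     per_item_cost: Sequence[float],
                     per_chunk_overhead: float) -> Tuple[int, Dict[int, float]]:
    """Evaluate each of the M candidate granularities; return the cheapest."""
    costs = {g: total_cost(n_items, g, per_item_cost, per_chunk_overhead)
             for g in candidates}
    best = min(costs, key=costs.get)
    return best, costs


# Hypothetical example: 12 work items, a fast and a slow processor, M = 3.
best, costs = pick_granularity(12, [1, 4, 12], [1.0, 2.0], 1.0)
print(best, costs)
```

In this toy setup the finest granularity pays too much per-chunk overhead and the coarsest leaves the slower processor idle, so the middle candidate has the lowest estimated cost. A real implementation would presumably derive the cost terms from measured performance information rather than fixed constants.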
Abstract:
Various embodiments of methods and systems for proactive resource allocation and configuration are disclosed. An exemplary method first compiles and links a profile-instrumented application with a compiler comprising a profile-guided optimization feature that inserts calls to a profiler runtime. The profile-instrumented application is executed on a target device using one or more workload datasets representative of probable workloads. During execution, based on recognition of the inserted calls, an instrumentation-based profile dataset is generated in association with each of the one or more workload datasets. Next, the profile-instrumented application is recompiled and relinked based on the instrumentation-based profile datasets to create a set of profile-guided optimizations to the source code, thereby resulting in an optimized application. The optimized application may be executed and monitored to generate a revised profile dataset useful for providing instructions to the target device for optimal workload allocation and resource configuration.
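The compile, run, and recompile sequence above can be sketched as a build plan. The abstract does not name a toolchain; the sketch below assumes Clang/LLVM instrumentation-based PGO purely as an example, and the function name `pgo_plan`, the binary names, and the workload names are hypothetical. The function only constructs the commands and does not execute them.

```python
from typing import Dict, List, Tuple

# One step = (extra environment variables, argv list).
Step = Tuple[Dict[str, str], List[str]]


def pgo_plan(source: str, workloads: List[str]) -> List[Step]:
    """Build the command sequence for an instrumentation-based PGO cycle.

    Assumes Clang/LLVM; file names here are illustrative placeholders.
    """
    instr, opt, merged = "app_instr", "app_opt", "app.profdata"
    raw = [f"{w}.profraw" for w in workloads]
    steps: List[Step] = []

    # 1. Compile and link with profiling instrumentation inserted.
    steps.append(({}, ["clang", "-O2", "-fprofile-instr-generate",
                       source, "-o", instr]))

    # 2. Run once per representative workload; LLVM_PROFILE_FILE selects
    #    where the profiler runtime writes each raw profile.
    for workload, raw_file in zip(workloads, raw):
        steps.append(({"LLVM_PROFILE_FILE": raw_file},
                      [f"./{instr}", workload]))

    # 3. Merge the per-workload raw profiles into one indexed profile.
    steps.append(({}, ["llvm-profdata", "merge", f"-output={merged}", *raw]))

    # 4. Recompile and relink using the merged profile.
    steps.append(({}, ["clang", "-O2", f"-fprofile-instr-use={merged}",
                       source, "-o", opt]))
    return steps


plan = pgo_plan("app.c", ["small", "large"])
for env, argv in plan:
    print(env, " ".join(argv))
```

Running the plan would typically be done on the target device itself, since the profiles are meant to reflect that device's behavior under each workload. The final monitoring step, which produces the revised profile dataset, is outside what this sketch covers.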