摘要:
Mechanisms are provided for optimizing irregular memory references in computer code. These mechanisms may parse the computer code to identify memory references in the computer code. These mechanisms may further classify the memory references in the computer code as either regular memory references or irregular memory references. Moreover, the mechanisms may transform the computer code, by a compiler, to generate transformed computer code in which irregular memory references access a storage of a software cache of a data processing system through a transactional cache mechanism of the software cache.
摘要:
Mechanisms for domain stretching for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization. In addition, mechanisms are provided for stretching the domains of statements in a program loop view of the source code to thereby normalize the domains.
摘要:
Mechanisms for stable transitions in the presence of conditionals for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization. In addition, mechanisms are provided for ensuring code stabilization in the presence of conditions such that code bloat is not encountered during re-entrance.
摘要:
A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.
摘要:
A system and method for selective code generation optimization for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
摘要:
Mechanisms for optimized code generation targeting a high locality software cache are provided. Original computer code is parsed to identify memory references in the original computer code. Memory references are classified as either regular memory references or irregular memory references. Regular memory references are controlled by a high locality cache mechanism. Original computer code is transformed, by a compiler, to generate transformed computer code in which the regular memory references are grouped into one or more memory reference streams, each memory reference stream having a leading memory reference, a trailing memory reference, and one or more middle memory references. Transforming of the original computer code comprises inserting, into the original computer code, instructions to execute initialization, lookup, and cleanup operations associated with the leading memory reference and trailing memory reference in a different manner from initialization, lookup, and cleanup operations for the one or more middle memory references.
摘要:
Mechanisms for selective code generation optimization for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.
摘要:
A system and method for domain stretching for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization. In addition, mechanisms are provided for stretching the domains of statements in a program loop view of the source code to thereby normalize the domains.
摘要:
A process, compiler, computer program product and system for workload partitioning in a heterogeneous system. The process includes determining heterogeneous alignment constraints in the workload, partitioning a portion of tasks to a processing element sensitive to alignment constraints, and partitioning a remaining portion of tasks to a processing element not sensitive to alignment constraints.
摘要:
A system and method to efficiently pre-fetch and batch compiler-assisted software cache accesses are provided. The system and method reduce the overhead associated with software cache directory accesses. With the system and method, the local memory address of the cache line that stores the pre-fetched data is itself cached, such as in a register or well known location in local memory, so that a later data access does not need to perform address translation and software cache operations and can instead access the data directly from the software cache using the cached local memory address. This saves processor cycles that would otherwise be required to perform the address translation a second time when the data is to be used. Moreover, the system and method directly enable software cache accesses to be effectively decoupled from address translation in order to increase the overlap between computation and communication.