Source-to-source compiler and run-time library to transparently accelerate stack or queue-based irregular applications on many-core architectures
Abstract:
Systems and methods for source-to-source transformation that optimizes stacks and/or queues in an application, including identifying usage of stacks and queues in the application and collecting the resource usage and thread block configurations for the application. If usage of stacks is identified, optimized code is generated by determining appropriate storage, partitioning the stacks based on the determined storage, and caching the tops of the stacks in a register. If usage of queues is identified, optimized code is generated by combining the queue operations of all threads in a warp/thread block into one batch queue operation, converting control divergence of the application to data divergence to enable warp-level queue operations, determining whether at least one of the threads includes a queue operation, and combining the queue operations of the threads in a warp.
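The batched queue operation described in the abstract can be illustrated with a minimal CPU-side sketch. It is not the code the compiler generates; it only simulates one warp in plain C++ under stated assumptions: a warp has 32 lanes, and the names `Queue`, `warp_enqueue`, and `kWarpSize` are hypothetical. On a real GPU the "does any thread have a queue operation" check would typically be a warp vote (for example CUDA's `__ballot_sync`) and the tail reservation a single `atomicAdd` issued by one lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kWarpSize = 32;  // assumption: 32 lanes per warp

// Hypothetical shared queue: a flat buffer plus a tail index.
struct Queue {
    std::vector<int> data;
    std::size_t tail = 0;
    std::size_t tail_updates = 0;  // how many times the tail was modified
};

// Number of set bits in a 32-bit mask (portable stand-in for a popcount intrinsic).
inline int popcount32(std::uint32_t x) {
    int n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// One batched enqueue for a whole simulated warp.
// wants[i] says whether lane i has an item to enqueue; values[i] is that item.
// Control divergence ("only some lanes enqueue") is expressed as data (the mask),
// so every lane can follow the same code path.
// Returns the mask of lanes that enqueued.
std::uint32_t warp_enqueue(Queue& q,
                           const std::array<bool, kWarpSize>& wants,
                           const std::array<int, kWarpSize>& values) {
    // Step 1: collect one bit per lane (simulated warp vote).
    std::uint32_t mask = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (wants[lane]) mask |= (std::uint32_t{1} << lane);
    }
    if (mask == 0) return 0;  // no lane has a queue operation: skip entirely

    // Step 2: reserve space for all active lanes with ONE tail update
    // (stands in for a single atomic add instead of one per thread).
    const std::size_t count = static_cast<std::size_t>(popcount32(mask));
    const std::size_t base = q.tail;
    q.tail += count;
    ++q.tail_updates;
    if (q.data.size() < q.tail) q.data.resize(q.tail);
    assert(q.data.size() >= q.tail);

    // Step 3: each active lane writes at base + rank, where rank is the
    // number of active lanes with a lower lane index.
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (((mask >> lane) & 1u) == 0) continue;
        const std::uint32_t lower = mask & ((std::uint32_t{1} << lane) - 1u);
        q.data[base + static_cast<std::size_t>(popcount32(lower))] = values[lane];
    }
    return mask;
}
```

In this sketch, a warp in which 16 lanes enqueue performs one tail update rather than 16, which is the contention reduction the batch operation aims for. The rank computation keeps items in lane order within a batch; a real implementation is free to choose a different ordering.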