Abstract:
A method and apparatus for translating a multithread program code are provided. The method includes: dividing a multithread program code into a plurality of statements according to a synchronization point; generating at least one loop group by combining one or more adjacent statements based on a number of instructions included in the plurality of statements; expanding or renaming variables in each of the plurality of statements so that each statement included in the at least one loop group is executed with respect to a work item of a different work group; and enclosing each of the generated at least one loop group respectively with a work item coalescing loop.
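As a rough illustration of the steps above, the following C sketch shows what a translated kernel might look like. The kernel itself, the function name `run_work_group`, and the bound `MAX_ITEMS` are hypothetical and do not come from the abstract. The code is split at one synchronization point, the private variable `tmp` that is live across that point is expanded into an array with one slot per work item, and each resulting loop group is enclosed in a work item coalescing loop.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS 64  /* hypothetical upper bound on the work-group size */

/*
 * Serialized form of a hypothetical kernel body:
 *
 *     tmp = in[lid] * 2;
 *     scratch[lid] = tmp;
 *     barrier();                       <- synchronization point
 *     out[lid] = scratch[(lid + 1) % n] + tmp;
 *
 * n is the number of work items in the work group.
 */
void run_work_group(const int *in, int *out, size_t n)
{
    int scratch[MAX_ITEMS];  /* stands in for work-group local memory */
    int tmp[MAX_ITEMS];      /* variable expansion: one slot per work item */

    assert(n > 0 && n <= MAX_ITEMS);

    /* Loop group 1: statements before the synchronization point. */
    for (size_t lid = 0; lid < n; ++lid) {
        tmp[lid] = in[lid] * 2;
        scratch[lid] = tmp[lid];
    }

    /* The barrier is implicit: loop 1 completes before loop 2 starts. */

    /* Loop group 2: statements after the synchronization point. */
    for (size_t lid = 0; lid < n; ++lid) {
        out[lid] = scratch[(lid + 1) % n] + tmp[lid];
    }
}
```

Without the expansion of `tmp`, the second loop would see only the value written by the last work item of the first loop, which is why a variable that is live across the synchronization point needs per-work-item storage.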
Abstract:
An apparatus and method for executing code are provided. The apparatus includes a memory manager that allocates a stack in memory to store processed data that needs to be retained; a loop generator that divides program code programmed to be processed in parallel into regions based on a barrier function, transforms a region that includes the processed data that needs to be retained in the stack into a first coalescing loop, and transforms a region that uses the processed data stored in the stack into a second coalescing loop such that the transformed program code may be serially processed; and a loop changer that reverses a processing order of the second coalescing loop in comparison to a processing order of the first coalescing loop.
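The reason for reversing the second loop can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the apparatus itself: the type `RetainStack`, the helpers `push` and `pop`, the function `run_with_stack`, and the kernel arithmetic are all hypothetical. The first coalescing loop pushes each work item's retained value in ascending order, so the second coalescing loop runs in descending order and each pop returns the value that belongs to the current work item.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAP 64  /* hypothetical stack capacity */

/* Hypothetical stack for data that must be retained across the barrier. */
typedef struct {
    int data[STACK_CAP];
    size_t top;
} RetainStack;

static void push(RetainStack *s, int v)
{
    assert(s->top < STACK_CAP);
    s->data[s->top++] = v;
}

static int pop(RetainStack *s)
{
    assert(s->top > 0);
    return s->data[--s->top];
}

/* Serialized form of a hypothetical kernel with one barrier. */
void run_with_stack(const int *in, int *out, size_t n)
{
    RetainStack st = { .top = 0 };

    assert(n <= STACK_CAP);

    /* First coalescing loop: region before the barrier, ascending order. */
    for (size_t lid = 0; lid < n; ++lid) {
        int t = in[lid] * in[lid];  /* value needed after the barrier */
        push(&st, t);
    }

    /* Second coalescing loop: region after the barrier, descending order,
     * so the last value pushed (work item n - 1) is the first one popped. */
    for (size_t lid = n; lid-- > 0; ) {
        int t = pop(&st);
        out[lid] = t + 1;
    }
}
```

If the second loop ran in ascending order instead, work item 0 would pop the value pushed by work item n - 1, and the results would be mismatched.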
Abstract:
An apparatus and method for generating vector code are provided. The apparatus and method generate vector code from scalar-type kernel code without requiring the user to change the code type or modify the data layout, which improves ease of use and preserves the portability of OpenCL.
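The abstract does not say how the vector code is produced. One plausible reading, assumed here purely for illustration, is that adjacent work items of the scalar kernel are packed into the lanes of a vector operation while the input arrays keep their original layout. In the C sketch below, the type `Vec4i`, the helpers `load4`, `store4` and `mul_add4`, the lane count `LANES`, and the kernel arithmetic are all hypothetical stand-ins rather than part of the described apparatus.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* assumed vector width */

/* Portable stand-in for a hardware vector type. */
typedef struct { int lane[LANES]; } Vec4i;

static Vec4i load4(const int *p)
{
    Vec4i v;
    for (int k = 0; k < LANES; ++k) v.lane[k] = p[k];
    return v;
}

static void store4(int *p, Vec4i v)
{
    for (int k = 0; k < LANES; ++k) p[k] = v.lane[k];
}

/* Lane-wise a * b + c. */
static Vec4i mul_add4(Vec4i a, Vec4i b, int c)
{
    Vec4i r;
    for (int k = 0; k < LANES; ++k) r.lane[k] = a.lane[k] * b.lane[k] + c;
    return r;
}

/* Scalar-type kernel body as the user wrote it: one work item per call. */
static void kernel_scalar(const int *a, const int *b, int *c, size_t gid)
{
    c[gid] = a[gid] * b[gid] + 1;
}

/* Reference execution: one work item at a time. */
void run_scalar(const int *a, const int *b, int *c, size_t n)
{
    for (size_t gid = 0; gid < n; ++gid) kernel_scalar(a, b, c, gid);
}

/* Vectorized execution: LANES adjacent work items per iteration.
 * The arrays a, b and c are used in their original layout. */
void run_vectorized(const int *a, const int *b, int *c, size_t n)
{
    size_t gid = 0;
    for (; gid + LANES <= n; gid += LANES) {
        Vec4i va = load4(a + gid);
        Vec4i vb = load4(b + gid);
        store4(c + gid, mul_add4(va, vb, 1));
    }
    /* Remainder work items fall back to the scalar body. */
    for (; gid < n; ++gid) kernel_scalar(a, b, c, gid);
}
```

Both entry points take the same arguments and produce the same results, which mirrors the point that the user-visible kernel and data layout stay unchanged.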