摘要:
A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.
摘要:
Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, and for each candidate loop in the set of candidate loops gathered for load speculation, computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determining an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction scheduling on the generated unrolled main loop.
摘要:
Systems, methods and computer products for compiler support for aggressive safe load speculation. Exemplary embodiments include a method for aggressive safe load speculation for a compiler in a computer system, the method including building a control flow graph, identifying both countable and non-countable loops, gathering a set of candidate loops for load speculation, for each candidate loop in the set of candidate loops gathered for load speculation performing computing an estimate of the iteration count, delay cycles, and code size, performing a profitability analysis and determine an unroll factor based on the delay cycles and the code size, transforming the loop by generating a prologue loop to achieve data alignment and an unrolled main loop with loop directives, indicating which loads can safely be executed speculatively and performing low-level instruction on the generated unrolled main loop.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
摘要:
Compiling code for an enhanced application binary interface (ABI) including identifying, by a computer, a code sequence configured to perform a variable address reference table function including an access to a variable at an offset outside of a location in a variable address reference table. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A scheduler cost function associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified scheduler cost function that is configured to place the first instruction next to the second instruction. An object file is generated responsive to the modified scheduler cost function. The object file includes the first instruction placed next to the second instruction. The object file is emitted.
摘要:
A code sequence made up multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence that includes a no-operation (NOP) instruction or having fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.