摘要:
A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors including the number of currently concurrent data streams and data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated with the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.
摘要:
A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors including the number of currently concurrent data streams and data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated with the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.
摘要:
In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
摘要:
Computer implemented method, system and computer program product for prefetching data in a data processing system. A computer implemented method for prefetching data in a data processing system includes generating attribute information of prior data streams by associating attributes of each prior data stream with a storage access instruction which caused allocation of the data stream, and then recording the generated attribute information. The recorded attribute information is accessed, and a behavior of a new data stream is modified using the accessed recorded attribute information.
摘要:
A method and logical apparatus for managing processing system resource use for speculative execution reduces the power and performance burden associated with inefficient speculative execution of program instructions. A measure of the efficiency of speculative execution is used to reduce resources allocated to a thread while the speculation efficiency is low. The resource control applied may be the number of instruction fetches allocated to the thread or the number of execution time slices. Alternatively, or in combination, the size of a prefetch instruction storage allocated to the thread may be limited. The control condition may be comparison of the number of correct or incorrect speculations to a threshold, comparison of the number of correct to incorrect speculations, or a more complex evaluator such as the size of a ratio of incorrect to total speculations.
摘要:
A method and system of modifying instructions forming a loop is provided. A method of modifying instructions forming a loop includes modifying instructions forming a loop including: determining static and dynamic characteristics for the instructions; selecting a modification factor for the instructions based on a number of separate equivalent sections forming a cache in a processor which is processing the instructions; and modifying the instructions to interleave the instructions in the loop according to the modification factor and the static and dynamic characteristics when the instructions satisfy a modification criteria based on the static and dynamic characteristics.
摘要:
A mechanism for minimizing effective memory latency without unnecessary cost through fine-grained software-directed data prefetching using integrated high-level and low-level code analysis and optimizations is provided. The mechanism identifies and classifies streams, identifies data that is most likely to incur a cache miss, exploits effective hardware prefetching to determine the proper number of streams to be prefetched, exploits effective data prefetching on different types of streams in order to eliminate redundant prefetching and avoid cache pollution, and uses high-level transformations with integrated lower level cost analysis in the instruction scheduler to schedule prefetch instructions effectively.
摘要:
A method, apparatus, and computer program product are disclosed for selectively prohibiting speculative conditional branch execution. A particular type of conditional branch instruction is selected. An indication is stored within each instruction that is the particular type of conditional branch instruction. A processor then fetches a first instruction from code that is to be executed. A determination is made regarding whether the first instruction includes the indication. In response to determining that the instruction includes the indication: speculative execution of the first instruction is prohibited, an actual location to which the first instruction will branch is resolved, and execution of the code is branched to the actual location. In response to determining that the instruction does not include the indication, the first instruction is speculatively executed.