摘要:
A technique for performing indirect data prefetching includes determining a first memory address of a pointer associated with a data prefetch instruction. Content of a memory at the first memory address is then fetched. A second memory address is determined from the content of the memory at the first memory address. Finally, a data block (e.g., a cache line) including data at the second memory address is fetched (e.g., from the memory or another memory).
摘要:
A method for compiler assisted victim cache bypassing including: identifying a cache line as a candidate for victim cache bypassing; conveying a bypassing-the-victim-cache information to a hardware; and checking a state of the cache line to determine a modified state of the cache line, wherein the cache line is identified for cache bypassing if the cache line that has no reuse within a loop or loop nest and there is no immediate loop reuse or there is a substantial across loop reuse distance so that it will be replaced from both main and victim cache before being reused.
摘要:
A processor includes an execution unit and instruction sequencing logic that fetches instructions for execution. The instruction sequencing logic includes a branch target address cache having a branch target buffer containing a plurality of entries each associating at least a portion of a branch instruction address with a predicted branch target address. The branch target address cache accesses the branch target buffer using a branch instruction address to obtain a predicted branch target address for use as an instruction fetch address. The branch target address cache also includes a filter buffer that buffers one or more candidate branch target address predictions. The filter buffer associates a respective confidence indication indicative of predictive accuracy with each candidate branch target address prediction. The branch target address cache promotes candidate branch target address predictions from the filter buffer to the branch target buffer based upon their respective confidence indications.
摘要:
A data processing unit, method, and computer-usable medium for contention-based cache performance optimization. Two or more processing cores are coupled by an interconnect. Coupled to the interconnect is a memory hierarchy that includes a collection of caches. Resource utilization over a time interval is detected over the interconnect. Responsive to detecting a threshold of resource utilization of the interconnect, a functional mode of a cache from the collection of caches is selectively enabled.