Abstract:
A method and apparatus for efficiently caching streaming and non-streaming data is described herein. Software, such as a compiler, identifies last use streaming instructions/operations that are the last instruction/operation to access streaming data for a number of instructions or an amount of time. As a result of performing an access to a cache line for a last use instruction/operation, the cache line is updated to a streaming data no longer needed (SDN) state. When control logic is to determine a cache line to be replaced, a modified Least Recently Used (LRU) algorithm is biased to select SDN state lines first to replace no longer needed streaming data.
Abstract:
A method and apparatus for efficiently handling texture sampling is described herein. A compiler or other software is capable of breaking a texture sampling operation for a pixel into a pre-fetch operation and a use operation. A processing element, in response to executing the pre-fetch operation, delegates computation of the texture sample of the pixel to a hardware texture sample unit. In parallel to the hardware texture sample unit performing a texture sample for the pixel and providing the result, i.e. a textured pixel (texel), to a destination address, the processing element is capable of executing other independent code. After an amount of time, the processing element executes the use operation, such as a load operation to load the texel from the destination address.
Abstract:
A prefetching scheme to detect when a load misses the lower level cache and hits the next level cache. Consequently, the prefetching scheme utilizes the previous information for the cache miss to the lower level cache and hit to the next higher level of cache memory that may result in initiating a sidedoor prefetch load for fetching the previous or next cache line into the lower level cache. In order to generate an address for the sidedoor prefetch, a history of cache access is maintained in a queue.
Abstract:
According to one embodiment a computer system is disclosed. The computer system includes a central processing unit (CPU) and a cache memory coupled to the CPU. The cache memory includes a main cache having plurality of compressible cache lines to store additional data, and a plurality of storage pools to hold a segment of the additional data for one or more of the plurality of cache lines that are to be compressed.