Abstract:
An apparatus for implementing snooping cache coherence locally reduces the number of snoop requests presented to each cache in a multiprocessor system. A snoop filter device associated with a single processor includes one or more “scoreboard” data structures that make snoop determinations, i.e., that determine, for each snoop request from another processor, whether the request is to be forwarded to the processor or discarded. At any point in time, at least one scoreboard is active and at least one scoreboard is designated historic. A snoop determination of the queue indicates that an entry may be in the cache, but does not indicate its actual residence status. In addition, the snoop filter block implementing the scoreboard data structures is operatively coupled with cache wrap detection logic whereby, upon detection of a cache wrap condition, the content of the active scoreboard is copied into a historic scoreboard and the content of at least one active scoreboard is reset.
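The active/historic scoreboard behavior described above can be illustrated with a minimal sketch. The class name `SnoopFilter`, its method names, and the use of Python sets as scoreboards are assumptions made for illustration only; the abstract does not specify how scoreboards are stored or how a cache wrap is detected.

```python
class SnoopFilter:
    """Toy per-processor snoop filter (hypothetical names, not the patented design)."""

    def __init__(self):
        self.active = set()    # line addresses filled since the last cache wrap
        self.historic = set()  # snapshot of the active scoreboard at the last wrap

    def record_fill(self, line_addr):
        # Called when the local cache loads a line.
        self.active.add(line_addr)

    def on_cache_wrap(self):
        # On a cache wrap, copy the active scoreboard into the historic one,
        # then reset the active scoreboard.
        self.historic = self.active
        self.active = set()

    def should_forward(self, line_addr):
        # A hit means the line *may* be cached, so the snoop is forwarded.
        # A miss in both scoreboards means the snoop can be discarded.
        return line_addr in self.active or line_addr in self.historic
```

In this sketch a line stays visible to the filter for one wrap after it was recorded and is forgotten after the second wrap, which keeps the filter conservative without tracking exact residence.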
Abstract:
Disclosed is a method and apparatus for supplementing a branch target buffer (BTB) with a recent entry queue. The recent entry queue prevents the unnecessary removal of valuable BTB data of multiple entries for another entry. Additionally, the recent entry queue detects when the BTB's startup latency is preventing it from asynchronously aiding the microprocessor pipeline as designed, and can therefore delay the pipeline in the required situations so that the BTB startup latency can be overcome. Finally, the recent entry queue provides quick access to BTB entries that are accessed in a tight loop pattern, where the throughput of the standalone BTB is unable to keep up with the throughput of the microprocessor execution pipeline. Through the use of the recent entry queue, the modified BTB is capable of processing information at the rate of the execution pipeline, thereby accelerating the execution pipeline.
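One way to picture a recent entry queue sitting beside a BTB is sketched below. Everything here is an assumption for illustration: the class `BTBWithRecentQueue`, the direct-mapped indexing, the queue depth, and in particular the reading that the queue suppresses repeated installs of the same branch and serves fast lookups for tight loops. The pipeline-delay behavior mentioned in the abstract is not modeled.

```python
from collections import deque


class BTBWithRecentQueue:
    """Toy direct-mapped BTB with a small recent entry queue (hypothetical design)."""

    def __init__(self, btb_entries=4, queue_entries=2):
        self.btb = {}                               # index -> (branch_addr, target)
        self.btb_entries = btb_entries
        self.recent = deque(maxlen=queue_entries)   # most recently installed branches

    def install(self, branch_addr, target):
        # Assumed policy: skip the write if this branch was installed recently,
        # so a repeated install cannot displace other useful BTB entries.
        if any(addr == branch_addr for addr, _ in self.recent):
            return False
        self.btb[branch_addr % self.btb_entries] = (branch_addr, target)
        self.recent.append((branch_addr, target))
        return True

    def lookup(self, branch_addr):
        # Check the small queue first (the fast path for tight loops),
        # then fall back to the main BTB array.
        for addr, target in self.recent:
            if addr == branch_addr:
                return target
        entry = self.btb.get(branch_addr % self.btb_entries)
        if entry is not None and entry[0] == branch_addr:
            return entry[1]
        return None
```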
Abstract:
A memory system and method include a cache having a filtered portion and an unfiltered portion. The filtered portion is divided into block-sized components, and the unfiltered portion is divided into sub-block-sized components. Blocks evicted from the filtered portion have selected sub-blocks thereof cached in the unfiltered portion for servicing requests.
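The split between block-sized and sub-block-sized storage can be sketched as follows. The abstract does not say how sub-blocks are selected on eviction; this sketch assumes that the sub-blocks touched while the block was resident are the ones kept, and it assumes LRU replacement in both portions. The class `SplitCache` and its parameters are hypothetical.

```python
from collections import OrderedDict


class SplitCache:
    """Toy cache with a block-granular portion and a sub-block-granular portion."""

    def __init__(self, filtered_blocks=2, unfiltered_sub_blocks=4):
        self.filtered = OrderedDict()    # block_id -> set of touched sub-block indices
        self.unfiltered = OrderedDict()  # (block_id, sub_idx) -> True
        self.filtered_blocks = filtered_blocks
        self.unfiltered_sub_blocks = unfiltered_sub_blocks

    def access(self, block_id, sub_idx):
        if block_id in self.filtered:
            self.filtered.move_to_end(block_id)
            self.filtered[block_id].add(sub_idx)
            return "filtered"
        if (block_id, sub_idx) in self.unfiltered:
            self.unfiltered.move_to_end((block_id, sub_idx))
            return "unfiltered"
        self._fill(block_id, sub_idx)
        return "miss"

    def _fill(self, block_id, sub_idx):
        if len(self.filtered) >= self.filtered_blocks:
            # Evict the least recently used block and keep only the
            # sub-blocks that were touched (assumed selection rule).
            victim, touched = self.filtered.popitem(last=False)
            for idx in sorted(touched):
                self._keep_sub_block(victim, idx)
        self.filtered[block_id] = {sub_idx}

    def _keep_sub_block(self, block_id, sub_idx):
        if len(self.unfiltered) >= self.unfiltered_sub_blocks:
            self.unfiltered.popitem(last=False)
        self.unfiltered[(block_id, sub_idx)] = True
```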
Abstract:
A memory storage structure includes a memory storage device, and a first meta-structure having a first size and operating at a first speed. The first speed is faster than a second speed for storing meta-information based on information stored in a memory. A second meta-structure is hierarchically associated with the first meta-structure. The second meta-structure has a second size larger than the first size and operates at the second speed such that faster and more accurate prefetching is provided by coaction of the first and second meta-structures. A method is provided to assemble the meta-information in the first meta-structure, copy this information to the second meta-structure, and prefetch the stored information from the second meta-structure to the first meta-structure ahead of its use.
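The assemble, copy, and prefetch steps of the method can be sketched with two dictionaries standing in for the small fast meta-structure and the large slow one. The class `TwoLevelMeta`, the LRU trimming of the first level, and the copy-on-record policy are assumptions for illustration; the abstract does not fix when the copy happens or how the first level is replaced.

```python
from collections import OrderedDict


class TwoLevelMeta:
    """Toy two-level meta-structure (hypothetical names and policies)."""

    def __init__(self, l1_capacity=2):
        self.l1 = OrderedDict()  # first meta-structure: small and fast
        self.l2 = {}             # second meta-structure: larger and slower
        self.l1_capacity = l1_capacity

    def record(self, key, meta):
        # Assemble meta-information in the first structure and copy it
        # to the second (assumed here to happen immediately).
        self.l1[key] = meta
        self.l1.move_to_end(key)
        self.l2[key] = meta
        self._trim()

    def prefetch(self, key):
        # Bring meta-information back into the first structure ahead of use.
        if key in self.l1:
            return True
        if key in self.l2:
            self.l1[key] = self.l2[key]
            self._trim()
            return True
        return False

    def lookup_fast(self, key):
        # Only the first structure is consulted on the fast path.
        return self.l1.get(key)

    def _trim(self):
        while len(self.l1) > self.l1_capacity:
            self.l1.popitem(last=False)
```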