Abstract:
Self-aware, peer-to-peer cache transfers between local, shared cache memories in a multi-processor system is disclosed. A shared cache memory system is provided comprising local shared cache memories accessible by an associated central processing unit (CPU) and other CPUs in a peer-to-peer manner. When a CPU desires to request a cache transfer (e.g., in response to a cache eviction), the CPU acting as a master CPU issues a cache transfer request. In response, target CPUs issue snoop responses indicating their willingness to accept the cache transfer. The target CPUs also use the snoop responses to be self-aware of the willingness of other target CPUs to accept the cache transfer. The target CPUs willing to accept the cache transfer use a predefined target CPU selection scheme to determine its acceptance of the cache transfer. This can avoid a CPU making multiple requests to find a target CPU for a cache transfer.
Abstract:
Converting a stale cache memory unique request to a read unique snoop response in a multiple (multi-) central processing unit (CPU) processor is disclosed. The multi-CPU processor includes a plurality of CPUs that each have access to either private or shared cache memories in a cache memory system. Multiple CPUs issuing unique requests to write data to a same coherence granule in a cache memory causes one unique request for a requested CPU to be serviced or "win" to allow that CPU to obtain the coherence granule in a unique state, while the other unsuccessful unique requests become stale. To avoid retried unique requests being reordered behind other pending, younger requests which would lead to lack of forward progress due to starvation or livelock, the snooped stale unique requests are converted to read unique snoop responses so that their request order can be maintained in the cache memory system.
Abstract:
The disclosure relates to filtering snoops in coherent multiprocessor systems. For example, in response to a request to update a target memory location at a Level-2 (L2) cache shared among multiple local processing units each having a Level-1 (L1) cache, a lookup based on the target memory location may be performed in a snoop filter that tracks entries in the LI caches. If the lookup misses the snoop filter and the snoop filter lacks space to store a new entry, a victim entry to evict from the snoop filter may be selected and a request to invalidate every cache line that maps to the victim entry may be sent to at least one of the processing units with one or more cache lines that map to the victim entry. The victim entry may then be replaced in the snoop filter with the new entry corresponding to the target memory location.