Method and apparatus for efficient ordered stores over an interconnection network
    1.
    Invention Application (In Force)

    Publication No.: US20050091121A1

    Publication Date: 2005-04-28

    Application No.: US10691176

    Filing Date: 2003-10-22

    IPC Classes: G06F12/08 G06F17/60

    CPC Classes: G06F12/0813 G06Q30/0601

    Abstract: A physically distributed cache memory system includes an interconnection network, first level cache memory slices, and second level cache memory slices. The first level cache memory slices are coupled to the interconnection network to generate tagged ordered store requests. Each tagged ordered store request has a tag including requester identification and a store sequence token. The second level cache memory slices are coupled to the interconnection network to execute ordered store requests in-order across the physically distributed cache memory system in response to each tag of the tagged ordered store requests.

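    The tag here pairs a requester ID with a per-requester store sequence token. The following minimal C sketch shows one way a receiving slice could use such a token to hold back stores that overtook earlier ones on the interconnection network; every identifier (store_req, l2_slice, try_commit) is hypothetical, and the sketch models a single slice rather than the patent's full cross-slice ordering.

        #include <stdbool.h>
        #include <stdint.h>

        #define MAX_REQUESTERS 64

        typedef struct {
            uint16_t requester_id; /* which first-level slice issued the store */
            uint32_t seq_token;    /* per-requester store sequence token */
            uint64_t addr;
            uint64_t data;
        } store_req;

        typedef struct {
            uint32_t expected[MAX_REQUESTERS]; /* next token to commit, per requester */
        } l2_slice;

        /* Commit only the store that is next in its requester's program
           order; anything that arrived early must be buffered and retried. */
        bool try_commit(l2_slice *s, const store_req *r)
        {
            if (r->seq_token != s->expected[r->requester_id])
                return false;
            /* ... write r->data into the cache data array at r->addr ... */
            s->expected[r->requester_id]++;
            return true;
        }

        int main(void)
        {
            l2_slice slice = {0};
            store_req second = { .requester_id = 3, .seq_token = 1 };
            store_req first  = { .requester_id = 3, .seq_token = 0 };
            bool a = try_commit(&slice, &second); /* refused: out of order */
            bool b = try_commit(&slice, &first);  /* commits token 0 */
            bool c = try_commit(&slice, &second); /* now commits token 1 */
            return (!a && b && c) ? 0 : 1;
        }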

    Per-set relaxation of cache inclusion
    2.
    Invention Application (Pending, Published)

    Publication No.: US20070143550A1

    Publication Date: 2007-06-21

    Application No.: US11313114

    Filing Date: 2005-12-19

    IPC Classes: G06F13/28

    CPC Classes: G06F12/0811 G06F12/084

    Abstract: A multi-core processor includes a plurality of processors and a shared cache. Cache control logic implements an inclusive cache scheme among the shared cache and the local caches for the processors. Counters are maintained to track instances, per set, when a processor chooses to delay eviction from the local cache. While the counter indicates that one or more delayed evictions are pending for a set, the cache control logic treats the set as non-inclusive, broadcasting foreign snoops to all of the local caches, regardless of whether the snoop hits in the shared cache. Other embodiments are also described and claimed.

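    A minimal C sketch of the per-set bookkeeping as the abstract describes it: a counter per shared-cache set tracks pending delayed evictions, and while it is nonzero the set is snooped as if non-inclusive. All identifiers are hypothetical, and forwarding on a shared-cache hit is simplified to a broadcast.

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define NUM_SETS  1024
        #define NUM_CORES 8

        /* pending delayed-eviction count, one counter per set */
        static uint16_t delayed[NUM_SETS];

        void eviction_delayed(uint32_t set)  { delayed[set]++; }
        void eviction_finished(uint32_t set) { delayed[set]--; }

        /* While any delayed eviction is pending for the set, inclusion
           cannot be trusted, so a foreign snoop is broadcast to every
           local cache even when it misses in the shared cache. */
        void handle_foreign_snoop(uint32_t set, bool hit_in_shared)
        {
            if (hit_in_shared || delayed[set] > 0) {
                for (int core = 0; core < NUM_CORES; core++)
                    printf("snoop -> local cache %d\n", core);
            } /* else: inclusion guarantees no local cache holds the line */
        }

        int main(void)
        {
            eviction_delayed(7);            /* a core defers an eviction in set 7 */
            handle_foreign_snoop(7, false); /* broadcast despite the miss */
            eviction_finished(7);
            handle_foreign_snoop(7, false); /* filtered again */
            return 0;
        }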

    Caching in multicore and multiprocessor architectures
    6.
    Invention Grant (In Force)

    Publication No.: US08560780B1

    Publication Date: 2013-10-15

    Application No.: US13553884

    Filing Date: 2012-07-20

    IPC Classes: G06F12/00

    Abstract: A multicore processor comprises a plurality of cache memories, and a plurality of processor cores, each associated with one of the cache memories. Each of at least some of the cache memories is configured to maintain at least a portion of the cache memory in which each cache line is dynamically managed as either local to the associated processor core or shared among multiple processor cores.

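    A minimal C sketch of per-line dynamic management. The promotion rule used here, promoting a line from local to shared the first time a non-home core touches it, is an assumption for illustration; the abstract does not specify the policy, and every name is hypothetical.

        #include <stdbool.h>
        #include <stdint.h>

        typedef enum { LINE_LOCAL, LINE_SHARED } line_mode;

        typedef struct {
            uint64_t  tag;
            bool      valid;
            uint16_t  home_core; /* core associated with this cache slice */
            line_mode mode;      /* managed dynamically, per cache line */
        } cache_line;

        /* Data touched only by its home core stays LOCAL (private,
           low-latency); once another core touches it, promote it to
           SHARED so multiple cores may use the copy coherently. */
        void on_access(cache_line *ln, uint16_t core)
        {
            if (ln->valid && ln->mode == LINE_LOCAL && core != ln->home_core)
                ln->mode = LINE_SHARED;
        }

        int main(void)
        {
            cache_line ln = { .tag = 0x40, .valid = true,
                              .home_core = 0, .mode = LINE_LOCAL };
            on_access(&ln, 0); /* home core: stays local */
            on_access(&ln, 2); /* foreign core: becomes shared */
            return ln.mode == LINE_SHARED ? 0 : 1;
        }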

    Predictive early write-back of owned cache blocks in a shared memory computer system
    8.
    Invention Grant (In Force)

    Publication No.: US07624236B2

    Publication Date: 2009-11-24

    Application No.: US11023882

    Filing Date: 2004-12-27

    IPC Classes: G06F12/00 G06F13/00 G06F13/28

    Abstract: A method for predicting early write-back of owned cache blocks in a shared memory computer system. The invention enables the system to predict which written blocks are likely to be requested by another CPU; the owning CPU then writes those blocks back to memory as soon as possible after updating the data in the block. When another processor requests the data, this reduces the latency to obtain it, lowers synchronization overhead, and increases the throughput of parallel programs.

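    A minimal C sketch of the early write-back path, with the prediction collapsed into a single flag; every identifier is hypothetical.

        #include <stdbool.h>
        #include <stdint.h>

        typedef struct {
            uint64_t addr;
            uint64_t data;
            bool     dirty;
            bool     predicted_shared; /* predictor: another CPU will likely ask for it */
        } owned_block;

        /* stand-in for the path that writes the block back to memory */
        static void write_back(owned_block *b) { b->dirty = false; }

        /* After updating an owned block, push it to memory right away
           when it is predicted to be requested by another CPU, instead
           of waiting for an eviction or a coherence request. */
        void store_to_owned_block(owned_block *b, uint64_t data)
        {
            b->data  = data;
            b->dirty = true;
            if (b->predicted_shared)
                write_back(b); /* early write-back trims the reader's latency */
        }

        int main(void)
        {
            owned_block b = { .addr = 0x1000, .predicted_shared = true };
            store_to_owned_block(&b, 42);
            return b.dirty ? 1 : 0; /* 0: already written back early */
        }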