Victim prefetching in a cache hierarchy
    1.
    Invention application
    Victim prefetching in a cache hierarchy (status: lapsed)

    Publication No.: US20060106991A1

    Publication date: 2006-05-18

    Application No.: US10989997

    Filing date: 2004-11-16

    IPC classification: G06F12/00

    Abstract: We present a “directory extension” (hereinafter “DX”) to aid in prefetching between proximate levels in a cache hierarchy. The DX may maintain (1) a list of pages which contains recently ejected lines from a given level in the cache hierarchy, and (2) for each page in this list, the identity of a set of ejected lines, provided these lines are prefetchable from, for example, the next level of the cache hierarchy. Given a cache fault to a line within a page in this list, other lines from this page may then be prefetched without the substantial overhead to directory lookup which would otherwise be required.

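The bookkeeping described in this abstract can be sketched as follows. This is a minimal illustration, not the patented design: the class name `DirectoryExtension`, the 32-lines-per-page geometry, the LRU replacement of pages, and the choice to drop a page entry once it has triggered a prefetch are all assumptions made for the example.

```python
from collections import OrderedDict

LINES_PER_PAGE = 32  # assumed geometry; the abstract gives no sizes


class DirectoryExtension:
    """Hypothetical DX: tracks recently ejected lines, grouped by page."""

    def __init__(self, max_pages=4):
        self.max_pages = max_pages
        # page number -> set of line offsets ejected from this cache level
        self.pages = OrderedDict()

    def record_eviction(self, line_addr, prefetchable=True):
        """Note an ejected line, but only if the next level can supply it."""
        if not prefetchable:
            return
        page, offset = divmod(line_addr, LINES_PER_PAGE)
        self.pages.setdefault(page, set()).add(offset)
        self.pages.move_to_end(page)          # mark page as most recent
        if len(self.pages) > self.max_pages:  # assumed LRU replacement
            self.pages.popitem(last=False)

    def on_miss(self, line_addr):
        """Return other ejected lines of the same page as prefetch candidates."""
        page, offset = divmod(line_addr, LINES_PER_PAGE)
        ejected = self.pages.pop(page, None)  # assumed: entry is consumed
        if ejected is None:
            return []
        ejected.discard(offset)  # the missed line itself is fetched on demand
        return sorted(page * LINES_PER_PAGE + o for o in ejected)


dx = DirectoryExtension(max_pages=2)
for addr in (64, 65, 70):      # three lines of page 2 are ejected
    dx.record_eviction(addr)
print(dx.on_miss(65))          # [64, 70]
```

The point of the structure is that a single lookup keyed by page yields all prefetch candidates, so the miss path does not have to probe the directory once per neighbouring line.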

    Data processing system and method for efficient communication utilizing an In coherency state
    2.
    Invention application
    Data processing system and method for efficient communication utilizing an In coherency state (status: in force)

    Publication No.: US20060179252A1

    Publication date: 2006-08-10

    Application No.: US11055305

    Filing date: 2005-02-10

    IPC classification: G06F13/28

    Abstract: A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory, and the second coherency domain includes a coherent second cache memory. The first cache memory within the first coherency domain of the data processing system holds a memory block in a storage location associated with an address tag and a coherency state field. The coherency state field is set to a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is likely cached only within the first coherency domain.

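The three properties that this abstract attaches to the coherency state can be written down as a small record. This is only a sketch of what the state encodes: the names `StateMeaning`, `IN_STATE`, `can_supply_data` and `carries_scope_hint` are hypothetical, and the abstract does not say how the hint is consumed by the rest of the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateMeaning:
    """What a coherency state says about one cache storage location."""
    tag_valid: bool
    data_valid: bool
    likely_local_domain_only: bool


# The state described in the abstract: valid tag, no valid data,
# block likely cached only within the first coherency domain.
IN_STATE = StateMeaning(tag_valid=True, data_valid=False,
                        likely_local_domain_only=True)


def can_supply_data(tag_matches, meaning):
    """A lookup can return data only if the tag and the data are both valid."""
    return tag_matches and meaning.tag_valid and meaning.data_valid


def carries_scope_hint(tag_matches, meaning):
    """A matching valid tag can still carry a location hint without data."""
    return (tag_matches and meaning.tag_valid
            and meaning.likely_local_domain_only)


print(can_supply_data(True, IN_STATE))     # False
print(carries_scope_hint(True, IN_STATE))  # True
```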

    Method, apparatus, and computer program product for a cache coherency protocol state that predicts locations of shared memory blocks
    5.
    Invention application
    Method, apparatus, and computer program product for a cache coherency protocol state that predicts locations of shared memory blocks (status: in force)

    Publication No.: US20070022256A1

    Publication date: 2007-01-25

    Application No.: US11184315

    Filing date: 2005-07-19

    IPC classification: G06F13/28

    Abstract: A method, apparatus, and computer program product are disclosed for reducing the number of unnecessarily broadcast local requests to reduce the latency to access data from remote nodes in an SMP computer system. A shared invalid cache coherency protocol state is defined that predicts whether a memory read request to read data in a shared cache line can be satisfied within a local node. When a cache line is in the shared invalid state, a valid copy of the data is predicted to be located in the local node. When a cache line is in the invalid state and not in the shared invalid state, a valid copy of the data is predicted to be located in one of the remote nodes. Memory read requests to read data in a cache line that is not currently in the shared invalid state are broadcast first to remote nodes. Memory read requests to read data in a cache line that is currently in the shared invalid state are broadcast first to a local node, and in response to being unable to satisfy the memory read requests within the local node, the memory read requests are broadcast to the remote nodes.

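The broadcast ordering stated in this abstract reduces to a small decision function. The sketch below is an illustration under assumptions: the enum `State`, the function `read_broadcast_plan`, and the scope labels `"local"` and `"remote"` are invented for the example, and what happens after a remote-first broadcast is left out because the abstract does not describe it.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    SHARED_INVALID = "shared invalid"


def read_broadcast_plan(state, satisfied_locally):
    """Return the ordered broadcast scopes for a memory read request.

    satisfied_locally: whether the local node turns out to hold a valid copy.
    """
    if state is State.SHARED_INVALID:
        # Prediction: a valid copy is in the local node, so try it first.
        if satisfied_locally:
            return ["local"]
        # Misprediction: fall back to the remote nodes.
        return ["local", "remote"]
    # Not shared invalid: the copy is predicted to be remote, go there first.
    return ["remote"]


print(read_broadcast_plan(State.SHARED_INVALID, True))   # ['local']
print(read_broadcast_plan(State.SHARED_INVALID, False))  # ['local', 'remote']
print(read_broadcast_plan(State.INVALID, True))          # ['remote']
```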

    Method, apparatus, and computer program product for a cache coherency protocol state that predicts locations of modified memory blocks
    6.
    Invention application
    Method, apparatus, and computer program product for a cache coherency protocol state that predicts locations of modified memory blocks (status: lapsed)

    Publication No.: US20070022255A1

    Publication date: 2007-01-25

    Application No.: US11184314

    Filing date: 2005-07-19

    IPC classification: G06F13/28

    Abstract: A method, apparatus, and computer program product are disclosed for reducing the number of unnecessarily broadcast remote requests to reduce the latency to access data from local nodes and to reduce global traffic in an SMP computer system. A modified invalid cache coherency protocol state is defined that predicts whether a memory access request to read or write data in a cache line can be satisfied within a local node. When a cache line is in the modified invalid state, the only valid copies of the data are predicted to be located in the local node. When a cache line is in the invalid state and not in the modified invalid state, a valid copy of the data is predicted to be located in one of the remote nodes. Memory access requests to read exclusive or write data in a cache line that is not currently in the modified invalid state are broadcast first to all nodes. Memory access requests to read exclusive or write data in a cache line that is currently in the modified invalid state are broadcast first to a local node, and in response to being unable to satisfy the memory access requests within the local node, the memory access requests are broadcast to the remote nodes.

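To make the traffic argument in this abstract concrete, the sketch below routes read-exclusive or write requests according to the stated rule and counts how many of them end up leaving the local node. The function names, the scope labels, and the three-request trace are hypothetical and serve only as an illustration.

```python
def exclusive_access_scopes(in_modified_invalid, satisfied_locally):
    """Ordered broadcast scopes for a read-exclusive or write request."""
    if in_modified_invalid:
        # Prediction: the only valid copies are in the local node.
        if satisfied_locally:
            return ["local"]
        return ["local", "remote"]  # misprediction, widen the broadcast
    # Not modified invalid: broadcast to all nodes straight away.
    return ["all"]


def count_global_broadcasts(requests):
    """Count requests that had to go beyond the local node."""
    return sum(
        1
        for in_mi, ok in requests
        if exclusive_access_scopes(in_mi, ok)[-1] != "local"
    )


# Assumed trace: (line is in modified invalid state, local node can satisfy)
trace = [(True, True), (True, False), (False, True)]
print(count_global_broadcasts(trace))  # 2
```

Only the first request in the trace stays inside the local node, which is the case where the modified invalid prediction is correct and global traffic is avoided.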