LOW-COST CACHE COHERENCY FOR ACCELERATORS
    1.
    Patent application
    LOW-COST CACHE COHERENCY FOR ACCELERATORS (status: lapsed)

    Publication No.: US20110029738A1

    Publication date: 2011-02-03

    Application No.: US12902045

    Filing date: 2010-10-11

    IPC classification: G06F12/00

    CPC classification: G06F12/0817 G06F2212/1016

    Abstract: Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.
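
The directory filtering described in this abstract can be illustrated with a minimal sketch. It is not taken from the patent: the class name `NodeDirectory`, the 128-byte block size, and the message counter are assumptions made only for illustration. The sketch shows a node sending coherence traffic to other nodes only when its directory says the block may be cached elsewhere.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 128  # bytes per tracked block; an assumed value, not from the abstract


@dataclass
class NodeDirectory:
    """Hypothetical per-node directory of blocks that may be cached at other nodes."""

    remote_cached: set = field(default_factory=set)
    messages_sent: int = 0

    def block_of(self, address: int) -> int:
        # Map a byte address to the index of the block that contains it.
        return address // BLOCK_SIZE

    def record_remote_access(self, address: int) -> None:
        # Another node accessed this block, so it may now hold a cached copy.
        self.remote_cached.add(self.block_of(address))

    def local_access(self, address: int) -> bool:
        """Return True if the access had to be transmitted to other nodes."""
        if self.block_of(address) in self.remote_cached:
            self.messages_sent += 1
            return True
        # The block was never cached outside this node: no inter-node traffic.
        return False


directory = NodeDirectory()
directory.record_remote_access(0x1000)
forwarded = directory.local_access(0x1040)    # same 128-byte block as 0x1000
kept_local = directory.local_access(0x2000)   # block never cached remotely
```

In this sketch only the first local access generates an inter-node message; the second is resolved inside the node, which is the bandwidth saving the abstract refers to.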

    Low-cost cache coherency for accelerators
    2.
    Granted patent
    Low-cost cache coherency for accelerators (status: in force)

    Publication No.: US07814279B2

    Publication date: 2010-10-12

    Application No.: US11388013

    Filing date: 2006-03-23

    IPC classification: G06F13/00 G06F12/08

    CPC classification: G06F12/0817 G06F2212/1016

    Abstract: Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.

    Low-cost cache coherency for accelerators
    4.
    Granted patent
    Low-cost cache coherency for accelerators (status: lapsed)

    Publication No.: US08103835B2

    Publication date: 2012-01-24

    Application No.: US12902045

    Filing date: 2010-10-11

    IPC classification: G06F13/00 G06F13/28

    CPC classification: G06F12/0817 G06F2212/1016

    Abstract: Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.

    METHODS AND APPARATUS FOR TESTING A LINK BETWEEN CHIPS
    5.
    Patent application
    METHODS AND APPARATUS FOR TESTING A LINK BETWEEN CHIPS (status: under examination, published)

    Publication No.: US20080133169A1

    Publication date: 2008-06-05

    Application No.: US12016935

    Filing date: 2008-01-18

    IPC classification: G06F19/00

    CPC classification: G01R31/31717

    Abstract: In a first aspect, a first method of testing a link between a first chip and a second chip is provided. The first method includes the steps of, while operating in a test mode, (1) transmitting test data of sufficient length to enable exercising of worst case transitions from the first chip to the second chip via the link; and (2) performing cyclic redundancy checking (CRC) on the test data to test the link. Numerous other aspects are provided.
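
The two steps in this abstract can be sketched in software as follows. This is only an illustration under stated assumptions: the alternating 0x55/0xAA pattern stands in for "worst case transitions", the link is modeled as a plain function from sent bytes to received bytes, and CRC-32 from Python's `zlib` stands in for whatever CRC the hardware would use. None of the function names come from the patent.

```python
import zlib
from typing import Callable


def make_test_pattern(length: int) -> bytes:
    # Alternate 0x55 and 0xAA so that every bit toggles between consecutive bytes.
    # Treating this as the worst-case transition pattern is an assumption.
    return bytes(0x55 if i % 2 == 0 else 0xAA for i in range(length))


def link_passes(link: Callable[[bytes], bytes], length: int = 4096) -> bool:
    """Send test data over the modeled link and compare CRCs at both ends."""
    sent = make_test_pattern(length)
    expected_crc = zlib.crc32(sent)      # CRC computed on the transmitting side
    received = link(sent)                # data as seen by the second chip
    return zlib.crc32(received) == expected_crc


def good_link(data: bytes) -> bytes:
    # Hypothetical healthy link: delivers the data unchanged.
    return data


def faulty_link(data: bytes) -> bytes:
    # Hypothetical faulty link: flips one bit in the eleventh byte.
    corrupted = bytearray(data)
    corrupted[10] ^= 0x01
    return bytes(corrupted)
```

Calling `link_passes(good_link)` reports a healthy link, while `link_passes(faulty_link)` reports a failure, because a CRC-32 always detects a single flipped bit.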

    Directory for multi-node coherent bus
    8.
    Granted patent
    Directory for multi-node coherent bus (status: in force)

    Publication No.: US07725660B2

    Publication date: 2010-05-25

    Application No.: US11828448

    Filing date: 2007-07-26

    IPC classification: G06F12/00

    CPC classification: G06F12/0822

    Abstract: A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A local node makes a determination whether a request is a local or system request. If the request is a local request, a look-up of a directory in the local node is performed. If an entry in the directory of the local node indicates that data in the request does not have a remote owner and that the request does not have a remote destination, the coherency of the data is resolved on the local node, and a transfer of the data specified in the request is performed if required and if the request is a local request. If the entry indicates that the data has a remote owner or that the request has a remote destination, the request is forwarded to all remote nodes in the multi-node system.
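
The decision flow in this abstract can be summarized in a short sketch. The names `Request`, `Action`, and `handle_request` are hypothetical, the directory is reduced to a set of addresses that have a remote owner, and the handling of system requests is left out because the abstract does not describe it.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    RESOLVE_ON_LOCAL_NODE = auto()
    FORWARD_TO_ALL_REMOTE_NODES = auto()
    NOT_COVERED = auto()  # system requests: not described in the abstract


@dataclass
class Request:
    address: int
    is_local: bool
    has_remote_destination: bool = False


def handle_request(request: Request, remote_owned: set) -> Action:
    """Decide where coherency for a request is resolved.

    `remote_owned` is an assumed stand-in for the local node's directory:
    the set of addresses whose data currently has a remote owner.
    """
    if not request.is_local:
        return Action.NOT_COVERED
    has_remote_owner = request.address in remote_owned  # directory look-up
    if has_remote_owner or request.has_remote_destination:
        return Action.FORWARD_TO_ALL_REMOTE_NODES
    return Action.RESOLVE_ON_LOCAL_NODE
```

For example, with `remote_owned = {0x40}`, a local request for address `0x80` is resolved on the local node, while a local request for `0x40` is forwarded to all remote nodes.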

    System and method for flexible multiple protocols
    9.
    Granted patent
    System and method for flexible multiple protocols (status: in force)

    Publication No.: US07647433B2

    Publication date: 2010-01-12

    Application No.: US11844336

    Filing date: 2007-08-23

    IPC classification: G06F3/00

    CPC classification: G06F13/385

    Abstract: A system and method for flexible multiple protocols are presented. A device's logical layer may be dynamically configured on a per interface basis to communicate with external devices in a coherent or a non-coherent mode. In coherent mode, commands such as coherency protocol, system commands, and snoop response pass from the device's internal system bus to an external device, thereby creating a logical extension of the device's internal system bus. In non-coherent mode, the input-output bus unit receives commands from the internal system bus and generates non-coherent input-output commands, which are eventually received by an external device.
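
A rough sketch of per-interface mode selection is given below. It is purely illustrative: the class `LogicalLayer`, the interface names, and the `io_` prefix used to mark a generated input-output command are all assumptions, not details from the patent.

```python
from enum import Enum


class Mode(Enum):
    COHERENT = "coherent"
    NON_COHERENT = "non-coherent"


class LogicalLayer:
    """Hypothetical logical layer with one mode setting per interface."""

    def __init__(self, interface_modes: dict):
        self.interface_modes = dict(interface_modes)

    def set_mode(self, interface: str, mode: Mode) -> None:
        # Modes can be changed at run time, one interface at a time.
        self.interface_modes[interface] = mode

    def send(self, interface: str, command: str) -> dict:
        mode = self.interface_modes[interface]
        if mode is Mode.COHERENT:
            # Coherent mode: the internal bus command passes through unchanged.
            return {"interface": interface, "kind": "bus", "command": command}
        # Non-coherent mode: generate an input-output command instead.
        # The "io_" prefix is an invented placeholder for that translation.
        return {"interface": interface, "kind": "io", "command": f"io_{command}"}


layer = LogicalLayer({"if0": Mode.COHERENT, "if1": Mode.NON_COHERENT})
coherent_msg = layer.send("if0", "snoop_response")
io_msg = layer.send("if1", "read")
```

Here the same device forwards a bus command unchanged on `if0` and emits a generated input-output command on `if1`, which mirrors the per-interface configuration the abstract describes.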

    Methods and apparatus for reducing command processing latency while maintaining coherence
    10.
    Granted patent
    Methods and apparatus for reducing command processing latency while maintaining coherence (status: lapsed)

    Publication No.: US08112590B2

    Publication date: 2012-02-07

    Application No.: US11846697

    Filing date: 2007-08-29

    IPC classification: G06F12/00 G06F13/00 G06F13/28

    CPC classification: G06F12/0804 G06F12/0831

    Abstract: In a first aspect, a first method of reducing command processing latency while maintaining memory coherence is provided. The first method includes the steps of (1) providing a memory map including memory addresses available to a system; and (2) arranging the memory addresses into a plurality of groups. At least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence. Numerous other aspects are provided.
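
The grouping idea in this abstract can be sketched as a memory map whose groups carry a flag saying whether permission from the other bus units is needed. The address ranges, the `AddressGroup` type, and the bus unit names below are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressGroup:
    start: int                     # first address in the group (inclusive)
    end: int                       # one past the last address (exclusive)
    needs_global_permission: bool  # must all other bus units grant access?


# Assumed example map: the lower half needs permission, the upper half does not.
MEMORY_MAP = [
    AddressGroup(0x0000, 0x8000, needs_global_permission=True),
    AddressGroup(0x8000, 0x10000, needs_global_permission=False),
]


def permissions_required(address: int, requester: str, bus_units: list) -> list:
    """Return the bus units whose permission is needed before the access."""
    for group in MEMORY_MAP:
        if group.start <= address < group.end:
            if not group.needs_global_permission:
                return []  # fast path: no round of permission requests
            return [unit for unit in bus_units if unit != requester]
    raise ValueError(f"address {address:#x} is not in the memory map")


units = ["cpu0", "cpu1", "io"]
slow_path = permissions_required(0x0100, "cpu0", units)
fast_path = permissions_required(0x9000, "cpu0", units)
```

In this sketch the access to `0x9000` skips the permission round entirely, which is where the latency reduction would come from.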
