FAULT-TOLERANT CACHE COHERENCE OVER A LOSSY NETWORK

    Publication No.: US20190205252A1

    Publication date: 2019-07-04

    Application No.: US15858787

    Filing date: 2017-12-29

    Abstract: A cache coherence system manages both internode and intranode cache coherence in a cluster of nodes. Each node in the cluster is either a single processor or a collection of processors running an intranode coherence protocol among themselves. A node comprises a plurality of coherence ordering units (COUs), which are hardware circuits configured to manage intranode coherence of caches within the node and/or internode coherence with caches on other nodes in the cluster. Each node contains one or more directories that track the state of cache line entries managed by that node. Each node may also contain one or more scoreboards for managing the status of ongoing transactions. The internode cache coherence protocol implemented in the COUs may be used to detect and resolve communication errors, such as dropped message packets between nodes, late message delivery at a node, or node failure. In addition, a transport layer manages communication between the nodes in the cluster and can also be used to detect and resolve communication errors.
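The abstract does not say how a scoreboard detects dropped or late messages. One common software analogue, offered here only as a rough sketch and not as the patented hardware design, is to give every in-flight transaction a deadline, retransmit when the deadline passes, and ignore replies whose transaction is no longer pending. The class name `Scoreboard`, its methods, and the `timeout` and `max_retries` parameters are all hypothetical.

```python
class Scoreboard:
    """Tracks in-flight transactions (hypothetical sketch, not the patent's design)."""

    def __init__(self, timeout, max_retries):
        self.timeout = timeout          # time allowed before a reply counts as lost
        self.max_retries = max_retries  # retransmissions before giving up
        self.pending = {}               # txn_id -> (deadline, retries so far)
        self.next_id = 0

    def start(self, now):
        """Record a new outgoing request and return its transaction id."""
        txn_id = self.next_id
        self.next_id += 1
        self.pending[txn_id] = (now + self.timeout, 0)
        return txn_id

    def on_reply(self, txn_id):
        """Return True if the reply completes a pending transaction.

        A reply for an unknown id is late or duplicated and is ignored.
        """
        return self.pending.pop(txn_id, None) is not None

    def tick(self, now):
        """Return (ids to retransmit, ids that failed) as of time `now`."""
        retry, failed = [], []
        for txn_id, (deadline, retries) in list(self.pending.items()):
            if now < deadline:
                continue  # still within its deadline
            if retries < self.max_retries:
                # Assume the message was dropped: extend the deadline and resend.
                self.pending[txn_id] = (now + self.timeout, retries + 1)
                retry.append(txn_id)
            else:
                # Retries exhausted: treat the peer as unreachable or failed.
                del self.pending[txn_id]
                failed.append(txn_id)
        return retry, failed
```

Time is passed in explicitly so the sketch stays deterministic; a caller would invoke `tick` periodically and resend whatever ids it returns for retransmission.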

    MEMORY SHARING ACROSS DISTRIBUTED NODES
    Invention patent application; legal status: in force

    Publication No.: US20140279894A1

    Publication date: 2014-09-18

    Application No.: US13828983

    Filing date: 2013-03-14

    CPC classification number: G06F17/30994

    Abstract: A method and apparatus are disclosed for enabling nodes in a distributed system to share one or more memory portions. A home node makes a portion of its main memory available for sharing, and one or more sharer nodes mirror that shared portion of the home node's main memory in their own main memory. To maintain memory coherency, a memory coherence protocol is implemented. Under this protocol, a special data value is used to indicate that data in a mirrored memory location is not valid. This enables a sharer node to know when to obtain valid data from the home node. With this protocol, valid data is obtained from the home node and updates are propagated to the home node. Thus, no “dirty” data is transferred between sharer nodes. Consequently, the failure of one node will not cause the failure of another node or the failure of the entire system.
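The protocol in this abstract can be illustrated with a small in-process model. This is a minimal sketch under stated assumptions, not the patented implementation: `INVALID` is a hypothetical stand-in for the "special data value", the class and method names are invented for illustration, and having the home node invalidate the other sharers' mirrors on a write is an assumption the abstract does not spell out.

```python
INVALID = object()  # hypothetical sentinel playing the role of the "special data value"


class HomeNode:
    """Owns the shared memory portion (illustrative model only)."""

    def __init__(self, size):
        self.memory = [0] * size
        self.sharers = []

    def register(self, sharer):
        self.sharers.append(sharer)

    def read(self, addr):
        return self.memory[addr]

    def write(self, addr, value, source=None):
        self.memory[addr] = value
        # Assumption: the home node marks every other mirror as invalid,
        # so stale copies are re-fetched from home on the next load.
        for sharer in self.sharers:
            if sharer is not source:
                sharer.mirror[addr] = INVALID


class SharerNode:
    """Mirrors the home node's shared portion in its own memory."""

    def __init__(self, home, size):
        self.home = home
        self.mirror = [INVALID] * size  # nothing is valid until fetched
        home.register(self)

    def load(self, addr):
        # The sentinel tells the sharer it must obtain valid data from home.
        if self.mirror[addr] is INVALID:
            self.mirror[addr] = self.home.read(addr)
        return self.mirror[addr]

    def store(self, addr, value):
        # Updates go to the home node, never directly to another sharer.
        self.home.write(addr, value, source=self)
        self.mirror[addr] = value
```

For example, if sharers `a` and `b` have both loaded address 0 and `a` then calls `a.store(0, 7)`, the mirror entry in `b` becomes `INVALID` and the next `b.load(0)` fetches 7 from the home node. Sharers only ever talk to the home node, which mirrors the abstract's point that no "dirty" data moves between sharers.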

