Memory sharing across distributed nodes

    Publication No.: US09679084B2

    Publication Date: 2017-06-13

    Application No.: US13828983

    Filing Date: 2013-03-14

    CPC classification number: G06F17/30994

    Abstract: A method and apparatus are disclosed for enabling nodes in a distributed system to share one or more memory portions. A home node makes a portion of its main memory available for sharing, and one or more sharer nodes mirror that shared portion of the home node's main memory in their own main memory. To maintain memory coherency, a memory coherence protocol is implemented. Under this protocol, a special data value is used to indicate that data in a mirrored memory location is not valid. This enables a sharer node to know when to obtain valid data from a home node. With this protocol, valid data is obtained from the home node and updates are propagated to the home node. Thus, no “dirty” data is transferred between sharer nodes. Consequently, the failure of one node will not cause the failure of another node or of the entire system.
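
    The invalid-marker scheme described in the abstract can be illustrated with a minimal simulation (a sketch only; the patent concerns hardware/OS-level main memory, and every class and method name below is hypothetical):

```python
# Minimal sketch of the sharer/home protocol from the abstract.
# A reserved sentinel marks a mirrored location as invalid, so a
# sharer knows it must fetch current data from the home node.
INVALID = object()  # stands in for the patent's "special data value"

class HomeNode:
    def __init__(self, shared):
        self.shared = dict(shared)  # the shared portion of main memory

    def read(self, addr):
        return self.shared[addr]

    def write(self, addr, value):
        self.shared[addr] = value

class SharerNode:
    def __init__(self, home):
        self.home = home
        # The mirror starts out entirely invalid.
        self.mirror = {addr: INVALID for addr in home.shared}

    def load(self, addr):
        if self.mirror[addr] is INVALID:
            # Valid data always comes from the home node...
            self.mirror[addr] = self.home.read(addr)
        return self.mirror[addr]

    def store(self, addr, value):
        # ...and updates are propagated to the home node, so no
        # "dirty" data ever moves between sharer nodes.
        self.home.write(addr, value)
        self.mirror[addr] = value

home = HomeNode({0x10: 42})
a, b = SharerNode(home), SharerNode(home)
a.store(0x10, 7)     # a's update reaches the home node immediately
print(b.load(0x10))  # b fetches from home, never from a -> 7
```

    Because b's mirrored copy is marked invalid until it loads from the home node, a crash of sharer a cannot strand dirty data that b depends on, matching the failure-isolation claim in the abstract.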

    LOAD-MONITOR MWAIT
    13.
    Invention Application

    Publication No.: US20160098274A1

    Publication Date: 2016-04-07

    Application No.: US14967954

    Filing Date: 2015-12-14

    Abstract: Techniques are disclosed relating to suspending execution of a processor thread while monitoring for a write to a specified memory location. An execution subsystem may be configured to perform a load instruction that causes the processor to retrieve data from a specified memory location and atomically begin monitoring for a write to the specified location. The load instruction may be a load-monitor instruction. The execution subsystem may be further configured to perform a wait instruction that causes the processor to suspend execution of a processor thread during at least a portion of an interval specified by the wait instruction and to resume execution of the processor thread at the end of the interval. The wait instruction may be a monitor-wait instruction. The processor may be further configured to resume execution of the processor thread in response to detecting a write to a memory location specified by a previous monitor instruction.
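
    The load-monitor / monitor-wait pairing can be approximated in software with a condition variable (a sketch under stated assumptions: the patent describes processor instructions, not library calls, and all names here are invented):

```python
import threading

class MonitoredLocation:
    """Emulates a memory word a thread can monitor for writes."""
    def __init__(self, value=0):
        self.value = value
        self._cond = threading.Condition()
        self._written = False

    def load_monitor(self):
        # Like the load-monitor instruction: read the value and
        # atomically arm monitoring for a subsequent write.
        with self._cond:
            self._written = False
            return self.value

    def monitor_wait(self, interval):
        # Like monitor-wait: suspend the thread until a write is
        # observed or the interval elapses, whichever comes first.
        with self._cond:
            if not self._written:
                self._cond.wait(timeout=interval)
            return self._written

    def store(self, value):
        # A write to the monitored location wakes any waiter early.
        with self._cond:
            self.value = value
            self._written = True
            self._cond.notify_all()

loc = MonitoredLocation(0)
old = loc.load_monitor()  # read and arm the monitor
threading.Timer(0.05, loc.store, args=(1,)).start()
woke_on_write = loc.monitor_wait(interval=5.0)
print(old, woke_on_write, loc.value)  # 0 True 1
```

    The waiter returns well before the 5-second interval because the store wakes it, mirroring the abstract's "resume execution in response to detecting a write" behavior.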

    Monitoring multiple memory locations for targeted stores in a shared-memory multiprocessor
    14.
    Invention Grant (in force)

    Publication No.: US08990503B2

    Publication Date: 2015-03-24

    Application No.: US13754700

    Filing Date: 2013-01-30

    Abstract: A system and method for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. More specifically, the disclosed embodiments provide a system that notifies a waiting thread when a targeted store is directed to monitored memory locations. During operation, the system receives a targeted store which is directed to a specific cache in a shared-memory multiprocessor system. In response, the system examines a destination address for the targeted store to determine whether the targeted store is directed to a monitored memory location which is being monitored for a thread associated with the specific cache. If so, the system informs the thread about the targeted store.
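
    The check described in the abstract, testing whether a targeted store's destination address falls inside a memory location monitored for a waiting thread, reduces to an address-range test. A small hypothetical sketch (all names invented, not from the patent):

```python
# Hypothetical model of the monitored-location check: on receiving a
# targeted store, test its destination address against the address
# ranges that waiting threads have registered.
class TargetedStoreMonitor:
    def __init__(self):
        self.ranges = []  # (start, end, thread) monitored intervals

    def monitor(self, start, end, thread):
        # A thread registers a half-open address range [start, end).
        self.ranges.append((start, end, thread))

    def on_targeted_store(self, dest_addr):
        # Returns the thread to inform, or None if the store does not
        # hit any monitored location.
        for start, end, thread in self.ranges:
            if start <= dest_addr < end:
                return thread
        return None

mon = TargetedStoreMonitor()
mon.monitor(0x1000, 0x1040, "waiter-1")
print(mon.on_targeted_store(0x1020))  # waiter-1
print(mon.on_targeted_store(0x2000))  # None
```

    In hardware this test would be performed per destination cache; the sketch only shows the decision the abstract describes: hit a monitored location, inform the thread.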

    BROADCAST CACHE COHERENCE ON PARTIALLY-ORDERED NETWORK
    15.
    Invention Application (in force)

    Publication No.: US20140281237A1

    Publication Date: 2014-09-18

    Application No.: US13830967

    Filing Date: 2013-03-14

    CPC classification number: G06F12/0808 G06F12/0811 G06F12/0817 G06F12/0828

    Abstract: A method for cache coherence, including: broadcasting, by a requester cache (RC) over a partially-ordered request network (RN), a peer-to-peer (P2P) request for a cacheline to a plurality of slave caches; receiving, by the RC and over the RN while the P2P request is pending, a forwarded request for the cacheline from a gateway; receiving, by the RC and after receiving the forwarded request, a plurality of responses to the P2P request from the plurality of slave caches; setting an intra-processor state of the cacheline in the RC, wherein the intra-processor state also specifies an inter-processor state of the cacheline; and issuing, by the RC, a response to the forwarded request after setting the intra-processor state and after the P2P request is complete; and modifying, by the RC, the intra-processor state in response to issuing the response to the forwarded request.

    SUPPORTING TARGETED STORES IN A SHARED-MEMORY MULTIPROCESSOR SYSTEM
    16.
    Invention Application (in force)

    Publication No.: US20140089591A1

    Publication Date: 2014-03-27

    Application No.: US13625700

    Filing Date: 2012-09-24

    CPC classification number: G06F9/50 G06F9/5066 G06F9/544 G06F12/0888

    Abstract: The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.
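
    The two mechanisms named in the abstract could be exercised through an API along these lines (a hypothetical sketch; the patent's interface sits at the system-call/ISA level, and every name below is invented for illustration):

```python
# Hypothetical model of the abstract's two mechanisms: a
# thread-location query and a store targeted at that location's cache.
class SharedMemoryMultiprocessor:
    def __init__(self, num_caches):
        self.caches = [dict() for _ in range(num_caches)]
        self.thread_homes = {}  # thread id -> cache index

    def schedule(self, thread_id, cache_index):
        self.thread_homes[thread_id] = cache_index

    def thread_location(self, thread_id):
        # Thread-location mechanism: a location near where the
        # thread is executing (modeled as its cache index).
        return self.thread_homes[thread_id]

    def targeted_store(self, cache_index, addr, line):
        # Targeted-store mechanism: push the cache line directly into
        # the consumer's cache, avoiding a later coherence transfer.
        self.caches[cache_index][addr] = line

smp = SharedMemoryMultiprocessor(num_caches=4)
smp.schedule("consumer", 2)
loc = smp.thread_location("consumer")
smp.targeted_store(loc, 0x80, b"payload")
print(smp.caches[2][0x80])  # b'payload'
```

    The producer first asks where the consumer runs, then stores to that location, which is exactly the pairing of mechanisms the abstract exposes through its API.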

    Fault-tolerant cache coherence over a lossy network

    Publication No.: US10467139B2

    Publication Date: 2019-11-05

    Application No.: US15859037

    Filing Date: 2017-12-29

    Abstract: A cache coherence system manages both internode and intranode cache coherence in a cluster of nodes. Each node in the cluster is either a collection of processors running an intranode coherence protocol among themselves, or a single processor. A node comprises a plurality of coherence ordering units (COUs): hardware circuits configured to manage intranode coherence of caches within the node and/or internode coherence with caches on other nodes in the cluster. Each node contains one or more directories that track the state of cache-line entries managed by that node. Each node may also contain one or more scoreboards for managing the status of ongoing transactions. The internode cache coherence protocol implemented in the COUs may be used to detect and resolve communications errors such as dropped message packets between nodes, late message delivery at a node, or node failure. Additionally, a transport layer manages communication between the nodes in the cluster and can also be used to detect and resolve communications errors.

    Fault-tolerant cache coherence over a lossy network

    Publication No.: US10452547B2

    Publication Date: 2019-10-22

    Application No.: US15858787

    Filing Date: 2017-12-29

    Abstract: A cache coherence system manages both internode and intranode cache coherence in a cluster of nodes. Each node in the cluster is either a collection of processors running an intranode coherence protocol among themselves, or a single processor. A node comprises a plurality of coherence ordering units (COUs): hardware circuits configured to manage intranode coherence of caches within the node and/or internode coherence with caches on other nodes in the cluster. Each node contains one or more directories that track the state of cache-line entries managed by that node. Each node may also contain one or more scoreboards for managing the status of ongoing transactions. The internode cache coherence protocol implemented in the COUs may be used to detect and resolve communications errors such as dropped message packets between nodes, late message delivery at a node, or node failure. Additionally, a transport layer manages communication between the nodes in the cluster and can also be used to detect and resolve communications errors.

    ROBUST PIN-CORRECTING ERROR-CORRECTING CODE

    Publication No.: US20180115327A1

    Publication Date: 2018-04-26

    Application No.: US15458408

    Filing Date: 2017-03-14

    Abstract: The disclosed embodiments provide a memory system with error detection and correction. Each block of data in the memory system includes an array of bits logically organized into R rows and C columns: C−M−1 data-bit columns containing data bits, a row-check-bit column containing a row-parity bit for each of the R rows in the block, and M inner-check-bit columns that collectively include M×R inner check bits. These inner check bits are defined to cover bits in the array in accordance with a set of check vectors, wherein each check vector is associated with a different bit in the array and is an element of Res(P), a residue system comprising the set of polynomials with GF(2) coefficients modulo a polynomial P with GF(2) coefficients. Each column is associated with a different pin in a memory module interface, and the check bits are generated from the data bits to facilitate block-level detection and correction of errors that arise during transmission. During operation, the system transmits a block of data from the memory. Next, the system uses an error-detection circuit to examine the block of data and determine, based on the examination, whether an error occurred during the transmission.

    Supporting targeted stores in a shared-memory multiprocessor system
    20.
    Invention Grant (in force)

    Publication No.: US09110718B2

    Publication Date: 2015-08-18

    Application No.: US13625700

    Filing Date: 2012-09-24

    CPC classification number: G06F9/50 G06F9/5066 G06F9/544 G06F12/0888

    Abstract: The present embodiments provide a system for supporting targeted stores in a shared-memory multiprocessor. A targeted store enables a first processor to push a cache line to be stored in a cache memory of a second processor in the shared-memory multiprocessor. This eliminates the need for multiple cache-coherence operations to transfer the cache line from the first processor to the second processor. The system includes an interface, such as an application programming interface (API), and a system call interface or an instruction-set architecture (ISA) that provides access to a number of mechanisms for supporting targeted stores. These mechanisms include a thread-location mechanism that determines a location near where a thread is executing in the shared-memory multiprocessor, and a targeted-store mechanism that targets a store to a location (e.g., cache memory) in the shared-memory multiprocessor.
