摘要:
A multiprocessing node has a plurality of point-to-point connected microprocessors. Each of the microprocessors is also point-to-point connected to a filter. In response to a local cache miss, a microprocessor issues a broadcast for the requested data to the filter. The filter, using memory that stores a copy of the tags of data stored in the local cache memories of each of the microprocessors, relays the broadcast to those/microprocessors having copies of the requested data. If the snoop filter memory indicates that none of the microprocessors have a copy of the requested data, the snoop filter may either (i) cancel the broadcast and issue a message back to the requesting microprocessor, or (ii) relay the broadcast to a connected multiprocessing node.
摘要:
A method and system for increasing reliability and availability of a multi-processor network. A system includes a network with at least two nodes, with each node comprising a multi-processor unit (mpu) and memory. The mpu includes one or more processors and a wiretap unit. The wiretap unit and the memory included in the node are coupled to the processors in the node. The wiretap unit is configured to monitor memory accesses of the processors and convey data indicative of such accesses to a second node. The second node maintains a replica of memory in the first node, and is configured to undo modifications to the memory if needed. In the event of a hardware or software fault, the nodes are configured to restart the application on another node.
摘要:
A cluster of multiprocessing nodes uses snooping-based cache-coherence to maintain consistency among the cache memories of the multiprocessing nodes. One or more of the multiprocessing nodes each maintain a directory table that includes a list of addresses of data last transferred by cache-to-cache transfer transactions. Thus, upon a local cache miss for requested data, a multiprocessing node searches its directory table for an address of the requested data, and if the address is found in the directory table, the multiprocessing node obtains a copy of the requested data from the last destination of the requested data as indicated in the directory table. Thereafter, a message indicating the completion of a cache-to-cache transfer is broadcast to other connected multiprocessing nodes on a “best efforts” basis in which messages are relayed from multiprocessing node to multiprocessing node using low priority status and/or otherwise unused cycles.
摘要:
A point-to-point connected multiprocessing node uses a snooping-based cache-coherence filter to selectively direct relays of data request broadcasts. The filter includes shadow cache lines that are maintained to hold copies of the local cache lines of integrated circuits connected to the filter. The shadow cache lines are provided with additional entries so that if newly referenced data is added to a particular local cache line by “silently” removing an entry in the local cache line, the newly referenced data may be added to the shadow cache line without forcing the “blind” removal of an entry in the shadow cache line.
摘要:
A proximity interconnect module includes a plurality of processors operatively connected to a plurality of off-chip cache memories by proximity communication. Due to the high bandwidth capability of proximity interconnect, enhancements to the cache protocol to improve latency may be made despite resulting increased bandwidth consumption.
摘要:
A multiprocessing node in a snooping-based cache-coherent cluster of processing nodes maintains a cache-to-cache transfer prediction directory of addresses of data last transferred by cache-to-cache transfers. In response to a local cache miss, the multiprocessing node may use the cache-to-cache transfer prediction directory to predict a cache-to-cache transfer and issue a restricted broadcast for requested data that allows only cache memories in the cluster to return copies of the requested data to the requesting multiprocessing node, thereby reducing the consumption of bandwidth that would otherwise be consumed by having a home memory return a copy of the requested data in response to an unrestricted broadcast for requested data that allows cache memories and home memories in a cluster to return copies of the requested data to the requesting multiprocessing node.
摘要:
A shared cache is point-to-point connected to a plurality of point-to-point connected processing nodes, wherein the processing nodes may be integrated circuits or multiprocessing systems. In response to a local cache miss, a requesting processing node issues a broadcast for requested data which is observed by the shared cache. If the shared cache has a copy of the requested data, the shared cache forwards the copy of the requested data to the requesting processing node.
摘要:
A method and system for increasing programmability and scalability of a multi-processor network. A system includes two or more nodes coupled via a network with each node comprising a processor unit and memory. The processor unit includes one or more processors and a wiretap unit. The wiretap unit is configured to monitor memory accesses of the processors. A transaction may execute a number of read and/or write operations to memory. The nodes are configured to replicate one or more portions of memory; detect data conflicts to memory; and restore memory to pre-transaction state if needed.
摘要:
A proximity interconnect module includes a plurality of off-chip cache memories. Either disposed external to the proximity interconnect module or on the proximity interconnect module are a plurality of processors that are dependent on the plurality of off-chip cache memories for servicing requests for data. The plurality of off-chip cache memories are operatively connected to either one another or to one or more of the plurality of processors by proximity communication. Each of the plurality of off-chip cache memories may cache certain portions of the physical address space.
摘要:
A method and system for increasing reliability and availability of a multi-processor network. A system includes a network with at least two nodes, with each node comprising a multi-processor unit (mpu) and memory. The mpu includes one or more processors and a wiretap unit. The wiretap unit and the memory included in the node are coupled to the processors in the node. The wiretap unit is configured to monitor memory accesses of the processors and convey data indicative of such accesses to a second node. The second node maintains a replica of memory in the first node, and is configured to undo modifications to the memory if needed. In the event of a hardware or software fault, the nodes are configured to restart the application on another node.