Abstract:
In a chip multiprocessor system, the coherence protocol is split into two cooperating protocols implemented by different hardware modules. One protocol is responsible for cache coherence management within the chip, and is implemented by a second-level cache controller. The other protocol is responsible for cache coherence management across chip multiprocessor nodes, and is implemented by separate cache coherence protocol engines. The cache controller and the protocol engine within each node communicate and synchronize memory transactions involving multiple nodes to maintain cache coherence within and across the nodes. The present invention addresses race conditions that arise during this communication and synchronization.
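As an illustration only (not part of the abstract), the following minimal C sketch shows one way the race the abstract refers to can arise and be detected: a hypothetical per-module table of pending transactions in the second-level cache controller and in the protocol engine, with a race flagged when both modules have an outstanding transaction on the same memory line. All names and the table layout are assumptions.

```c
/* Minimal sketch (hypothetical names) of the split-protocol arrangement:
 * the intra-chip controller and the inter-chip protocol engine each keep
 * a table of pending transactions, and a race is detected when both have
 * an outstanding transaction on the same memory line. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PENDING 16

typedef struct {
    uint64_t line_addr[MAX_PENDING];
    bool     valid[MAX_PENDING];
} pending_table_t;

static pending_table_t l2_controller_pending;   /* intra-chip transactions */
static pending_table_t protocol_engine_pending; /* inter-chip transactions */

static bool table_has(const pending_table_t *t, uint64_t addr)
{
    for (int i = 0; i < MAX_PENDING; i++)
        if (t->valid[i] && t->line_addr[i] == addr)
            return true;
    return false;
}

static void table_add(pending_table_t *t, uint64_t addr)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!t->valid[i]) {
            t->valid[i] = true;
            t->line_addr[i] = addr;
            return;
        }
    }
}

/* A remote request arriving at the protocol engine races with a local
 * request already pending at the cache controller for the same line;
 * here the race is merely detected and reported. */
int main(void)
{
    uint64_t line = 0x1000;
    table_add(&l2_controller_pending, line);     /* local request in flight */

    if (table_has(&l2_controller_pending, line)) /* remote request arrives  */
        printf("race on line 0x%llx: defer or merge the remote request\n",
               (unsigned long long)line);
    else
        table_add(&protocol_engine_pending, line);
    return 0;
}
```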
Abstract:
The present invention relates generally to a protocol engine for use in a multiprocessor computer system. The protocol engine, which implements a cache coherence protocol, includes a clock signal generator for generating signals denoting interleaved even clock periods and odd clock periods, a memory transaction state array for storing entries, each denoting the state of a respective memory transaction, and processing logic. The memory transactions are divided into even and odd transactions whose states are stored in distinct sets of entries in the memory transaction state array. The processing logic has interleaving circuitry for processing the even memory transactions during even clock periods and the odd memory transactions during odd clock periods.
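The even/odd interleaving can be pictured with the following minimal C sketch (illustrative only; all names and the bank layout are assumptions): transaction state entries are held in two banks, and in each clock period the processing logic services only the bank whose parity matches the clock.

```c
/* Minimal sketch of even/odd interleaving: transaction state entries are
 * split into two banks, and on each clock period only the bank matching
 * the clock parity is processed. */
#include <stdio.h>

#define ENTRIES_PER_BANK 8

typedef enum { IDLE, WAIT_REPLY, SEND_REQUEST, DONE } txn_state_t;

typedef struct {
    txn_state_t state[ENTRIES_PER_BANK];
} txn_bank_t;

static txn_bank_t even_bank;  /* even memory transactions */
static txn_bank_t odd_bank;   /* odd memory transactions  */

static void process_bank(txn_bank_t *bank, const char *name)
{
    /* One scheduling step per clock period; a real engine would advance
     * at most one transaction here. */
    for (int i = 0; i < ENTRIES_PER_BANK; i++)
        if (bank->state[i] == SEND_REQUEST) {
            bank->state[i] = WAIT_REPLY;
            printf("%s bank: entry %d request sent\n", name, i);
            return;
        }
}

int main(void)
{
    even_bank.state[2] = SEND_REQUEST;
    odd_bank.state[5]  = SEND_REQUEST;

    for (unsigned clock = 0; clock < 4; clock++) {
        if (clock % 2 == 0)
            process_bank(&even_bank, "even"); /* even clock period */
        else
            process_bank(&odd_bank, "odd");   /* odd clock period  */
    }
    return 0;
}
```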
Abstract:
A computer system has a plurality of processor nodes and a plurality of input/output nodes. Each processor node includes a multiplicity of processor cores, an interface to a local memory subsystem, and a protocol engine implementing a predefined cache coherence protocol. Each processor core has an associated memory cache for caching memory lines of information. Each input/output node includes no processor cores; it includes an input/output interface for interfacing to an input/output bus or input/output device, a memory cache for caching memory lines of information, and an interface to a local memory subsystem. The local memory subsystem of each processor node and input/output node stores a multiplicity of memory lines of information. The protocol engine of each processor node and input/output node implements the same predefined cache coherence protocol.
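For illustration, a minimal C sketch of the two node flavors follows; the type names and field layout are assumptions, not the patent's terminology. A processor node carries cores, each with its own cache, while an input/output node carries an I/O interface instead of cores, and both embed the same protocol engine and a local memory subsystem.

```c
/* Minimal sketch (assumed types) of processor nodes and I/O nodes that
 * share the same cache coherence protocol engine. */
#include <stddef.h>
#include <stdint.h>

#define CORES_PER_NODE 8
#define CACHE_LINES    64

typedef struct { uint32_t pending_transactions; } protocol_engine_t; /* shared protocol state   */
typedef struct { uint8_t *lines; size_t num_lines; } memory_subsystem_t;
typedef struct { uint64_t tag[CACHE_LINES]; uint8_t state[CACHE_LINES]; } memory_cache_t;

/* A processor node: cores (each with its own cache), local memory, and a protocol engine. */
typedef struct {
    memory_cache_t     core_cache[CORES_PER_NODE];
    memory_subsystem_t local_memory;
    protocol_engine_t  engine;        /* same protocol as the I/O nodes */
} processor_node_t;

/* An input/output node: no cores, an I/O interface, one cache, local memory, the same engine. */
typedef struct {
    int                io_bus_id;     /* interface to an I/O bus or device */
    memory_cache_t     cache;
    memory_subsystem_t local_memory;
    protocol_engine_t  engine;        /* same protocol as the processor nodes */
} io_node_t;

int main(void)
{
    processor_node_t cpu_node = { 0 };
    io_node_t        io_node  = { 0 };
    (void)cpu_node;
    (void)io_node;
    return 0;
}
```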
Abstract:
Each node of a multiprocessor computer system includes a main memory, a cache memory system, and logic. The main memory stores memory lines of data. A directory entry for each memory line indicates whether a copy of the corresponding memory line is stored in the cache memory system of another node. The cache memory system stores copies of memory lines and cache state information indicating whether the cached copy of each memory line is an exclusive copy. The logic of each respective node is configured to respond to a transaction request for a particular memory line and its corresponding directory entry, where the respective node is the home node of the particular memory line. When the cache memory system of the home node stores an exclusive copy of the particular memory line, the logic responds to the request by sending the copy of the particular memory line retrieved from the cache memory system along with a predefined null directory entry value, and thus does not retrieve the memory line and its directory entry from the main memory of the home node.
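The home-node fast path can be sketched as follows (illustrative C with assumed function names and stubbed cache and memory lookups): when the home node's own cache holds the line exclusively, the reply carries the cached data and a predefined null directory value, and main memory is never consulted.

```c
/* Minimal sketch of the home-node fast path: an exclusive copy in the home
 * node's own cache means no other node can hold the line, so the reply uses
 * a null directory entry and skips the main-memory lookup. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LINE_BYTES 64
#define NULL_DIRECTORY_ENTRY 0u   /* predefined "no remote copies" value */

typedef struct {
    uint8_t  data[LINE_BYTES];
    uint32_t directory;           /* which remote nodes may cache the line */
} memory_reply_t;

/* Stub for the home node's cache: pretend it holds line 0x2000 exclusively. */
static bool cache_lookup_exclusive(uint64_t addr, uint8_t out[LINE_BYTES])
{
    if (addr == 0x2000) {
        memset(out, 0xAB, LINE_BYTES);
        return true;
    }
    return false;
}

/* Stub for main memory: returns the line and its stored directory entry. */
static void memory_read_line(uint64_t addr, uint8_t out[LINE_BYTES], uint32_t *dir)
{
    (void)addr;
    memset(out, 0x00, LINE_BYTES);
    *dir = 0x5;                   /* e.g. nodes 0 and 2 are potential sharers */
}

static memory_reply_t home_node_respond(uint64_t addr)
{
    memory_reply_t reply;
    if (cache_lookup_exclusive(addr, reply.data)) {
        /* Fast path: cached data plus null directory entry, no memory read. */
        reply.directory = NULL_DIRECTORY_ENTRY;
    } else {
        memory_read_line(addr, reply.data, &reply.directory);
    }
    return reply;
}

int main(void)
{
    memory_reply_t r = home_node_respond(0x2000);
    printf("directory entry returned: 0x%x\n", r.directory);
    return 0;
}
```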
Abstract:
A protocol engine is for use in each node of a computer system having a plurality of nodes. Each node includes an interface to a local memory subsystem that stores memory lines of information, a directory, and a memory cache. The directory includes an entry associated with a memory line of information stored in the local memory subsystem. The directory entry includes an identification field for identifying sharer nodes that potentially cache the memory line of information. The identification field has a plurality of bits at associated positions within the identification field. Each respective bit of the identification field is associated with one or more nodes. The protocol engine sets each bit of the identification field whose associated nodes include at least one node in which the memory line is cached. In response to a request for exclusive ownership of a memory line, the protocol engine sends an initial invalidation request to no more than a first predefined number of the nodes associated with set bits in the identification field of the directory entry associated with the memory line.
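A minimal C sketch of the limited initial invalidation follows; the field width, the one-bit-per-node mapping, and the batch size are assumptions chosen only for illustration.

```c
/* Minimal sketch of limited initial invalidation: each set bit in the
 * identification field stands for one or more nodes that may cache the
 * line, and the first round of invalidations goes to at most FIRST_BATCH
 * of those nodes. */
#include <stdint.h>
#include <stdio.h>

#define ID_FIELD_BITS 40     /* bits in the identification field      */
#define NUM_NODES     40     /* one bit per node, for simplicity only */
#define FIRST_BATCH   4      /* first predefined number of requests   */

typedef struct { uint64_t id_field; } directory_entry_t;

/* Record that `node` may now be caching the line. */
static void mark_sharer(directory_entry_t *e, unsigned node)
{
    e->id_field |= 1ull << (node % ID_FIELD_BITS);
}

/* On a request for exclusive ownership, invalidate at most FIRST_BATCH of
 * the nodes whose bits are set; the rest would be handled in later rounds. */
static void send_initial_invalidations(const directory_entry_t *e)
{
    unsigned sent = 0;
    for (unsigned node = 0; node < NUM_NODES && sent < FIRST_BATCH; node++) {
        if (e->id_field & (1ull << (node % ID_FIELD_BITS))) {
            printf("invalidate -> node %u\n", node);
            sent++;
        }
    }
}

int main(void)
{
    directory_entry_t entry = { 0 };
    for (unsigned node = 3; node < 10; node++)
        mark_sharer(&entry, node);       /* seven potential sharers    */
    send_initial_invalidations(&entry);  /* only FIRST_BATCH are sent  */
    return 0;
}
```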
Abstract:
The present invention relates generally to multiprocessor computer systems, and particularly to a multiprocessor system designed to be highly scalable, using efficient cache coherence logic and methodologies. More specifically, the present invention is a system and method including a plurality of processor nodes configured to execute a cache coherence protocol that avoids the use of negative acknowledgment messages (NAKs), imposes no ordering requirements on the underlying transaction-message interconnect/network, and services most 3-hop transactions with only a single visit to the home node.
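As an illustration of a 3-hop transaction handled with a single visit to the home node, the following minimal C sketch (message names assumed) traces the three messages: requester to home, home forwarding to the current owner, and the owner's data reply sent directly back to the requester, with no NAK at any step.

```c
/* Minimal sketch of a 3-hop read serviced with one visit to the home node. */
#include <stdio.h>

typedef enum {
    READ_REQUEST,      /* hop 1: requester -> home node   */
    FORWARDED_REQUEST, /* hop 2: home node -> owning node */
    DATA_REPLY         /* hop 3: owning node -> requester */
} message_t;

static void deliver(message_t m, int from, int to)
{
    static const char *name[] = { "READ_REQUEST", "FORWARDED_REQUEST", "DATA_REPLY" };
    printf("node %d -> node %d : %s\n", from, to, name[m]);
}

int main(void)
{
    int requester = 0, home = 1, owner = 2;
    deliver(READ_REQUEST, requester, home);       /* the only visit to home */
    deliver(FORWARDED_REQUEST, home, owner);
    deliver(DATA_REPLY, owner, requester);
    return 0;
}
```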
Abstract:
The present invention relates generally to a protocol engine for use in a multiprocessor computer system. The protocol engine, which implements a cache coherence protocol, includes a clock signal generator for generating signals denoting interleaved even clock periods and odd clock periods, a memory transaction state array for storing entries, each denoting the state of a respective memory transaction, and processing logic. The memory transactions are divided into even and odd transactions whose states are stored in distinct sets of entries in the memory transaction state array. The processing logic has interleaving circuitry for processing the even memory transactions during even clock periods and the odd memory transactions during odd clock periods. Moreover, the protocol engine is configured to transition from one memory transaction to another in a minimum number of clock cycles. By taking certain steps in parallel, this design improves efficiency for processing commercial workloads such as on-line transaction processing (OLTP).
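The minimum-latency transition between transactions can be pictured with the following minimal C sketch (illustrative only; the scheduling details are assumptions): while the transaction chosen for the current clock period is processed, the entry of the opposite parity for the next period is selected in parallel, so the engine can switch transactions every cycle.

```c
/* Minimal sketch of overlapped scheduling: selecting the next transaction
 * (of the opposite parity) in parallel with processing the current one. */
#include <stdio.h>

#define ENTRIES 8

static int even_ready[ENTRIES] = { 0, 0, 1, 0, 0, 0, 0, 0 };
static int odd_ready[ENTRIES]  = { 0, 0, 0, 0, 0, 1, 0, 0 };

static int select_entry(const int *ready)   /* scheduling step: pick a ready entry */
{
    for (int i = 0; i < ENTRIES; i++)
        if (ready[i])
            return i;
    return -1;
}

int main(void)
{
    int next = select_entry(even_ready);            /* primed before cycle 0 */
    for (unsigned clock = 0; clock < 4; clock++) {
        int *bank    = (clock % 2 == 0) ? even_ready : odd_ready;
        const int *other = (clock % 2 == 0) ? odd_ready : even_ready;
        int  current = next;
        next = select_entry(other);   /* select the next transaction in parallel */
        if (current >= 0) {
            printf("clock %u: process %s entry %d\n",
                   clock, (clock % 2 == 0) ? "even" : "odd", current);
            bank[current] = 0;        /* transaction retires                     */
        }
    }
    return 0;
}
```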