Abstract:
An apparatus and method are disclosed for allowing a multiprocessor computer system with shared memory distributed among multiple nodes to appear like a single-node environment. The single-node environment is implemented with a memory map that has a unique address for every memory location in the system. Overlapping address spaces in the multinode environment are also assigned unique representative addresses that are translated to actual addresses in conformance with the multinode environment. The apparatus and method allow a wide variety of operating systems to be run on the multinode environment. Additionally, industry standard BIOS and chip sets can be used.
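The address mapping described above can be illustrated with a minimal sketch. It assumes every node exposes the same local address range, so the ranges overlap, and that a unique system-wide address is formed by offsetting each node's range. The constant `NODE_MEMORY_SIZE` and the functions `to_global` and `to_local` are hypothetical names chosen for illustration; the abstract does not specify the translation scheme.

```python
from typing import NamedTuple

# Hypothetical: each node owns 256 MiB of local memory at overlapping local addresses.
NODE_MEMORY_SIZE = 0x1000_0000


class LocalAddress(NamedTuple):
    node: int
    offset: int


def to_global(node: int, offset: int) -> int:
    """Map a (node, local offset) pair to a unique system-wide address."""
    if not 0 <= offset < NODE_MEMORY_SIZE:
        raise ValueError("offset outside the node's local memory")
    return node * NODE_MEMORY_SIZE + offset


def to_local(global_address: int) -> LocalAddress:
    """Translate a unique system-wide address back to the owning node and offset."""
    node, offset = divmod(global_address, NODE_MEMORY_SIZE)
    return LocalAddress(node, offset)
```

With this layout, the same local offset on two different nodes yields two distinct system-wide addresses, and the translation is reversible.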
Abstract:
A multiple-stage pipeline for transaction conversion is disclosed. A method is disclosed that converts a transaction into a set of concurrently performable actions. In a first pipeline stage, the transaction is decoded into an internal protocol evaluation (PE) command, such as by utilizing a look-up table (LUT). In a second pipeline stage, an entry within a PE random access memory (RAM) is selected, based on the internal PE command. This may be accomplished by converting the internal PE command into a PE RAM base address and an associated qualifier thereof. In a third pipeline stage, the entry within the PE RAM is converted to the set of concurrently performable actions, such as based on the PE RAM base address and its associated qualifier.
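The three stages can be sketched as three small functions. This is only an illustration under stated assumptions: the table contents, the transaction fields `type` and `remote`, and the rule that the qualifier is added to the base address are all hypothetical and are not taken from the abstract.

```python
# Stage 1 table (hypothetical): transaction type -> internal PE command.
DECODE_LUT = {"read_line": "PE_READ", "write_line": "PE_WRITE"}

# Stage 2 table (hypothetical): internal PE command -> PE RAM base address.
BASE_ADDRESS = {"PE_READ": 0x00, "PE_WRITE": 0x10}

# PE RAM (hypothetical contents): address -> set of concurrently performable actions.
PE_RAM = {
    0x00: {"send_data"},
    0x01: {"fetch_remote", "allocate_buffer"},
    0x10: {"update_line", "send_ack"},
    0x11: {"invalidate_sharers", "update_line", "send_ack"},
}


def stage1_decode(transaction):
    """Decode the transaction into an internal PE command via the LUT."""
    return DECODE_LUT[transaction["type"]]


def stage2_select(pe_command, transaction):
    """Convert the PE command into a PE RAM base address and a qualifier."""
    base = BASE_ADDRESS[pe_command]
    qualifier = 1 if transaction["remote"] else 0  # assumed qualifier rule
    return base, qualifier


def stage3_actions(base, qualifier):
    """Read the PE RAM entry and return its set of actions."""
    return PE_RAM[base + qualifier]


def convert(transaction):
    """Run the transaction through all three stages in order."""
    pe_command = stage1_decode(transaction)
    base, qualifier = stage2_select(pe_command, transaction)
    return stage3_actions(base, qualifier)
```

In hardware the stages would operate on different transactions at the same time; the sequential call in `convert` only shows the data flow for one transaction.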
Abstract:
A method and apparatus for maintaining processor consistency in a multiprocessor computer such as a multinode computer system are disclosed. A processor proceeds with write operations before its previous write operations complete, while processor consistency is maintained. A write operation begins with a request by the processor to invalidate copies of the data stored in other nodes. This current invalidate request is queued, and the processor is told that the request is complete even though it has not actually completed. The processor proceeds to complete the write operation by changing the data. It can then execute subsequent operations, including other write operations. The queued request, however, is not transmitted to other nodes in the computer until all previous invalidate requests by the processor are complete. This ensures that the current invalidate request will not pass a previous invalidate request. The invalidate requests are added to and removed from a processor's outstanding invalidate list as they arise and are completed. An invalidate request is completed by notifying the nodes in a linked list related to the current invalidate request that data shared by the node is now invalid.
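The ordering rule can be sketched with a simple queue. The class `InvalidateQueue` and its method names are hypothetical; the sketch assumes one processor and models only the order in which requests leave the node, not the linked-list traversal that completes each request.

```python
from collections import deque


class InvalidateQueue:
    """Per-processor list of outstanding invalidate requests (illustrative only)."""

    def __init__(self):
        self.pending = deque()  # queued requests not yet transmitted
        self.in_flight = None   # the single request currently outstanding
        self.sent = []          # order in which requests left the node

    def request(self, line):
        """Queue an invalidate and acknowledge the processor immediately."""
        self.pending.append(line)
        self._transmit_next()
        return "ack"  # early acknowledgement: the invalidation may not be done yet

    def complete(self, line):
        """Called when all sharers of `line` have been notified."""
        if line != self.in_flight:
            raise ValueError("completion for a request that is not in flight")
        self.in_flight = None
        self._transmit_next()

    def _transmit_next(self):
        # A request is transmitted only when every earlier request has completed.
        if self.in_flight is None and self.pending:
            self.in_flight = self.pending.popleft()
            self.sent.append(self.in_flight)
```

Because a later request stays in `pending` until the earlier one completes, it cannot pass the earlier one, even though the processor received its acknowledgement at once.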
Abstract:
The temporary storage of a memory line to be stored in a cache while waiting for another memory line to be evicted from the cache is disclosed. A method includes evicting a first memory line currently stored in the cache and storing a second memory line not currently stored in the cache in its place. While the first memory line is being evicted, such as by first being inserted into an eviction queue, the second memory line is temporarily stored in a buffer. The buffer may be a data transfer buffer (DTB). Upon eviction of the first memory line, the second memory line is moved from the buffer into the cache.
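The replacement sequence above can be sketched as follows. The class `Cache`, the slot-indexed layout, and the method names are assumptions made for illustration; the abstract does not describe how the cache or the buffer is organized.

```python
class Cache:
    """Cache with an eviction queue and a temporary buffer (illustrative only)."""

    def __init__(self):
        self.lines = {}           # slot -> (address, data) currently cached
        self.eviction_queue = []  # (slot, line) pairs waiting to be evicted
        self.buffer = {}          # slot -> incoming line parked in the buffer (DTB)

    def begin_replace(self, slot, address, data):
        """Queue the current occupant for eviction and park the new line."""
        self.eviction_queue.append((slot, self.lines[slot]))
        self.buffer[slot] = (address, data)

    def finish_eviction(self):
        """Complete the oldest eviction, then move the parked line into the cache."""
        slot, victim = self.eviction_queue.pop(0)
        self.lines[slot] = self.buffer.pop(slot)
        return victim
```

Until `finish_eviction` runs, the first line still occupies its slot and the second line exists only in the buffer, which mirrors the ordering the abstract describes.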
Abstract:
The management of transactions received by a coherency controller is disclosed. A method of an embodiment of the invention is performed by a coherency controller of a plurality of coherency controllers of a node that has a plurality of sub-nodes. The coherency controller receives a transaction from one of the sub-nodes of the node. The transaction may relate to another sub-node of the node. However, the coherency controller nevertheless processes the transaction without having to send the transaction to another coherency controller of the node, even though the sub-node from which the transaction was received is different than the sub-node to which the transaction relates. The plurality of coherency controllers is thus shared by all of the plurality of sub-nodes of the node.
Abstract:
Determining an error-correcting code (ECC) for a cache entry based at least on the data stored in the cache entry and the memory address at which the data is permanently stored is disclosed. A cache entry for a desired memory address is retrieved. The cache entry includes data and a stored ECC based on the data and a memory address. An ECC is determined based at least on the data of the cache entry and the desired memory address. If the ECC at least based on the cache entry data and the desired memory address equals the stored ECC, then the cache entry caches the desired memory address without error.
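The check described above can be sketched as follows. Note the substitution: the sketch uses a CRC-32 from Python's `zlib` module as a stand-in for the ECC, so it can detect a mismatch but cannot correct errors. The entry layout and the function names are hypothetical.

```python
import zlib


def check_code(data: bytes, address: int) -> int:
    """Compute a check value over both the address and the data (CRC-32 stand-in)."""
    return zlib.crc32(address.to_bytes(8, "little") + data)


def make_entry(data: bytes, address: int) -> dict:
    """Build a cache entry that stores the data and the code, but not the address."""
    return {"data": data, "ecc": check_code(data, address)}


def entry_matches(entry: dict, desired_address: int) -> bool:
    """True if the entry's stored code agrees with the data and the desired address."""
    return check_code(entry["data"], desired_address) == entry["ecc"]
```

Since the address is folded into the code, an entry fetched for the wrong address fails the comparison in the same way as corrupted data does.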
Abstract:
Caching memory contents differently based on the region to which the memory has been partitioned or allocated is disclosed. A first region of a first line of memory to be cached is determined. The memory has a number of regions, including the first region, over which the lines of memory, including the first line, are partitioned. Each region has a first variable having a corresponding second variable. If the first variable for any region is greater than its corresponding second variable, one such region is selected as a second region. A line from the lines of the memory currently stored in the cache and partitioned to the second region is selected as the second line. The second line is replaced with the first line in the cache, the first variable for the second region is decremented, and the first variable for the first region is incremented.
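The replacement policy above can be sketched under one interpretation: the first variable is the number of lines a region currently has in the cache, and the second variable is that region's quota. This reading, the class `RegionCache`, the fixed region size, and the choice of the first over-quota region and its first cached line are all assumptions; the abstract leaves them open.

```python
REGION_SIZE = 0x1000  # hypothetical size of each memory region


def region_of(address: int) -> int:
    """Return the region to which a memory line is partitioned."""
    return address // REGION_SIZE


class RegionCache:
    def __init__(self, quotas):
        self.quota = dict(quotas)            # second variable, per region
        self.count = {r: 0 for r in quotas}  # first variable, per region
        self.lines = []                      # addresses currently cached

    def fill(self, address):
        """Cache a line without replacement (used here to set up state)."""
        self.lines.append(address)
        self.count[region_of(address)] += 1

    def replace(self, new_address):
        """Replace a line from an over-quota region with the new line."""
        first_region = region_of(new_address)
        over = [r for r in self.count if self.count[r] > self.quota[r]]
        if not over:
            return None  # case not covered by the abstract
        second_region = over[0]
        victim = next(a for a in self.lines if region_of(a) == second_region)
        self.lines[self.lines.index(victim)] = new_address
        self.count[second_region] -= 1
        self.count[first_region] += 1
        return victim
```

The two counter updates at the end correspond to the decrement for the second region and the increment for the first region.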
Abstract:
A method of invalidating shared cache lines, such as those on a sharing list, by issuing an invalidate acknowledgement before actually invalidating a cache line is disclosed. The method is useful in multiprocessor systems such as distributed shared memory (DSM) or non-uniform memory access (NUMA) machines that include a number of interconnected processor nodes each having local memory and caches that store copies of the same data. In such a multiprocessor system using the Scalable Coherent Interface (SCI) protocol, an invalidate request is sent from the head node on the sharing list to a succeeding node on the list. In response to the invalidate request, the succeeding node issues an invalidate acknowledgement before the cache line is actually invalidated. After issuing the invalidate acknowledgement, the succeeding node initiates invalidation of the cache line. The invalidate acknowledgement can take the form of a response to the head node or a forwarding of the invalidate request to the next succeeding node on the list. To maintain processor consistency, a flag is set each time an invalidate acknowledgement is sent. The flag is cleared after the invalidation of the cache line is completed. Cacheable transactions received at the succeeding node while a flag is set are delayed until the flag is cleared.
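The flag mechanism on the succeeding node can be sketched as follows. The class `SharingNode` and its method names are hypothetical, and the sketch tracks a single cache line; it does not model the SCI sharing list or the forwarding of the request to the next node.

```python
class SharingNode:
    """A succeeding node that acknowledges before invalidating (illustrative only)."""

    def __init__(self):
        self.line_valid = True
        self.invalidate_pending = False  # the flag described in the abstract
        self.delayed = []                # cacheable transactions held back

    def on_invalidate_request(self):
        """Acknowledge at once; the actual invalidation happens later."""
        self.invalidate_pending = True
        return "ack"

    def on_cacheable_transaction(self, transaction):
        """Delay cacheable transactions while the flag is set."""
        if self.invalidate_pending:
            self.delayed.append(transaction)
            return "delayed"
        return "processed"

    def finish_invalidation(self):
        """Invalidate the line, clear the flag, and release delayed transactions."""
        self.line_valid = False
        self.invalidate_pending = False
        released, self.delayed = self.delayed, []
        return released
```

The delay is what preserves processor consistency here: no cacheable transaction can observe the stale line between the early acknowledgement and the real invalidation.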
Abstract:
A multiprocessor system is disclosed that assures forward progress of local processor requests for data by preventing other nodes from accessing the data until the processor request is satisfied. In one aspect of the invention, the local processor requests data through a remote cache interconnect. The remote cache interconnect tells the local processor to retry its request for data at a later time, so that the remote cache interconnect has sufficient time to obtain the data from the system interconnect. When the remote cache interconnect receives the data from the system interconnect, a hold flag is set. Any requests from other nodes for the data are rejected while the hold flag is set. When the local processor issues a retry request, the data is delivered to the processor and the hold flag is cleared. Other nodes may then obtain control of the data.
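The hold-flag protocol can be sketched as a small state machine. The class `RemoteCacheInterconnect` and its method names are hypothetical, and the fetch from the system interconnect is modeled as an explicit `data_arrived` call rather than as real messaging.

```python
class RemoteCacheInterconnect:
    """Tracks fetched data and per-address hold flags (illustrative only)."""

    def __init__(self):
        self.data = {}     # address -> value obtained from the system interconnect
        self.hold = set()  # addresses whose hold flag is set

    def local_request(self, address):
        """Local processor request: retry until the data has arrived."""
        if address not in self.data:
            return "retry"
        self.hold.discard(address)  # delivering the data clears the hold flag
        return self.data[address]

    def data_arrived(self, address, value):
        """Data received from the system interconnect: set the hold flag."""
        self.data[address] = value
        self.hold.add(address)

    def remote_request(self, address):
        """Request from another node: rejected while the hold flag is set."""
        return "rejected" if address in self.hold else "granted"
```

Rejecting remote requests between arrival and delivery is what guarantees that the local processor's retry finds the data still present.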
Abstract:
A method and apparatus are disclosed for providing cache coherence in a multiprocessor system that is configured into two or more nodes, with memory local to each node, and with a tag and address crossbar system and a data crossbar system that interconnect all nodes. The disclosure is applicable to multiprocessor computer systems which utilize system memory distributed over more than one node and snooping of data states in each node which utilizes memory local to that node. Global snooping is used to provide a single point of serialization of data tags. A central crossbar controller examines cache state tags of a given address line for all nodes simultaneously and issues an appropriate reply back to a node requesting data while generating other data requests to any other node in the system for the purpose of maintaining cache coherence and supplying the requested data. The system utilizes memory local to each node by dividing such memory into local and remote categories which are mutually exclusive for any given cache line. The disclosure provides support for a third level remote cache for each node.
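The central lookup can be sketched for a read request. The state names "M" (modified) and "S" (shared), the function `global_snoop_read`, and the request name `forward_data` are assumptions borrowed from common coherence terminology; the abstract does not name the states or the messages.

```python
def global_snoop_read(tags, address, requester, home_node):
    """Examine all nodes' tags for one address and decide who supplies the data.

    tags maps address -> {node: state}. Returns (source_node, requests), where
    requests lists the (node, action) pairs sent to other nodes.
    """
    states = tags.setdefault(address, {})
    # Look across every node's tag for a modified copy held elsewhere.
    owner = next(
        (n for n, s in states.items() if s == "M" and n != requester), None
    )
    requests = []
    if owner is not None:
        requests.append((owner, "forward_data"))  # owner supplies the line
        states[owner] = "S"                       # and keeps a shared copy
        source = owner
    else:
        source = home_node  # no modified copy: the home node's memory supplies it
    states[requester] = "S"
    return source, requests
```

Because a single function updates the tags for every node, requests for the same address are serialized at one point, which is the role the abstract assigns to the central crossbar controller.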