摘要:
Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.
摘要:
Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.
摘要:
A method and apparatus are provided for efficiently managing limited resources is a given computer system. The system utilizes a token manager that assigns tokens to groups of associated requestors. The tokens are then utilized by the requesters to occupy the given resource. The allocation of these tokens, thus, prevents such problems as denial of service due to a lack of available resources.
摘要:
Embodiments of the invention provide methods and systems for reducing the consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced.
摘要:
In a first aspect, a first method of testing a link between a first chip and a second chip is provided. The first method includes the steps of, while operating in a test mode, (1) transmitting test data of sufficient length to enable exercising of worst case transitions from the first chip to the second chip via the link; and (2) performing cyclic redundancy checking (CRC) on the test data to test the link. Numerous other aspects are provided.
摘要:
Methods and apparatus provide for interconnecting one or more multiprocessors and one or more external devices through one or more configurable interface circuits, which are adapted for operation in: (i) a first mode to provide a coherent symmetric interface; or (ii) a second mode to provide a non-coherent interface.
摘要:
Methods and apparatus provide for sending a data command from a first of a plurality of devices to a first address concentrator within a first of a plurality of processing systems; selecting one of the other processing systems, the selected processing system having data addressed by the data command stored therein; sending the data command to a first address concentrator of the selected processing system; and broadcasting the data command from the first address concentrator of the selected processing system to a second address concentrator in each of the processing systems.
摘要:
A method for maintaining cache coherency for a multi-node system using a specialized bridge which allows for fewer forward progress dependencies. A local node makes a determination whether a request is a local or system request. If the request is a local request, a look-up of a directory in the local node is performed. If an entry in the directory of the local node indicates that data in the request does not have a remote owner and that the request does not have a remote destination, the coherency of the data is resolved on the local node, and a transfer of the data specified in the request is performed if required and if the request is a local request. If the entry indicates that the data has a remote owner or that the request has a remote destination, the request is forwarded to all remote nodes in the multi-node system.
摘要:
A system and method for flexible multiple protocols are presented. A device's logical layer may be dynamically configured on a per interface basis to communicate with external devices in a coherent or a non-coherent mode. In coherent mode, commands such as coherency protocol, system commands, and snoop response pass from the device's internal system bus to an external device, thereby creating a logical extension of the devices internal system bus. In non-coherent mode, the input-output bus unit receives commands from the internal system bus and generates non-coherent input-output commands, which are eventually received by an external device.
摘要:
In a first aspect, a first method of reducing command processing latency while maintaining memory coherence is provided. The first method includes the steps of (1) providing a memory map including memory addresses available to a system; and (2) arranging the memory addresses into a plurality of groups. At least one of the groups does not require the system, in response to a command that requires access to a memory address in the group from a bus unit, to get permission from all remaining bus units included in the system to maintain memory coherence. Numerous other aspects are provided.