摘要:
A method is provided for operating a communications adapter employed in a multinode data processing system in a fashion which enhances the performance of remote direct memory access data transfers. The system is provided with pointers and a table which are employed to determine whether or not an address which has been supplied for the transfer has already been mapped to a real address at the source or destination node. The table is also preferably provided with counters which can be incremented or decremented to enable the use of least recently used mechanisms at the upper level protocol layers to more efficiently control the setting and resetting of table entries.
摘要:
In remote direct memory access (RDMA) transfers in a multinode data processing system in which the nodes communicate with one another through communication adapters coupled to a switch or network, there is a need for the system to ensure efficient memory protection mechanisms across jobs. A method is thus desired for addressing virtual memory on local and remote servers that is independent of the process ID on the local and/or remote node. The use of global Translation Control Entry (TCE) tables that are accessed/owned by RDMA jobs and are managed by a device driver in conjunction with a Protocol Virtual Offset (PVO) address format solves this problem.
摘要:
In remote direct memory access transfers in a multinode data processing system in which the nodes communicate with one another through communication adapters coupled to a switch or network, failures in the nodes or in the communication adapters can produce the phenomenon known as trickle traffic, which is data that has been received from the switch or from the network that is stale but which may have all the signatures of a valid packet data. The present invention addresses the trickle traffic problem in two situations: node failure and adapter failure. In the node failure situation randomly generated keys are used to reestablish connections to the adapter while providing a mechanism for the recognition of stale packets. In the adapter failure situation, a round robin context allocation approach is used with adapter state contexts being provided with state information which helps to identify stale packets.
摘要:
In a multinode data processing system in which nodes exchange information over a network or through a switch, a structure and mechanism is provided within the realm of Remote Direct Memory Access (RDMA) operations in which DMA operations are present on one side of the transfer but not the other. On the side in which the transfer is not carried out in DMA fashion, transfer processing is carried out under program control; this is in contrast to the transfer on the DMA side which is characteristically carried out in hardware. Usage of these combination processes is useful in programming situations where RDMA is carried out to or from contiguous locations in memory on one side and where memory locations on the other side is noncontiguous. This split mode of transfer is provided both for read and for write operations.
摘要:
A data processing system includes a mechanism for completing an asynchronous memory move (AMM) operation in which the processor receives an AMM ST instruction and processes a processor-level move of data in virtual address space and an asynchronous memory mover then completes a physical move of the data within the real address space (memory). A status/control field of the AMM ST instruction includes an indication of a requested treatment of the lower level cache(s) on completion of the AMM operation. When the status/control field indicates an update to at least one cache should be performed, the asynchronous memory mover automatically forwards a copy of the data from the data move to the lower level cache, and triggers an update of a coherency state for a cache line in which the copy of the data is placed.
摘要:
A method and data processing system for performing fence operations within a global shared memory (GSM) environment having a local task executing on a processor and providing GSM commands for processing by a host fabric interface (HFI) window that is allocated to the task. The HFI window has one or more registers for use during local fence operations. A first register tracks a first count of task-issued GSM commands, and a second register tracks a second count of GSM operations being processed by the HFI. The processing logic detects a locally-issued fence operation, and responds by performing a series of operations, including: automatically stopping the task from issuing additional GSM commands; monitoring for completion of all the task-issued GSM commands at the HFI; and triggering a resumption of issuance of GSM commands by the task when the completion of all previous task-issued GSM commands is registered by the HFI.
摘要:
Epoch numbers are maintained in a pair wise fashion at a plurality of communication endpoints to provide communication consistency and recovery from a range of failure conditions including total or partial node failure and subsequent recovery. Once an epoch state inconsistency is recognized, negotiation procedures provide an effective mechanism to reestablish valid communication links without the need to employ global variables which inherently possess greater transmission and overhead requirements needed to maintain communications. Renegotiation of recognizably valid epoch numbers occurs on a pair wise basis.
摘要:
A data processing system has a processor and a memory coupled to the processor and an asynchronous memory mover coupled to the processor. The asynchronous memory mover has registers for receiving a set of parameters from the processor, which parameters are associated with an asynchronous memory move (AMM) operation initiated by the processor in virtual address space, utilizing a source effective address and a destination effective address. The asynchronous memory mover performs the AMM operation to move the data from a first physical memory location having a source real address corresponding to the source effective address to a second physical memory location having a destination real address corresponding to the destination effective address. The asynchronous memory mover has an associated off-chip translation mechanism. The AMM operation thus occurs independent of the processor, and the processor continues processing other operations independent of the AMM operation.
摘要:
A data processing system has a processor, a memory, and an instruction set architecture (ISA) that includes: an asynchronous memory mover (AMM) store (ST) instruction that initiates an asynchronous memory move operation that moves data from a first memory location having a first real address to a second memory location having a second real address by: (a) first performing a move of the data in virtual address space utilizing a source effective address a destination effective address; and (b) when the move is completed, completing a physical move of the data to the second memory location, independent of the processor. The ISA further provides an AMM terminate ST instruction for stopping an ongoing AMM operation before completion of the AMM operation, and a LD CMP instruction for checking a status of an AMM operation.
摘要:
A method of operating a data processing system includes each of multiple tasks within a parallel job executing on multiple nodes of the data processing system issuing a respective system call to request reservation, without allocation of backing storage in physical memory, of a global address space defined by a range of effective addresses as global shared memory accessible to all of the multiple tasks within the parallel job. At least two of the tasks within the parallel job allocate global address spaces including a same effective address.