Abstract:
A method, apparatus and program product for detecting a communication event in a distributed parallel data processing system in which a message is sent from an origin to a target. A low-level application programming interface (LAPI) is provided which has an operation for associating a counter with a communication event to be detected. The LAPI increments the counter upon the occurrence of the communication event. The number in the counter is monitored, and when the number increases, the event is detected. A completion counter in the origin is associated with the completion of a message being sent from the origin to the target. When the message is completed, LAPI increments the completion counter such that monitoring the completion counter detects the completion of the message. The completion counter may be used to ensure that a first message has been sent from the origin to the target and completed before a second message is sent.
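The counter mechanism can be illustrated with a minimal sketch. It is not the LAPI interface: `Counter`, `send`, and `send_in_order` are hypothetical names, and a worker thread stands in for the asynchronous transfer. The sender waits on a completion counter so that the first message completes before the second is sent.

```python
import threading


class Counter:
    """Hypothetical stand-in for a counter associated with a communication event."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def increment(self):
        # Called when the associated event (here: message completion) occurs.
        with self._cond:
            self._value += 1
            self._cond.notify_all()

    def wait_for(self, target, timeout=None):
        # Monitor the counter until it has reached `target`.
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= target, timeout)

    @property
    def value(self):
        with self._cond:
            return self._value


def send(message, log, completion_counter):
    """Simulated asynchronous send; bumps the counter once the message is delivered."""

    def deliver():
        log.append(message)
        completion_counter.increment()

    worker = threading.Thread(target=deliver)
    worker.start()
    return worker


def send_in_order(messages):
    """Send each message only after the previous one has completed."""
    log = []
    counter = Counter()
    workers = []
    for count, message in enumerate(messages, start=1):
        workers.append(send(message, log, counter))
        counter.wait_for(count)  # block until this message has completed
    for worker in workers:
        worker.join()
    return log, counter.value
```

Calling `send_in_order(["first", "second"])` returns the delivery log in send order together with the final counter value.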
Abstract:
Method, apparatus and program product for communicating from a node to a communications device. A Hardware Abstraction Layer (HAL) provides functions which can be called from user space in a node to access the communications device. An instance of HAL is created in the node. Device-specific characteristics from the communications device and a pointer pointing to HAL functions for accessing the communications device are obtained by HAL. HAL then opens multiple ports on the communications device using the functions pointed to by the pointer, and messages are sent between the node and the communications device. The messages thus sent are optimized with respect to the communications device as determined by the obtained device-specific characteristics. Multiple processes and protocol stacks may be associated with each port in a single instance of HAL. A further embodiment provides that multiple virtual ports may be associated with a port, with multiple protocol stacks associated with each virtual port. A further embodiment provides that multiple communications devices may be associated with a single instance of HAL.
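A rough sketch of the layering described above follows. Everything in it is an assumption made for illustration: `LoopbackDevice`, `DeviceInfo`, `Hal`, and the `max_packet_size` characteristic are hypothetical, and splitting messages to the device's packet size is only one possible reading of "optimized with respect to the communications device".

```python
from dataclasses import dataclass


@dataclass
class DeviceInfo:
    max_packet_size: int  # hypothetical device-specific characteristic


class LoopbackDevice:
    """Hypothetical device that queues whatever is sent to each port."""

    def __init__(self, max_packet_size):
        self._info = DeviceInfo(max_packet_size)
        self.queues = {}

    def query(self):
        # Return the characteristics plus a table of access functions
        # (the table plays the role of the function pointer in the abstract).
        return self._info, {"open": self._open_port, "send": self._send}

    def _open_port(self):
        port = len(self.queues)
        self.queues[port] = []
        return port

    def _send(self, port, packet):
        self.queues[port].append(bytes(packet))


class Hal:
    """One HAL instance bound to one device; may hold several open ports."""

    def __init__(self, device):
        self.info, self._fns = device.query()
        self.ports = []

    def open_port(self):
        port = self._fns["open"]()
        self.ports.append(port)
        return port

    def send(self, port, data):
        # Assumed optimization: fragment to the size the device reported.
        size = self.info.max_packet_size
        for start in range(0, len(data), size):
            self._fns["send"](port, data[start:start + size])
```

With a device created as `LoopbackDevice(4)`, sending ten bytes on one port queues three packets on that port and leaves any other open port untouched.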
Abstract:
A method, apparatus and program product for message communication in a distributed parallel data processing system. A user message is sent from a sender to a receiver. The user message contains user data and a pointer to a header handler routine. The header handler routine includes a first pointer to a target user buffer and a second pointer to a completion routine. When the user message is received, a low-level application program interface (LAPI) is informed and invokes the header handler routine, which returns the first and second pointers. LAPI then transfers the user data to the user buffer indicated by the header handler routine, and invokes the completion routine indicated by the header handler routine to complete the transfer of the user message to the receiver.
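The receive path can be sketched as follows. This is an illustration under stated assumptions, not the LAPI API: `make_receiver` and `dispatch` are hypothetical names, and a key into a handler table replaces the pointer to the header handler routine that the message carries.

```python
def make_receiver(buffer_size=16):
    """Build receiver-side state plus a header handler that exposes it."""
    state = {"buffer": bytearray(buffer_size), "completed": []}

    def completion_routine(nbytes):
        # Runs after the user data has been placed in the target buffer.
        state["completed"].append(nbytes)

    def header_handler(header):
        # Returns the two items named in the abstract:
        # the target user buffer and the completion routine.
        return state["buffer"], completion_routine

    return state, header_handler


def dispatch(message, handlers):
    """Simulated library-side processing of one received user message."""
    handler = handlers[message["handler_id"]]
    buffer, on_complete = handler(message["header"])
    data = message["data"]
    if len(data) > len(buffer):
        raise ValueError("user data does not fit in the target buffer")
    buffer[:len(data)] = data  # copy the user data into the user buffer
    on_complete(len(data))     # then invoke the completion routine
```

The design point is that the receiver's own code chooses the destination buffer at arrival time, so the transport layer needs no prior knowledge of where the data belongs.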
Abstract:
An efficient mechanism for sending messages without the use of intermediate copies (i.e., without the staging of data) is provided. In particular, an interface specification for users of a transport protocol is defined so as to lend itself to efficient implementations. The interface specification is a complete and robust set of user functions usable within systems desiring reliable and efficient zero copy transport protocols. Two methods are provided to accomplish the implementation of an efficient zero copy protocol. The first method is especially useful in systems where the network device has limited capabilities in terms of hardware, message fragmentation and message reassembly. An additional RDRAM memory allows data to reside in an adapter while handshake operations take place between an adapter and a node so as to specify the final destination of the data. The second method takes advantage of network devices with advanced features which are exploited for maximum efficiency.
Abstract:
To detect the arrival of duplicate data packets in an interconnected, multinode data processing system, each data packet is provided with a field of r bits that are randomly generated for each data packet. However, one of the packets is provided with a field that is computed from the other randomly generated field entries in a checksum computation which yields a selected nonzero checksum value. A running checksum at the receiver is used to determine whether or not, after the receipt of the specified number, k, of data packets, a duplicate packet has been received.
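A minimal sketch of the scheme follows, assuming XOR as the checksum; the abstract does not name the checksum function, and `make_tags` and `duplicate_suspected` are hypothetical names. The sender draws k - 1 random r-bit tags and computes the last one so that all k tags XOR to the selected nonzero value; the receiver checks its running XOR after k arrivals.

```python
import secrets
from functools import reduce
from operator import xor


def make_tags(k, r, target):
    """Return k r-bit tags whose XOR equals the selected nonzero `target`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0 < target < 2 ** r:
        raise ValueError("target must be a nonzero r-bit value")
    tags = [secrets.randbits(r) for _ in range(k - 1)]
    # The final tag is computed, not random: it forces the overall XOR to `target`.
    tags.append(reduce(xor, tags, target))
    return tags


def duplicate_suspected(received_tags, target):
    """Check the running checksum after the expected k packets have arrived."""
    return reduce(xor, received_tags, 0) != target
```

Under this XOR assumption, a duplicate that takes the place of a missing packet goes unnoticed only when the two packets happen to carry the same tag, which for uniformly random tags occurs with probability about 2^-r.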
Abstract:
In a multinode data processing system in which nodes exchange information over a network or through a switch, a structure and mechanism are provided within the realm of Remote Direct Memory Access (RDMA) operations in which DMA operations are present on one side of the transfer but not the other. On the side in which the transfer is not carried out in DMA fashion, transfer processing is carried out under program control; this is in contrast to the transfer on the DMA side, which is characteristically carried out in hardware. These combined processes are useful in programming situations where RDMA is carried out to or from contiguous locations in memory on one side and where the memory locations on the other side are noncontiguous. This split mode of transfer is provided both for read and for write operations.
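The split between the two sides can be sketched in a simplified form. All names here are hypothetical, and `dma_write` is an ordinary function standing in for the hardware transfer: the program-controlled side gathers or scatters noncontiguous segments, while the DMA side only ever handles one contiguous block.

```python
def software_gather(memory, segments):
    """Program-controlled side: collect (offset, length) segments into one block."""
    return b"".join(bytes(memory[off:off + n]) for off, n in segments)


def software_scatter(memory, segments, payload):
    """Program-controlled side: spread one contiguous block over the segments."""
    if sum(n for _, n in segments) != len(payload):
        raise ValueError("segment lengths must add up to the payload length")
    pos = 0
    for off, n in segments:
        memory[off:off + n] = payload[pos:pos + n]
        pos += n


def dma_write(target_memory, offset, payload):
    """Stand-in for the DMA side: a single contiguous write."""
    if offset + len(payload) > len(target_memory):
        raise ValueError("write would run past the end of target memory")
    target_memory[offset:offset + len(payload)] = payload
```

Gathering on the sender and writing once corresponds to a transfer from noncontiguous to contiguous memory; a contiguous read followed by `software_scatter` covers the opposite direction.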
Abstract:
In a transmission protocol in which a user running an application in an address space in one data processing system wishes to transmit a data packet to another address space in another data processing system by means of direct memory access directly from a sending buffer to a receiving buffer with no copy, a mechanism is provided for minimizing the need for retransmission and for ensuring proper entry into the target data processing system address space. In particular, when the first system does not receive an acknowledgment from the receiver, a special data packet with a retransmit flag bit set is sent to the second system. When the second system receives the data packet with the retransmit flag bit set, it responds either by sending a new acknowledgment or by sending a request for retransmission. No transmission back to the first system occurs, however, before such a request is made, and in fact the receiving system does not send this retransmission request without ensuring that its receipt would be appropriate. In particular, the second system, before requesting retransmission, checks that the tag association is still valid, so that an adapter at the second system is still capable of matching tags in data packet headers with appropriate real address memory locations within address spaces belonging to the second, receiving data processing system. In this manner needless retransmission of packets does not occur, and retransmission occurs only when receipt of the data packet is appropriate.
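The receiver's decision on seeing a packet with the retransmit flag set can be sketched as below. The names `Reply` and `handle_probe` are hypothetical. The abstract does not say what the receiver does when the tag association is no longer valid; sending nothing in that case is an assumption of this sketch.

```python
from enum import Enum


class Reply(Enum):
    ACK = "ack"
    RETRANSMIT_REQUEST = "retransmit_request"


def handle_probe(packet_id, tag, received_ids, valid_tags):
    """Decide how to answer a packet that arrived with the retransmit flag set.

    received_ids: ids of packets whose data already landed in the target buffer.
    valid_tags:   tags the adapter can still map to real memory locations.
    """
    if packet_id in received_ids:
        # The data arrived earlier; only the acknowledgment was lost.
        return Reply.ACK
    if tag in valid_tags:
        # The data is missing and the adapter can still place it: ask for it again.
        return Reply.RETRANSMIT_REQUEST
    # Assumption: with a stale tag the data could not be placed, so no request is sent.
    return None
```

Because the probe itself carries no payload in this sketch, the data crosses the network a second time only after the receiver has confirmed it can still place it.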
Abstract:
A method is provided for transferring data between first and second nodes of a network. Such method includes requesting first data to be transferred by a first upper layer protocol (ULP) operating on the first node of the network; and buffering second data for transfer to the second node by a lower protocol layer lower than the first ULP, the second data including an integral number of standard size units of data including the first data. The method further includes posting the second data to the network for delivery to the second node; receiving the second data at the second node; and from the received data, delivering the first data to a second ULP operating on the second node. The method is of particular application when transferring the data in unit size is faster than transferring the data in other than unit size.
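As an illustration of the buffering step, the sketch below pads the requested data out to a whole number of fixed-size units before posting it, and recovers the original data on the receiving side. The 64-byte unit, the 4-byte length prefix, and the names `pack_units` and `unpack_units` are assumptions; the abstract does not say how the receiver locates the first data inside the second data.

```python
import struct

UNIT = 64                      # hypothetical standard unit size in bytes
_LENGTH = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix


def pack_units(payload, unit=UNIT):
    """Lower layer: wrap `payload` in a buffer that is a whole number of units."""
    framed = _LENGTH.pack(len(payload)) + payload
    padded_len = -(-len(framed) // unit) * unit  # round up to a multiple of unit
    return framed.ljust(padded_len, b"\x00")


def unpack_units(buffer):
    """Receiving side: strip the padding and hand back only the original data."""
    (length,) = _LENGTH.unpack_from(buffer, 0)
    return bytes(buffer[_LENGTH.size:_LENGTH.size + length])
```

A 100-byte request becomes a 128-byte transfer (two 64-byte units), and `unpack_units` returns the original 100 bytes to the receiving upper layer protocol.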