Abstract:
A method, apparatus and program product for detecting a communication event in a distributed parallel data processing system in which a message is sent from an origin to a target. A low-level application programming interface (LAPI) is provided which has an operation for associating a counter with a communication event to be detected. LAPI increments the counter upon the occurrence of the communication event. The number in the counter is monitored, and when the number increases, the event is detected. A completion counter in the origin is associated with the completion of a message being sent from the origin to the target. When the message is completed, LAPI increments the completion counter such that monitoring the completion counter detects the completion of the message. The completion counter may be used to ensure that a first message has been sent from the origin to the target and completed before a second message is sent.
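The counter mechanism can be illustrated with a small sketch. This is not the LAPI interface itself: the `Counter` class, the `send` helper and the thread-based delivery are hypothetical stand-ins, assumed here only to show how waiting on a completion counter orders two messages.

```python
import threading


class Counter:
    """Hypothetical event counter: incremented on an event, monitored by waiters."""

    def __init__(self):
        self._value = 0
        self._cond = threading.Condition()

    def increment(self):
        with self._cond:
            self._value += 1
            self._cond.notify_all()

    def wait_until(self, target, timeout=None):
        # Block until the counter has reached `target` (the event was detected).
        with self._cond:
            return self._cond.wait_for(lambda: self._value >= target, timeout)

    @property
    def value(self):
        with self._cond:
            return self._value


def send(message, target_log, completion_counter):
    """Hypothetical asynchronous send: deliver, then bump the completion counter."""

    def deliver():
        target_log.append(message)       # message arrives at the target
        completion_counter.increment()   # completion event is signalled at the origin

    worker = threading.Thread(target=deliver)
    worker.start()
    return worker


def send_in_order(messages):
    """Send each message only after the previous one has completed."""
    log = []
    counter = Counter()
    for i, message in enumerate(messages, start=1):
        worker = send(message, log, counter)
        counter.wait_until(i)  # monitor the counter before sending the next message
        worker.join()
    return log, counter.value
```

Because the origin waits for the counter to reach `i` before issuing message `i + 1`, the second message cannot overtake the first in this model.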
Abstract:
Method, apparatus and program product for communicating from a node to a communications device. A Hardware Abstraction Layer (HAL) provides functions which can be called from user space in a node to access the communications device. An instance of HAL is created in the node. Device specific characteristics from the communications device and a pointer pointing to HAL functions for accessing the communications device are obtained by HAL. HAL then opens multiple ports on the communications device using the functions pointed to by the pointer, and messages are sent between the node and the communications device. The messages thus sent are optimized with respect to the communications device as determined by the obtained device specific characteristics. Multiple processes and protocol stacks may be associated with each port in a single instance of HAL. A further embodiment provides that multiple virtual ports may be associated with a port, with multiple protocol stacks associated with each virtual port. A further embodiment provides that multiple communications devices may be associated with a single instance of HAL.
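A rough sketch of the layering described above follows. Every name in it (`DeviceInfo`, `DeviceFunctions`, `FakeDevice`, `Hal`) is hypothetical, and treating the maximum packet size as the device specific characteristic that drives the optimization is an assumption made for illustration; the abstract does not say which characteristics are used.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DeviceInfo:
    name: str
    max_packet_size: int  # assumed device specific characteristic


@dataclass
class DeviceFunctions:
    """Stand-in for the pointer to device access functions obtained by HAL."""
    open_port: Callable[[int], None]
    write_packet: Callable[[int, bytes], None]


class FakeDevice:
    """Hypothetical communications device used only to exercise the sketch."""

    def __init__(self, name, max_packet_size):
        self.info = DeviceInfo(name, max_packet_size)
        self.ports = {}

    def functions(self):
        return DeviceFunctions(self._open_port, self._write_packet)

    def _open_port(self, port):
        self.ports[port] = []

    def _write_packet(self, port, packet):
        if len(packet) > self.info.max_packet_size:
            raise ValueError("packet exceeds device limit")
        self.ports[port].append(packet)


class Hal:
    """One HAL instance: holds device characteristics and the function table."""

    def __init__(self, device):
        self.info = device.info       # obtain device specific characteristics
        self.fn = device.functions()  # obtain the access functions

    def open_ports(self, ports):
        for port in ports:
            self.fn.open_port(port)

    def send(self, port, data):
        # Split the message to fit the device's packet size.
        size = self.info.max_packet_size
        for start in range(0, len(data), size):
            self.fn.write_packet(port, data[start:start + size])
```

Callers above `Hal` never touch the device directly, so a different device with a different packet size can be substituted without changing them.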
Abstract:
A method, apparatus and program product for message communication in a distributed parallel data processing system. A user message is sent from a sender to a receiver. The user message contains user data and a pointer to a header handler routine. The header handler routine includes a first pointer to a target user buffer and a second pointer to a completion routine. When the user message is received, a low-level application programming interface (LAPI) is informed and invokes the header handler routine, which returns the first and second pointers. LAPI then transfers the user data to the user buffer indicated by the header handler routine, and invokes the completion routine indicated by the header handler routine to complete the transfer of the user message to the receiver.
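The receive path can be sketched as follows. The `Lapi` class and the dictionary message layout are hypothetical; a registered handler name stands in for the pointer to the header handler routine, and Python object references stand in for the buffer and completion routine pointers.

```python
class Lapi:
    """Hypothetical receiver-side dispatcher, not the real LAPI interface."""

    def __init__(self):
        self.handlers = {}

    def register(self, name, handler):
        self.handlers[name] = handler

    def receive(self, message):
        # 1. Invoke the header handler named in the message.
        handler = self.handlers[message["handler"]]
        buffer, completion = handler(message["header"])
        # 2. Transfer the user data into the buffer the handler returned.
        data = message["data"]
        buffer[:len(data)] = data
        # 3. Invoke the completion routine the handler returned.
        completion(buffer)


def demo():
    lapi = Lapi()
    user_buffer = bytearray(8)
    completed = []

    def on_complete(buf):
        completed.append(bytes(buf))

    def header_handler(header):
        # Returns the target user buffer and the completion routine.
        return user_buffer, on_complete

    lapi.register("hh", header_handler)
    lapi.receive({"handler": "hh", "header": {"length": 5}, "data": b"hello"})
    return user_buffer, completed
```

In this model the receiver chooses the destination buffer only when the header arrives, so the data can be copied straight into user memory without an intermediate staging buffer.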
Abstract:
To detect the arrival of duplicate data packets in an interconnected, multinode data processing system, each data packet is provided with a field of r bits that are randomly generated for that packet. One of the packets, however, is provided with a field computed from the other randomly generated field entries so that a checksum computation over all of the fields yields a selected nonzero checksum value. After the specified number, k, of data packets has been received, a running checksum at the receiver is used to determine whether a duplicate packet has been received.
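A minimal sketch of the scheme, assuming the checksum is a bitwise XOR over the r-bit fields (the abstract does not specify the checksum function, and the function names here are hypothetical):

```python
import random
from functools import reduce
from operator import xor


def make_fields(k, r, checksum, rng):
    """Return k r-bit fields whose XOR equals the selected nonzero checksum."""
    if not 0 < checksum < (1 << r):
        raise ValueError("checksum must be a nonzero r-bit value")
    fields = [rng.getrandbits(r) for _ in range(k - 1)]  # k-1 random fields
    fields.append(reduce(xor, fields, checksum))          # one computed field
    return fields


def looks_clean(received_fields, checksum):
    """Running-checksum test at the receiver after k packets have arrived."""
    return reduce(xor, received_fields, 0) == checksum
```

Under the XOR assumption, if one packet is replaced by a duplicate of another, the running checksum becomes the selected value XOR the missing field XOR the duplicated field. That matches the selected value only when the two fields happen to be equal, which for independent random r-bit fields has probability 2^-r.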
Abstract:
Processing within a device is controlled in order to avoid a deadlock situation. A local request engine of the device determines prior to making a request whether the port of the device that is to service the request is making forward progress in processing other requests. If forward progress is being made, then the request is forwarded to the port. Otherwise, the request is held. This avoids a deadlock situation and allows the device to remain operative even in partial recovery situations.
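The hold-or-forward decision can be modelled in a few lines. The `Port` and `RequestEngine` classes are hypothetical, and the progress test used here (the port is idle, or its completion count has advanced since the last check) is an assumption; the abstract does not define how forward progress is measured.

```python
from collections import deque


class Port:
    """Hypothetical port that services queued requests one step at a time."""

    def __init__(self):
        self.queue = deque()
        self.completed = 0
        self.stalled = False

    def submit(self, request):
        self.queue.append(request)

    def step(self):
        if not self.stalled and self.queue:
            self.queue.popleft()
            self.completed += 1


class RequestEngine:
    """Local request engine that holds requests while the port is not progressing."""

    def __init__(self, port):
        self.port = port
        self.held = deque()
        self._last_completed = port.completed

    def _making_progress(self):
        # Assumed test: port is idle, or it completed work since the last check.
        progressed = (not self.port.queue) or self.port.completed > self._last_completed
        self._last_completed = self.port.completed
        return progressed

    def request(self, req):
        self.held.append(req)
        self.flush()

    def flush(self):
        # Forward held requests only if the port is making forward progress.
        if self._making_progress():
            while self.held:
                self.port.submit(self.held.popleft())
```

Held requests stay inside the engine rather than piling up behind a stuck port, so the rest of the device can keep operating until the port recovers and `flush` is called again.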
Abstract:
Programmable hardware devices are re-programmed without system downtime. To re-program the device, the device is quiesced, state associated with the device is saved, updates are loaded, the state is restored and operations are resumed, all transparent to the system, except for a possible delay in the system.
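The sequence of steps can be sketched in software. The `Device` class and its fields are hypothetical, and the assumption that loading an update wipes the device state is made only to show why the state is saved and restored.

```python
import copy


class Device:
    """Hypothetical programmable device with some in-flight state."""

    def __init__(self, firmware):
        self.firmware = firmware
        self.state = {"pending": []}
        self.quiesced = False

    def handle(self, op):
        if self.quiesced:
            raise RuntimeError("device is quiesced")
        self.state["pending"].append(op)


def reprogram(device, new_firmware):
    device.quiesced = True                 # 1. quiesce the device
    saved = copy.deepcopy(device.state)    # 2. save its state
    try:
        device.firmware = new_firmware     # 3. load the update
        device.state = {"pending": []}     #    (assumed: loading clears state)
        device.state = saved               # 4. restore the saved state
    finally:
        device.quiesced = False            # 5. resume operations
```

A caller that issues an operation during the quiesced window would have to retry or wait in this model, which corresponds to the possible delay mentioned above.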
Abstract:
A task obtained by a communications processor is decomposed into one or more requests that form a request group. The requests of the request group are sent to main memory and responses to those requests are expected. There may be requests for a plurality of request groups being processed concurrently. However, responses to the request groups are to be returned to the communications processor in the order in which the request groups were sent from the communications processor. To ensure this ordering, dependencies between the request groups are tracked by hardware coupled to the communications processor.
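The ordering rule can be illustrated with a software model. The abstract describes tracking done in hardware; the `GroupTracker` class below is a hypothetical stand-in that only shows the intended behaviour: a request group is released to the communications processor once all of its responses have arrived and every earlier group has already been released.

```python
from collections import OrderedDict


class GroupTracker:
    """Hypothetical model of in-order release of request groups."""

    def __init__(self):
        # Insertion order records the order in which groups were sent.
        self._groups = OrderedDict()

    def issue(self, group_id, n_requests):
        self._groups[group_id] = {"expected": n_requests, "responses": []}

    def on_response(self, group_id, payload):
        self._groups[group_id]["responses"].append(payload)
        return self._release()

    def _release(self):
        # Release complete groups from the front; stop at the first incomplete one.
        released = []
        while self._groups:
            group_id, group = next(iter(self._groups.items()))
            if len(group["responses"]) < group["expected"]:
                break
            released.append((group_id, group["responses"]))
            del self._groups[group_id]
        return released
```

For example, if group B completes before group A, `on_response` returns nothing for B; B is released together with A once A's last response arrives.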