Abstract:
Shared locks are employed for controlling a thread that extends across more than one protocol layer in a data processing system. A counter is used as part of a data structure that makes it possible to implement shared locks across multiple layers. The use of shared locks avoids the processing overhead usually associated with lock acquisition and release. The controlled thread may be initiated in either an upper-layer protocol or a lower layer.
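One way to picture a counter-based lock shared across layers is sketched below. This is a minimal illustration and not the patented data structure: the class `SharedLayerLock`, its `enter`/`leave` methods, and the two layer functions are hypothetical names. The assumption is that a layer entered by a thread that already holds the lock only increments a counter instead of acquiring the underlying mutex again.

```python
import threading


class SharedLayerLock:
    """Hypothetical lock shared by the protocol layers a single thread passes through."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._owner = None      # ident of the thread that currently holds the mutex
        self._depth = 0         # counter: number of layers currently inside the lock
        self.acquisitions = 0   # how many times the real mutex was actually taken

    def enter(self):
        me = threading.get_ident()
        if self._owner == me:
            # Already held by this thread in another layer: count, do not re-acquire.
            self._depth += 1
            return
        self._mutex.acquire()
        self._owner = me
        self._depth = 1
        self.acquisitions += 1

    def leave(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("lock not held by this thread")
        self._depth -= 1
        if self._depth == 0:
            # The outermost layer is leaving: release the real mutex.
            self._owner = None
            self._mutex.release()


def lower_layer_send(lock):
    """Lower layer; may be called directly or from the upper layer."""
    lock.enter()
    try:
        return "sent"
    finally:
        lock.leave()


def upper_layer_send(lock):
    """Upper layer; holds the lock while calling down into the lower layer."""
    lock.enter()
    try:
        return lower_layer_send(lock)
    finally:
        lock.leave()
```

In this sketch a call that starts in the upper layer takes the mutex once for both layers, while a call that starts in the lower layer takes it once on its own.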
Abstract:
Messages arriving at a receiver are managed to ensure proper ordering of the messages. To facilitate proper ordering, a message sequence number is used, as well as matching criteria to match a correctly sequenced message with a posted receive. In response to processing a message, a check is made as to whether previously out-of-order messages can now be processed.
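The ordering scheme can be sketched roughly as follows, assuming a single sender, sequence numbers that start at 0, and a tag as the only matching criterion. The class `OrderedReceiver` and all of its field names are hypothetical and chosen for illustration.

```python
class OrderedReceiver:
    """Hypothetical receiver that processes messages strictly in sequence order."""

    def __init__(self):
        self.expected_seq = 0    # next sequence number that may be processed
        self.out_of_order = {}   # seq -> (tag, payload), held until in sequence
        self.posted = []         # tags of posted receives, in posting order
        self.unexpected = []     # in-sequence messages with no matching receive yet
        self.delivered = []      # (tag, payload) pairs matched to a posted receive

    def post_receive(self, tag):
        # First try to match an in-sequence message that arrived earlier.
        for i, (mtag, payload) in enumerate(self.unexpected):
            if mtag == tag:
                del self.unexpected[i]
                self.delivered.append((mtag, payload))
                return
        self.posted.append(tag)

    def on_arrival(self, seq, tag, payload):
        if seq < self.expected_seq:
            return                                   # stale duplicate: ignore
        if seq > self.expected_seq:
            self.out_of_order[seq] = (tag, payload)  # too early: hold it
            return
        self._process(tag, payload)
        # After processing, check whether held messages are now in sequence.
        while self.expected_seq in self.out_of_order:
            held_tag, held_payload = self.out_of_order.pop(self.expected_seq)
            self._process(held_tag, held_payload)

    def _process(self, tag, payload):
        self.expected_seq += 1
        if tag in self.posted:
            self.posted.remove(tag)
            self.delivered.append((tag, payload))
        else:
            self.unexpected.append((tag, payload))
```

If message 1 arrives before message 0, it is held; once message 0 is processed, the loop at the end of `on_arrival` releases message 1 as well.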
Abstract:
A method for transparently handling messages originating from local shared memory and from an external source. A device driver allows the local sender to identify and wake up a waiting receiver task thread, simulating a packet arrival hardware interrupt. Upon awakening, the receiver task thread examines both shared memory and hardware message queues. The method can use a software routine that simulates the handling of a hardware interrupt. The method invokes a local notify system service module that passes a window number identifying a receiving task. The method invokes a wake thread module that awakens a thread associated with the window number and examines the shared memory buffer for receipt of the local source message. The method then copies the local source message from the shared memory buffer to the receiving task.
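The wake-up path might look like the sketch below. It is only an analogy in user-space Python: a `threading.Event` stands in for the device-driver wake-up, and two deques stand in for the shared memory buffer and the hardware message queue. The names `Window`, `windows`, `local_notify`, `send_local`, and `receive` are hypothetical.

```python
import threading
from collections import deque


class Window:
    """Hypothetical per-task window: one wake-up flag and two message sources."""

    def __init__(self):
        self.wakeup = threading.Event()   # stands in for the simulated interrupt
        self.shared_memory = deque()      # messages from senders on the same node
        self.hardware_queue = deque()     # messages from the external source


windows = {}  # window number -> Window


def local_notify(window_number):
    """Wake the receiver thread that is associated with the window number."""
    windows[window_number].wakeup.set()


def send_local(window_number, message):
    """Local sender: place the message in shared memory, then notify."""
    windows[window_number].shared_memory.append(message)
    local_notify(window_number)


def receive(window_number, timeout=1.0):
    """Receiver: sleep until woken, then examine both message sources."""
    window = windows[window_number]
    if not window.wakeup.wait(timeout):
        return []
    window.wakeup.clear()
    received = []
    while window.shared_memory:
        received.append(("local", window.shared_memory.popleft()))
    while window.hardware_queue:
        received.append(("external", window.hardware_queue.popleft()))
    return received
```

Because the receiver drains both queues after every wake-up, it does not need to know whether the wake-up came from a local sender or from a real packet arrival.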
Abstract:
A method for implementing Message Passing Interface (MPI-2) one-sided communication by using Low-level Applications Programming Interface (LAPI) active messaging capabilities, including providing at least three data transfer types, one of which is used to send a message with a message header greater than one packet where Data Gather and Scatter Programs (DGSP) are placed as part of the message header; allowing a multi-packet header by using a LAPI data transfer type; sending the DGSP and data as one message; reading the DGSP with a header handler; registering the DGSP with the LAPI to allow the LAPI to scatter the data to one or more memory locations; defining two sets of counters, one counter set for keeping track of a state of a prospective communication partner, and another counter set for recording activities of local and Remote Memory Access (RMA) operations; comparing local and remote counts of completed RMA operations to complete synchronization mechanisms; and creating an mpci_wait_loop function.
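The counter comparison used to complete synchronization can be illustrated loosely as below. This sketch does not use LAPI or MPI and does not reproduce the counter layout of the method. It assumes that each origin announces how many RMA operations it issued, and that the target treats synchronization as complete once its count of completed operations equals that number. `RmaCounters` and its methods are hypothetical names.

```python
class RmaCounters:
    """Hypothetical per-task counters for one RMA window."""

    def __init__(self, peers):
        self.peers = list(peers)
        self.issued = {p: 0 for p in self.peers}        # ops this task sent to each peer
        self.completed = {p: 0 for p in self.peers}     # ops from each peer finished here
        self.announced = {p: None for p in self.peers}  # count each peer reports having issued

    def record_issue(self, target):
        self.issued[target] += 1

    def record_completion(self, origin):
        self.completed[origin] += 1

    def record_announcement(self, origin, count):
        self.announced[origin] = count

    def sync_done(self):
        # Complete only when every peer has announced its count and the
        # local completion count has caught up with it.
        return all(
            self.announced[p] is not None and self.completed[p] == self.announced[p]
            for p in self.peers
        )
```

A wait loop would poll `sync_done()` while making communication progress. How `mpci_wait_loop` actually does this is not described in the abstract.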
Abstract:
A checkpoint of a parallel program is taken in order to provide a consistent state of the program in the event the program is to be restarted. Each process of the parallel program is responsible for taking its own checkpoint; however, the timing of the checkpoint for each process is the responsibility of a coordinating process. During checkpointing, various data is written to a checkpoint file. This data includes, for instance, in-transit message data, a data section, file offsets, signal state, executable information, stack contents and register contents. The checkpoint file can be stored in either local or global storage. When it is stored in global storage, migration of the program is facilitated. When a parallel program is to be restarted, each process of the program initiates its own restart. The restart logic restores the process to the state at which the checkpoint was taken.
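The division of labor between the coordinator and the individual processes can be sketched as follows. This is a toy model that saves only a few Python fields to a JSON file. Stack, register, and signal state are out of reach of such a sketch. The names `Process`, `take_checkpoint`, `restart`, and `coordinate_checkpoint`, and the file naming scheme, are hypothetical.

```python
import json
import os


class Process:
    """Hypothetical process that writes and restores its own checkpoint file."""

    def __init__(self, rank, directory):
        self.rank = rank
        self.path = os.path.join(directory, f"ckpt.{rank}.json")
        self.data = {}          # stands in for the data section
        self.in_transit = []    # messages sent to this process but not yet consumed
        self.file_offset = 0    # stands in for open-file offsets

    def take_checkpoint(self):
        state = {
            "data": self.data,
            "in_transit": self.in_transit,
            "file_offset": self.file_offset,
        }
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        # Rename last so that a crash never leaves a half-written checkpoint.
        os.replace(tmp_path, self.path)

    def restart(self):
        with open(self.path) as f:
            state = json.load(f)
        self.data = state["data"]
        self.in_transit = state["in_transit"]
        self.file_offset = state["file_offset"]


def coordinate_checkpoint(processes):
    """Coordinator: decides when; each process still writes its own file."""
    for process in processes:
        process.take_checkpoint()
```

Placing `directory` on storage that every node can reach corresponds to the global-storage case, in which a process could be restarted on a different node.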
Abstract:
Data is written to an unsegmented buffer located within shared memory. While data is being written to the unsegmented buffer, at least a portion of the data is being read from the buffer. A counter is used to indicate how much space is available in the buffer to receive data. Further, the counter is employed to ensure that the reader does not advance beyond the writer.
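A single counter of free space is enough to bound both the writer and the reader, as the sketch below tries to show. It is a single-threaded illustration over a `bytearray`, not real shared memory, where the counter would have to be updated atomically. The class `StreamBuffer` and its fields are hypothetical names.

```python
class StreamBuffer:
    """Hypothetical unsegmented circular buffer governed by one free-space counter."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.free = capacity   # the counter: bytes currently available for writing
        self.write_pos = 0
        self.read_pos = 0

    def write(self, data):
        """Write as many bytes as fit; return the number of bytes written."""
        n = min(len(data), self.free)
        for i in range(n):
            self.buf[(self.write_pos + i) % self.capacity] = data[i]
        self.write_pos = (self.write_pos + n) % self.capacity
        self.free -= n
        return n

    def read(self, max_bytes):
        """Read up to max_bytes, never more than the writer has produced."""
        n = min(max_bytes, self.capacity - self.free)
        out = bytes(self.buf[(self.read_pos + i) % self.capacity] for i in range(n))
        self.read_pos = (self.read_pos + n) % self.capacity
        self.free += n
        return out
```

The writer is limited by `free`, and the reader is limited by `capacity - free`, so the reader cannot advance past the writer, and reading can begin before the writer has finished.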