Abstract:
A message flow controller limits a process from passing a new message in a reliable message passing layer from a source node to at least one destination node while a total number of in-flight messages for the process meets a first level limit. The message flow controller limits the new message from passing from the source node to a particular destination node from among a plurality of destination nodes while a total number of in-flight messages to the particular destination node meets a second level limit. Responsive to the total number of in-flight messages to the particular destination node not meeting the second level limit, the message flow controller only sends a new packet from among at least one packet for the new message to the particular destination node while a total number of in-flight packets for the new message is less than a third level limit.
Abstract:
A method and data processing system for tracking global shared memory (GSM) operations to and from a local node configured with a host fabric interface (HFI) coupled to a network fabric. During task/job initialization, the system OS assigns HFI window(s) to handle the GSM packet generation and GSM packet receipt and processing for each local task. HFI processing logic automatically tags each GSM packet generated by the HFI window with a global job identifier (ID) of the job to which the local task is affiliated. The job ID is embedded within each GSM packet placed on the network fabric. On receipt of a GSM packet from the network fabric, the HFI logic retrieves the embedded job ID and compares the embedded job ID with the ID within the HFI window(s). GSM packets are forwarded to an HFI window only when the embedded job ID matches the HFI window's job ID.
Abstract:
A method for issuing global shared memory (GSM) operations from an originating task on a first node coupled to a network fabric of a distributed network via a host fabric interface (HFI). The originating task generates a GSM command within an effective address (EA) space. The task then places the GSM command within a send FIFO. The send FIFO is a portion of real memory having real addresses (RA) that are memory mapped to EAs of a globally executing job. The originating task maintains a local EA-to-RA mapping of only a portion of the real address space of the globally executing job. The task enables the HFI to retrieve the GSM command from the send FIFO into an HFI window allocated to the originating task. The HFI window generates a corresponding GSM packet containing GSM operations and/or data, and the HFI window issues the GSM packet to the network fabric.
Abstract:
This present invention provides a portable user space application release/reacquire of adapter resources for a given job on a node using information in a network resource table. The information in the network resource table is obtained when a user space application is loaded by some resource manager. The present invention provides a portable solution that will work for any interconnect where adapter resources need to be freed and reacquired without having to write a specific function in the device driver. In the present invention, the preemption request is done on a job basis using a key or “job key” that was previously loaded when the user space application or job originally requested the adapter resources. This is done for each OS instance where the job is run.
Abstract:
While an asynchronous memory move (AMM) operation is ongoing, a prefetch request for data from the source effective address or the destination effective address triggers cache injection by the AMM mover of relevant data from the stream of data being moved in the physical memory. The memory controller forwards the first prefetched line to the prefetch engine and L1 cache, the next cache lines in the sequence of data to the L2 cache, and a subsequent set of cache lines to the L3 cache. The memory controller then forwards the remaining data to the destination memory location. Quick access to prefetch data is enabled by buffering the stream of data in the upper caches rather than placing all the moved data within the memory. Also, the memory controller places moved data into only a subset of the available cache lines of the upper level cache.
Abstract:
This present invention provides a portable user space application release/reacquire of adapter resources for a given job on a node using information in a network resource table. The information in the network resource table is obtained when a user space application is loaded by some resource manager. The present invention provides a portable solution that will work for any interconnect where adapter resources need to be freed and reacquired without having to write a specific function in the device driver. In the present invention, the preemption request is done on a job basis using a key or “job key” that was previously loaded when the user space application or job originally requested the adapter resources. This is done for each OS instance where the job is run.
Abstract:
A method and a data processing system for completing checkpoint processing of a distributed job with local tasks communicating with other remote tasks via a host fabric interface (HFI) and assigned HFI window. Each HFI window has a send count and a receive count, which tracks GSM messages that are sent from and received at the HFI window. When a checkpoint is initiated by a master task, each local task forwards the send count and the receive count to the master task. The master task sums the respective counts and then compares the totals to each other. When the send count total is equal to the receive count total, the tasks are permitted to continue processing. However, when the send count total is not equal to the receive count total, the master task notifies each task of the job to rollback to a previous checkpoint or kill the job execution.
Abstract:
A host fabric interface (HFI) enables debugging of global shared memory (GSM) operations received at a local node from a network fabric. The local node has a memory management unit (MMU), which provides an effective address to real address (EA-to-RA) translation table that is utilized by the HFI to evaluate when EAs of GSM operations/data from a received GSM packet is memory-mapped to RAs of the local memory. The HFI retrieves the EA associated with a GSM operation/data within a received GSM packet. The HFI forwards the EA to the MMU, which determines when the EA is mapped to RAs within the local memory for the local task. The HFI processing logic enables processing of the GSM packet only when the EA of the GSM operation/data within the GSM packet is an EA that has a local RA translation. Non-matching EAs result in an error condition that requires debugging.
Abstract:
A method is provided for transferring data between first and second nodes of a network. Such method includes requesting first data to be transferred by a first upper layer protocol (ULP) operating on the first node of the network; and buffering second data for transfer to the second node by a lower protocol layer lower than the first ULP, the second data including an integral number of standard size units of data including the first data. The method further includes posting the second data to the network for delivery to the second node; receiving the second data at the second node; and from the received data, delivering the first data to a second ULP operating on the second node. The method is of particular application when transferring the data in unit size is faster than transferring the data in other than unit size.
Abstract:
A system and method to provide injection of important data directly into a processor's cache location when that processor has previously indicated interest in the data. The memory subsystem at a target processor will determine if the memory address of data to be written to a memory location associated with the target processor is found in a processor cache of the target processor. If it is determined that the memory address is found in a target processor's cache, the data will be directly written to that cache at the same time that the data is being provided to a location in main memory.