摘要:
Compiling software for a hierarchical distributed processing system including providing to one or more compiling nodes software to be compiled, wherein at least a portion of the software to be compiled is to be executed by one or more other nodes; compiling, by the compiling node, the software; maintaining, by the compiling node, any compiled software to be executed on the compiling node; selecting, by the compiling node, one or more nodes in a next tier of the hierarchy of the distributed processing system in dependence upon whether any compiled software is for the selected node or the selected node's descendants; sending to the selected node only the compiled software to be executed by the selected node or selected node's descendant.
摘要:
Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.
摘要:
Methods, apparatus, and products are disclosed for controlling data transfers from an origin compute node to a target compute node that include: receiving, by an application messaging module on the target compute node, an indication of a data transfer from an origin compute node to the target compute node; and administering, by the application messaging module on the target compute node, the data transfer using one or more messaging primitives of a system messaging module in dependence upon the indication.
摘要:
Methods, apparatus, and products are disclosed for pacing network traffic among a plurality of compute nodes connected using a data communications network. The network has a plurality of network regions, and the plurality of compute nodes are distributed among these network regions. Pacing network traffic among a plurality of compute nodes connected using a data communications network includes: identifying, by a compute node for each region of the network, a roundtrip time delay for communicating with at least one of the compute nodes in that region; determining, by the compute node for each region, a pacing algorithm for that region in dependence upon the roundtrip time delay for that region; and transmitting, by the compute node, network packets to at least one of the compute nodes in at least one of the network regions in dependence upon the pacing algorithm for that region.
摘要:
Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
摘要:
Data communications, including issuing, by an application program to a high level data communications library, a request for initialization of a data communications service; issuing to a low level data communications library a request for registration of data communications functions; registering the data communications functions, including instantiating a factory object for each of the one or more data communications functions; issuing by the application program an instruction to execute a designated data communications function; issuing, to the low level data communications library, an instruction to execute the designated data communications function, including passing to the low level data communications library a call parameter that identifies a factory object; creating with the identified factory object the data communications object that implements the data communications function according to the protocol; and executing by the low level data communications library the designated data communications function.
摘要:
Methods, apparatus, and products are disclosed for providing policy-based operating system services in an operating system on a computing system. The computing system includes at least one compute node. The compute node includes an operating system that includes a kernel and a plurality of operating system services of a service type. Providing policy-based operating system services in an operating system on a computing system includes establishing, on the compute node, a kernel policy specifying one of the operating system services of the service type for use in the operating system, and accessing, by the kernel, the specified operating system service. The computing system may also be implemented as a distributed computing system that includes one or more operating system service nodes. One or more of the operating system services may be distributed among the operating system service nodes.
摘要:
The present invention provides a system and method for extracting elements from distributed arrays on a parallel processing system. The system includes a module that populates a result array with globally largest elements from input arrays, a module that generates a partition element, a module that counts the number of local elements greater than the partition element, and a module that determines the globally largest elements. The method for extracting elements from distributed arrays on a parallel processing system includes populating a result array with globally largest elements from input arrays, generating a partition element, counting the number of local elements greater than the partition and determining the globally largest elements.
摘要:
Methods, compute nodes, and computer program products are provided for heuristic status polling of a component in a computing system. Embodiments include receiving, by a polling module from a requesting application, a status request requesting status of a component; determining, by the polling module, whether an activity history for the component satisfies heuristic polling criteria; polling, by the polling module, the component for status if the activity history for the component satisfies the heuristic polling criteria; and not polling, by the polling module, the component for status if the activity history for the component does not satisfy the heuristic criteria.
摘要:
Methods, compute nodes, and computer program products are provided for heuristic status polling of a component in a computing system. Embodiments include receiving, by a polling module from a requesting application, a status request requesting status of a component; determining, by the polling module, whether an activity history for the component satisfies heuristic polling criteria; polling, by the polling module, the component for status if the activity history for the component satisfies the heuristic polling criteria; and not polling, by the polling module, the component for status if the activity history for the component does not satisfy the heuristic criteria.