Abstract:
A system and method for supporting subnet management in a network environment are described. The system and method can be used in an engineered system for middleware and application execution, or a middleware machine environment. The system can associate a subnet administrator (SA) in a subnet with a plurality of SA proxies, each of which can receive a plurality of requests from one or more client nodes. The SA can handle the requests, which are forwarded from the SA proxies. Additionally, each client node can be assigned a dedicated queue pair (QP) number, so that there is no need for always sending an initial request to a pre-defined well-known QP number.
Abstract:
A system and method can implement highly available Internet Protocol (IP) based communication across multiple independent communication paths. The system can have different IP addresses associated with different interfaces and communication paths and can implement communication fail-over as part of the communication layers above the IP layer, e.g. at the application level. The system can provide a balance between an average fail-over time and implementation complexity, and can achieve simplicity and robustness while providing high communication performance.
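Application-level fail-over of this kind can be illustrated with a minimal Python sketch. It assumes the peer is reachable at several (IP address, port) pairs, one per independent path, and simply tries them in order. The function name `connect_with_failover`, the injectable `connect` parameter, and the example addresses are hypothetical and are not taken from the described system.

```python
import socket
from typing import Callable, Sequence, Tuple

Endpoint = Tuple[str, int]  # (IP address, port) reachable over one independent path


def connect_with_failover(
    endpoints: Sequence[Endpoint],
    timeout: float = 1.0,
    connect: Callable = socket.create_connection,
):
    """Try each path in order; return (endpoint, connection) for the first success.

    Fail-over happens here, above the IP layer: every endpoint has its own
    IP address, and the application decides which one to use.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return endpoint, connect(endpoint, timeout=timeout)
        except OSError as exc:  # refused, unreachable, timed out, ...
            errors.append((endpoint, exc))
    raise ConnectionError(f"all {len(endpoints)} paths failed: {errors}")
```

The per-attempt timeout is the main tuning knob in this sketch: a shorter value lowers the average fail-over time at the cost of more spurious fail-overs on a slow but healthy path.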
Abstract:
A system and method can support subnet management in a network environment, such as an engineered system for middleware and application execution or a middleware machine environment. The system can associate a subnet administrator (SA) in a subnet with one or more SA proxies. Furthermore, said one or more SA proxies can receive one or more requests from one or more client nodes. Then, said SA can handle said one or more requests, which are forwarded from said one or more SA proxies. Additionally, a dedicated queue pair (QP) number can be allocated for each client node, so that there is no need for always sending an initial request to a pre-defined well-known QP number.
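The relationship between client nodes, SA proxies, and the SA can be sketched as follows. This is a simplified in-memory model, not the described implementation: the class names, the starting QP number, and the string-based requests are all assumptions made for illustration.

```python
from itertools import count


class SubnetAdministrator:
    """Stand-in for the SA: handles requests forwarded by proxies."""

    def __init__(self):
        self.handled = []

    def handle(self, client, request):
        self.handled.append((client, request))
        return f"reply to {client}: {request}"


class SAProxy:
    """Hypothetical proxy that hands each client a dedicated QP number."""

    def __init__(self, sa, first_qp=100):  # first_qp is an arbitrary example value
        self._sa = sa
        self._next_qp = count(first_qp)
        self._qp_by_client = {}
        self._client_by_qp = {}

    def allocate_qp(self, client):
        """Return the client's dedicated QP number, allocating it on first use."""
        if client not in self._qp_by_client:
            qp = next(self._next_qp)
            self._qp_by_client[client] = qp
            self._client_by_qp[qp] = client
        return self._qp_by_client[client]

    def request(self, qp, request):
        """Accept a request on a dedicated QP and forward it to the SA."""
        client = self._client_by_qp[qp]  # KeyError for an unknown QP number
        return self._sa.handle(client, request)
```

Because each client already holds its own QP number, later requests in this model go straight to that QP rather than starting at a shared well-known one.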
Abstract:
Systems and methods for InfiniBand fabric optimizations to minimize SA access and startup failover times. A system can comprise one or more microprocessors and a first subnet, the first subnet comprising a plurality of switches, a plurality of host channel adapters, a plurality of hosts, and a subnet manager, the subnet manager running on one of the plurality of switches and the plurality of host channel adapters. The subnet manager can be configured to determine that the plurality of hosts and the plurality of switches support a same set of capabilities. On such determination, the subnet manager can configure an SMA flag, the flag indicating that a condition can be set for each of the host channel adapter ports.
Abstract:
Systems and methods for providing explicit multicast local identifier assignment for per-partition default multicast local identifiers defined as subnet manager policy input in a high performance computing environment. In accordance with an embodiment, an explicit multicast local identifier (MLID) assignment policy can be provided (as, e.g., administrative input) that explicitly defines which MLIDs will be used for which partitions in a subnet. Further, an MLID assignment policy can also define which dedicated MLIDs will be associated with given multicast group identifiers (for example, partition independent MLIDs). By employing such an MLID assignment policy, a new or restarted master subnet manager can observe and verify the MLIDs used for existing partitions, instead of generating new MGID to MLID mappings. In this way, changes in MLID associations for any corresponding MGID can be avoided as a result of master SM restarts or failovers, or any subnet-merge operations.
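One way to picture the verification step is the Python sketch below. It assumes the policy and the observed subnet state are both available as simple mappings from partition key to MLID. The function name `reconcile_mlids` and the example values are hypothetical.

```python
from typing import Dict, List, Tuple


def reconcile_mlids(
    policy: Dict[int, int], observed: Dict[int, int]
) -> Tuple[Dict[int, int], List[Tuple[int, int, int]]]:
    """Check observed per-partition MLIDs against the explicit policy.

    Both arguments map a partition key to an MLID. The policy stays
    authoritative, so the result does not depend on which subnet manager
    instance computes it. Returns (assignments, conflicts), where each
    conflict is (partition key, observed MLID, policy MLID).
    """
    assignments = {}
    conflicts = []
    for pkey, wanted in policy.items():
        seen = observed.get(pkey)
        if seen is not None and seen != wanted:
            conflicts.append((pkey, seen, wanted))
        assignments[pkey] = wanted  # never generate a fresh mapping
    return assignments, conflicts


# Illustrative values only: two partitions, one of which drifted from policy.
policy = {0x8001: 0xC001, 0x8002: 0xC002}
observed = {0x8001: 0xC001, 0x8002: 0xC0FF}
assignments, conflicts = reconcile_mlids(policy, observed)
```

Since the mapping comes from policy input rather than from allocation order, a restarted or newly elected master in this model arrives at the same MLIDs as its predecessor.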
Abstract:
Systems and methods for supporting dual-port virtual router in a high performance computing environment. In accordance with an embodiment, a dual port router abstraction can provide a simple way for enabling subnet-to-subnet router functionality to be defined based on a switch hardware implementation. A virtual dual-port router can logically be connected outside a corresponding switch port. This virtual dual-port router can provide an InfiniBand specification compliant view to a standard management entity, such as a Subnet Manager. In accordance with an embodiment, a dual-ported router model implies that different subnets can be connected in a way where each subnet fully controls the forwarding of packets as well as address mappings in the ingress path to the subnet.
Abstract:
Systems and methods to use all incoming multicast (MC) packets as a basis for global unique identifier (GUID) to local identifier (LID) cache contents in a high performance computing environment, in accordance with an embodiment. Since all multicast packets have a Global Route Header (GRH), there is always both a source GID and a source LID defined for an incoming multicast packet. This implies that it is, in general, possible for an HCA implementation to gather information about GID and GUID to LID mappings for any sender node based on all incoming MC packets.
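The cache population described above can be sketched in a few lines of Python. The sketch assumes a received packet exposes its source GID as a 128-bit integer and its source LID as an integer, and it takes the GUID to be the low 64 bits of the GID. The class and field names are hypothetical.

```python
from dataclasses import dataclass

GUID_MASK = (1 << 64) - 1  # low 64 bits of a 128-bit GID hold the GUID


@dataclass(frozen=True)
class McPacket:
    source_gid: int  # from the Global Route Header
    source_lid: int  # from the local route header


class LidCache:
    """Hypothetical per-HCA cache filled from incoming multicast packets."""

    def __init__(self):
        self.by_gid = {}
        self.by_guid = {}

    def learn(self, pkt: McPacket) -> None:
        """Record the sender's GID -> LID and GUID -> LID mappings."""
        self.by_gid[pkt.source_gid] = pkt.source_lid
        self.by_guid[pkt.source_gid & GUID_MASK] = pkt.source_lid

    def lid_for_guid(self, guid: int):
        """Return the cached LID, or None if the sender has not been seen."""
        return self.by_guid.get(guid)


# Illustrative sender: default subnet prefix plus an example GUID.
gid = (0xFE80000000000000 << 64) | 0x0002C90300A1B2C3
cache = LidCache()
cache.learn(McPacket(source_gid=gid, source_lid=7))
```

A later packet from the same sender simply overwrites the entry, so in this sketch the cache follows a LID change as soon as the sender transmits again.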
Abstract:
System and method for supporting scalable representation of switch port status in a high performance computing environment. In accordance with an embodiment, a scalable representation of switch port status can be provided at each switch (both physical and virtual). Instead of requiring all switch port changes to be retrieved individually, the scalable representation can cover many ports at once, and can scale by using just a few bits of information for each port's status.
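A compact per-port encoding of this kind could look like the sketch below, which packs two bits of status per port into a single integer. The two-bit width, the status codes, and the class name are assumptions chosen for illustration. The abstract only states that a few bits per port are used.

```python
BITS_PER_PORT = 2  # assumed width; the source only says "a few bits"
DOWN, UP, CHANGED = 0, 1, 2  # hypothetical status codes


class PortStatusBitmap:
    """Packs the status of every port on a switch into one integer."""

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self._bits = 0

    def _shift(self, port: int) -> int:
        if not 0 <= port < self.num_ports:
            raise IndexError(f"port {port} out of range")
        return port * BITS_PER_PORT

    def set(self, port: int, status: int) -> None:
        if not 0 <= status < (1 << BITS_PER_PORT):
            raise ValueError(f"status {status} does not fit in {BITS_PER_PORT} bits")
        shift = self._shift(port)
        mask = ((1 << BITS_PER_PORT) - 1) << shift
        self._bits = (self._bits & ~mask) | (status << shift)

    def get(self, port: int) -> int:
        return (self._bits >> self._shift(port)) & ((1 << BITS_PER_PORT) - 1)

    def to_bytes(self) -> bytes:
        """Serialize the whole switch's port status for a single query."""
        size = (self.num_ports * BITS_PER_PORT + 7) // 8
        return self._bits.to_bytes(size, "little")
```

With these assumptions, the status of a 36-port switch fits in 9 bytes, so one query can return every port's state at once.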
Abstract:
Systems and methods for providing multicast group multicast local identifier (MLID) dynamic discovery on received multicast messages for a relevant multicast global identifier (MGID) in a high performance computing environment. By allowing InfiniBand (IB) clients to associate local queue pairs (QPs) with the MGID(s) of relevant multicast group(s) without requiring any join request to the subnet manager (SM)/subnet administration (SA), it is possible to receive relevant multicast (MC) messages without imposing the SM/SA overhead of a conventional multicast group join request. After receiving, at an end-node of the subnet, a multicast packet including a multicast global identifier and a multicast local identifier, the end-node can inspect the multicast packet to learn the multicast local identifier and include the learned multicast local identifier in the multicast group record at the end-node for the received multicast global identifier.
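The learning step at the end-node can be modeled with a small Python sketch. It assumes the end-node keeps one record per MGID holding the attached local QPs and the last MLID seen. The class name, method names, and string-valued MGIDs are hypothetical.

```python
class MulticastReceiver:
    """Hypothetical end-node state: one multicast group record per MGID."""

    def __init__(self):
        self.records = {}  # MGID -> {"qps": set of local QPs, "mlid": int or None}

    def attach(self, mgid: str, qp: int) -> None:
        """Associate a local QP with an MGID without any SM/SA join request."""
        record = self.records.setdefault(mgid, {"qps": set(), "mlid": None})
        record["qps"].add(qp)

    def on_packet(self, mgid: str, mlid: int) -> set:
        """Inspect a received multicast packet and learn its MLID.

        Returns the local QPs that should receive the packet, or an empty
        set if no local QP is attached to this MGID.
        """
        record = self.records.get(mgid)
        if record is None:
            return set()
        record["mlid"] = mlid  # store the learned MLID in the group record
        return set(record["qps"])


# Illustrative values only.
rx = MulticastReceiver()
rx.attach("ff12:401b::1", qp=42)
delivered = rx.on_packet("ff12:401b::1", mlid=0xC005)
```

The MLID is unknown until the first packet for the group arrives, which is the cost this model pays for skipping the join request.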
Abstract:
System and method for supporting scalable representation of link stability and availability in a high performance computing environment. A method can provide an attribute at each node in a subnet, wherein the attribute provides a single location at each node for a subnet manager to query the stability and availability of each link connected to the queried node. The attribute can be populated and maintained by a subnet management agent residing at the node.
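As a rough illustration, the attribute can be modeled as a per-node table that the subnet management agent updates on link events and that the subnet manager reads in one query. The sketch below makes several assumptions: availability is modeled as the current up or down state, stability as a count of up-to-down transitions, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkStatus:
    up: bool = False  # availability: is the link currently up?
    flaps: int = 0    # stability: number of up -> down transitions seen


class SubnetManagementAgent:
    """Hypothetical agent that maintains one attribute covering all local links."""

    def __init__(self, ports):
        self._links = {port: LinkStatus() for port in ports}

    def link_event(self, port: int, up: bool) -> None:
        """Update the attribute when a local link changes state."""
        link = self._links[port]
        if link.up and not up:
            link.flaps += 1
        link.up = up

    def get_link_attribute(self) -> dict:
        """Single query point: port -> (up, flaps) for every link on this node."""
        return {port: (link.up, link.flaps) for port, link in self._links.items()}


# Illustrative sequence: port 1 comes up, drops, and recovers; port 2 stays down.
sma = SubnetManagementAgent(ports=[1, 2])
for state in (True, False, True):
    sma.link_event(1, state)
attribute = sma.get_link_attribute()
```

In this model the subnet manager issues one query per node and gets the state of every link at once.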