Abstract:
Systems and methods for supporting coordinated link-up handling following a switch reset in a high performance computing environment. Systems and methods can ensure that when a switch of a fabric is rebooted, HCA ports connected to that switch are set to the Active state at the same time, even though link training times for different ports may vary by up to several seconds.
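As a rough illustration of the coordination described here, the sketch below holds back activation until every expected port has finished link training. The function name and the set-based bookkeeping are assumptions made for illustration; the abstract does not say how the coordination is implemented.

```python
def ports_ready_to_activate(expected_ports, trained_ports):
    """Return the ports to set Active together, or an empty set to keep waiting.

    Hypothetical helper: activation is released only once every expected
    port has completed link training, so all ports go Active at once.
    """
    expected = set(expected_ports)
    if expected <= set(trained_ports):
        return expected
    return set()
```

A caller would poll this after each training-complete event and activate the returned ports in one step.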
Abstract:
System and method for supporting scalable representation of switch port status in a high performance computing environment. In accordance with an embodiment, a scalable representation of switch port status can be provided at each switch (both physical and virtual). Instead of retrieving each switch port change individually, the representation combines the status of a number of ports and scales by using just a few bits of information for each port's status.
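A minimal sketch of such a representation, assuming a hypothetical 2-bit encoding per port (the abstract says only "a few bits"), packs the status of many ports into a single integer. The names `PortStatus`, `pack_status`, `unpack_status`, and `changed_ports`, and the status codes themselves, are illustrative and not taken from any InfiniBand interface.

```python
from enum import IntEnum

BITS_PER_PORT = 2  # assumption: the source only says "a few bits"
PORT_MASK = (1 << BITS_PER_PORT) - 1


class PortStatus(IntEnum):
    """Hypothetical status codes; the source does not define an encoding."""
    DOWN = 0
    TRAINING = 1
    ACTIVE = 2
    ERROR = 3


def pack_status(statuses):
    """Pack per-port statuses into one integer, port 0 in the lowest bits."""
    packed = 0
    for port, status in enumerate(statuses):
        packed |= int(status) << (port * BITS_PER_PORT)
    return packed


def unpack_status(packed, num_ports):
    """Recover the per-port statuses from a packed integer."""
    return [PortStatus((packed >> (p * BITS_PER_PORT)) & PORT_MASK)
            for p in range(num_ports)]


def changed_ports(old, new, num_ports):
    """Return indices of ports whose status differs between two snapshots."""
    diff = old ^ new
    return [p for p in range(num_ports)
            if (diff >> (p * BITS_PER_PORT)) & PORT_MASK]
```

With this layout, one read of the packed value covers all ports, and an XOR of two snapshots identifies which ports changed.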
Abstract:
System and method for supporting a flexible framework for extendable SMA attributes in a high performance computing environment. In accordance with an embodiment, an information attribute can provide for enhancements in a number of areas. For example, in addition to indicating which version of an interface a queried node supports, the information attribute can additionally provide a mask indicating which vendor specific SMA attributes the node supports. In this way, a subnet manager can identify a version of an interface at each node in a subnet, as well as each node's SMA attribute capabilities. In turn, this allows nodes to run different versions of an interface within the same subnet, without introducing confusion.
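The version-plus-mask idea can be sketched as follows. The bit assignments, the `InfoAttribute` type, and the attribute names are hypothetical; the abstract states only that the attribute carries an interface version and a mask of supported vendor specific SMA attributes.

```python
from dataclasses import dataclass

# Hypothetical bit positions for vendor specific SMA attributes.
ATTR_PORT_STATUS_SUMMARY = 1 << 0
ATTR_EXTENDED_LINK_INFO = 1 << 1
ATTR_MLID_POLICY = 1 << 2


@dataclass(frozen=True)
class InfoAttribute:
    """Illustrative reply to a capability query on one node."""
    interface_version: int
    supported_attrs: int  # bitmask of the ATTR_* flags above

    def supports(self, attr_bit):
        return bool(self.supported_attrs & attr_bit)


def nodes_supporting(nodes, attr_bit):
    """Return the sorted names of nodes whose mask includes attr_bit."""
    return sorted(name for name, info in nodes.items()
                  if info.supports(attr_bit))
```

A subnet manager holding one such record per node can decide per node which attributes to use, which is what lets mixed interface versions coexist in one subnet.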
Abstract:
Systems and methods for providing explicit multicast local identifier assignment for per-partition default multicast local identifiers defined as subnet manager policy input in a high performance computing environment. In accordance with an embodiment, an explicit multicast local identifier (MLID) assignment policy can be provided (as, e.g., administrative input) that explicitly defines which MLIDs will be used for which partitions in a subnet. Further, an MLID assignment policy can also define which dedicated MLIDs will be associated with given multicast group identifiers (for example, partition independent MLIDs). By employing such an MLID assignment policy, a new or restarted master subnet manager can observe and verify the MLIDs used for existing partitions, instead of generating new MGID to MLID mappings. In this way, changes in MLID associations for any corresponding MGID can be avoided as a result of master SM restarts or failovers, or any subnet-merge operations.
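A sketch of how a restarted subnet manager might check observed MLIDs against an explicit policy, rather than allocating new ones, is shown below. The dictionary layout (partition key to MLID) and the function names are assumptions for illustration; the range constants are the InfiniBand multicast LID range.

```python
MLID_MIN, MLID_MAX = 0xC000, 0xFFFE  # InfiniBand multicast LID range


def validate_policy(policy):
    """Raise ValueError if an MLID is out of range or assigned twice.

    `policy` is a hypothetical mapping of partition key -> MLID.
    """
    owner = {}
    for pkey, mlid in policy.items():
        if not MLID_MIN <= mlid <= MLID_MAX:
            raise ValueError(f"MLID {mlid:#x} for partition {pkey:#x} is out of range")
        if mlid in owner:
            raise ValueError(f"MLID {mlid:#x} is assigned to more than one partition")
        owner[mlid] = pkey


def verify_observed(policy, observed):
    """Return {pkey: (expected, found)} for partitions that deviate from policy.

    `found` is None when no MLID was observed for that partition.
    """
    return {pkey: (expected, observed.get(pkey))
            for pkey, expected in policy.items()
            if observed.get(pkey) != expected}
```

An empty result from `verify_observed` means the existing associations can be kept unchanged across the restart.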
Abstract:
Systems and methods for providing dual multicast local identifiers (MLIDs) per multicast group to facilitate both full and limited partition members in a high performance computing environment. In accordance with an embodiment, in order to avoid the need for special handling of P_Key access violations, as well as to ensure complete isolation between limited partition members in terms of multicast traffic, two MLIDs can be allocated to a single multicast group (MCG). A first MLID can be allocated and used by end-ports for sending from full partition members to both full and limited partition members. Additionally, a second MLID can be allocated and used by end-ports for sending from limited partition members to full partition members. Using this scheme, a limited partition member can avoid sending multicast packets to other limited partition members in the MCG.
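The two-MLID scheme can be modelled in a few lines. This is a sketch under stated assumptions: the `MulticastGroup` type, the function names, and the membership dictionary are hypothetical, and real delivery is decided by switch forwarding tables rather than by a lookup like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MulticastGroup:
    full_mlid: int     # used by full members; routed to full and limited members
    limited_mlid: int  # used by limited members; routed to full members only


def mlid_for_sender(group, sender_is_full):
    """Pick the destination MLID based on the sender's membership type."""
    return group.full_mlid if sender_is_full else group.limited_mlid


def members_reached(group, mlid, members):
    """Return the names of members that a given MLID is routed to.

    `members` maps a member name to True (full) or False (limited).
    """
    if mlid == group.full_mlid:
        return set(members)
    if mlid == group.limited_mlid:
        return {name for name, is_full in members.items() if is_full}
    return set()
```

Because the second MLID is never routed to limited members, a limited sender cannot reach another limited member, and no P_Key check has to reject such packets at the receiver.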
Abstract:
A system and method can support subnet management in a network environment, such as an engineered system for middleware and application execution or a middleware machine environment. A subnet manager (SM) can retrieve information for setting up a reliable connection (RC) between a subnet administrator (SA) and a client node in a subnet. Furthermore, the system can set up one or more connection states for a port associated with the SM node to establish the RC connection between the port associated with the SM node and a port associated with said client node. Then, the SM can activate the port associated with said client node.
Abstract:
Systems and methods for using queue pair 1 (QP1) for receiving multicast based announcements in multiple partitions in a high performance computing environment. In accordance with an embodiment, by extending the scope of QP1 to also include receiving and sending multicast packets in any partition defined for the port, it is possible to implement generic MC based announcement and discovery without requiring the complexity of unique QPs for individual partitions, nor any update of QP configuration as a consequence of change of partition membership.
Abstract:
Systems and methods for path record handling in a fabric without host stack cooperation in a high performance computing environment. In a case where the subnet manager has determined “homogenous subnet/fabric” or “semi-homogenous subnet/fabric” status for the current topology, but is still receiving path queries, the subnet manager can use the relevant status to avoid any route evaluation and generate a path record either based only on the configuration status of the requesting port in the homogenous case, or by comparing the configuration status of both ports in the semi-homogenous case.
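The two shortcuts can be sketched as below. The `PortConfig` fields and the choice of taking the minimum of each parameter in the semi-homogenous case are assumptions; the abstract says only that the two ports' configuration status is compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical subset of a port's configuration status."""
    mtu: int
    rate_gbps: int


def path_params(fabric_status, requester, target):
    """Derive path parameters without evaluating the route.

    "homogenous": the requesting port's own configuration is sufficient.
    "semi-homogenous": compare both ports (assumed here: take the minimum).
    Anything else needs full route evaluation, which this sketch omits.
    """
    if fabric_status == "homogenous":
        return requester
    if fabric_status == "semi-homogenous":
        return PortConfig(mtu=min(requester.mtu, target.mtu),
                          rate_gbps=min(requester.rate_gbps, target.rate_gbps))
    raise ValueError("route evaluation required for this fabric status")
```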
Abstract:
Systems and methods for supporting dual-port virtual router in a high performance computing environment. In accordance with an embodiment, a dual port router abstraction can provide a simple way for enabling subnet-to-subnet router functionality to be defined based on a switch hardware implementation. A virtual dual-port router can logically be connected outside a corresponding switch port. This virtual dual-port router can provide an InfiniBand specification compliant view to a standard management entity, such as a Subnet Manager. In accordance with an embodiment, a dual-ported router model implies that different subnets can be connected in a way where each subnet fully controls the forwarding of packets as well as address mappings in the ingress path to the subnet.
Abstract:
Systems and methods for InfiniBand fabric optimizations to minimize SA access and startup failover times. A system can comprise one or more microprocessors, a first subnet, the first subnet comprising a plurality of switches, a plurality of host channel adapters, a plurality of hosts, and a subnet manager, the subnet manager running on one of the plurality of switches and the plurality of host channel adapters. The subnet manager can be configured to determine that the plurality of hosts and the plurality of switches support a same set of capabilities. On such determination, the subnet manager can configure an SMA flag, the flag indicating that a condition can be set for each of the host channel adapter ports.