Abstract:
Systems and methods for providing dynamic discovery of a multicast group's multicast local identifier (MLID) from received multicast messages for a relevant multicast global identifier (MGID) in a high performance computing environment. By allowing InfiniBand (IB) clients to associate local queue pairs (QPs) with the MGID(s) of relevant multicast group(s) without requiring any join request to the subnet manager (SM)/subnet administration (SA), it is possible to receive relevant multicast (MC) messages without imposing the SM/SA overhead of a conventional multicast group join request. After receiving, at an end-node of the subnet, a multicast packet including a multicast global identifier and a multicast local identifier, the end-node can inspect the multicast packet to learn the multicast local identifier and include the learned multicast local identifier in the multicast group record at the end-node for the received multicast global identifier.
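The learning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (`MulticastPacket`, `GroupRecord`, `EndNode`), the string form of the MGID, and the policy of keeping the first MLID seen are all hypothetical and are not taken from the abstract or from any InfiniBand API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MulticastPacket:
    """Hypothetical view of a received multicast packet."""
    mgid: str            # multicast global identifier (illustrative string form)
    mlid: int            # multicast local identifier carried by the packet
    payload: bytes = b""


@dataclass
class GroupRecord:
    """Hypothetical local multicast group record kept at the end-node."""
    mgid: str
    mlid: Optional[int] = None   # unknown until learned from a packet


class EndNode:
    def __init__(self) -> None:
        self.records: Dict[str, GroupRecord] = {}

    def associate(self, mgid: str) -> None:
        """Register local interest in an MGID without any SM/SA join request."""
        self.records.setdefault(mgid, GroupRecord(mgid))

    def receive(self, pkt: MulticastPacket) -> bool:
        """Inspect a packet; learn its MLID if the MGID is relevant.

        Returns True if the packet belongs to a locally relevant group.
        """
        record = self.records.get(pkt.mgid)
        if record is None:
            return False                 # MGID not relevant to this end-node
        if record.mlid is None:
            record.mlid = pkt.mlid       # assumption: keep the first MLID seen
        return True
```

For example, after `node.associate("ff12::1")` the record has no MLID; once a packet for that MGID arrives with MLID `0xC001`, the record holds `0xC001` and no join request was issued.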
Abstract:
Systems and methods for providing multicast group (MCG) membership relative to partition membership in a high performance computing environment. In accordance with an embodiment, by allowing a subnet manager of a local subnet to be instructed that all ports that are members of the relevant partition should be set up as members for a specific multicast group, the SM can perform a more efficient multicast-routing process. It is also possible to limit the IB client interaction with subnet administration conventionally required to handle join and leave operations. Additionally, subnet manager overhead can be reduced by creating a spanning tree for the routing of multicast packets that includes each of the partition members added to the multicast group, instead of creating a spanning tree after each multicast group join request is received, as conventionally required.
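As a rough illustration of building one spanning tree for all partition members at once, the Python sketch below runs a single breadth-first search over the fabric and keeps only the links needed to reach the members. The function name, the adjacency-dictionary representation, and the choice of breadth-first search are assumptions made for this example; the abstract does not specify how the subnet manager builds the tree.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple


def multicast_tree_for_partition(
    adjacency: Dict[str, List[str]],
    root: str,
    partition_members: Iterable[str],
) -> Set[Tuple[str, str]]:
    """Build one multicast tree covering every partition member.

    adjacency maps each node (switch or port) to its neighbours.
    Returns the set of directed (parent, child) links in the tree.
    """
    # Single BFS from the root: computed once for the whole partition,
    # rather than once per join request.
    parent: Dict[str, Optional[str]] = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)

    # Keep only the links on the paths from the root to each member.
    links: Set[Tuple[str, str]] = set()
    for member in partition_members:
        if member not in parent:
            raise ValueError(f"{member} is not reachable from {root}")
        node = member
        while parent[node] is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links
```

With a root switch `s1`, leaf switches `s2` and `s3`, and members `p1` (on `s2`) and `p3` (on `s3`), the result contains exactly the four links from `s1` down to those two ports; a non-member port on `s2` is left out of the tree.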
Abstract:
System and method for supporting shared multicast local identifiers (MLIDs) in a high performance computing environment. In accordance with an embodiment, a shared MLID range can be configured such that each subnet within a fabric can utilize an MLID within the shared MLID range without the need to utilize a TCAM, or other memory, lookup of an MGID-to-MLID mapping.
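One way to picture the benefit is a forwarding decision that skips the mapping lookup whenever the MLID falls inside the shared range. The sketch below is an assumption-laden illustration: the range bounds, the function name, and the dictionary standing in for a TCAM or other lookup memory are all hypothetical and are not specified in the abstract.

```python
from typing import Dict

# Hypothetical shared range, chosen only for illustration.
SHARED_MLID_RANGE = range(0xC000, 0xC100)


def resolve_egress_mlid(
    ingress_mlid: int,
    mgid: str,
    mgid_to_mlid: Dict[str, int],
) -> int:
    """Pick the MLID to use when forwarding into another subnet.

    mgid_to_mlid stands in for a TCAM or other memory lookup table.
    """
    if ingress_mlid in SHARED_MLID_RANGE:
        # Shared MLID: the same value is valid in every subnet,
        # so no MGID-to-MLID lookup is needed.
        return ingress_mlid
    # Subnet-local MLID: fall back to the mapping lookup.
    return mgid_to_mlid[mgid]
```

A packet arriving with MLID `0xC010` is forwarded with the same MLID and the table is never consulted; a packet with an MLID outside the shared range still needs the per-MGID mapping.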
Abstract:
Systems and methods are provided for supporting efficient reconfiguration of an interconnection network having a pre-existing routing. An exemplary method can provide a plurality of switches, a plurality of end nodes, and one or more subnet managers, including a master subnet manager. The method can calculate, via the master subnet manager, a first set of one or more leaf-switch to leaf-switch multipaths. The method can store this first set of one or more leaf-switch to leaf-switch multipaths at a metabase. The method can detect a reconfiguration triggering event, and call a new routing for the interconnection network. Finally, the method can reconfigure the network according to the new routing for the interconnection network.
Abstract:
A system and method can route traffic between distinct subnets in a network environment. A router that connects the distinct subnets, such as InfiniBand (IB) subnets, can receive a list of destinations that the router is responsible for routing one or more packets to. Furthermore, the router can obtain information, from one or more switches in at least one of the subnets, on which downward output ports of the router can be used for routing the one or more packets, and build a routing table based on the obtained information.
Abstract:
A system and method can support multi-homed routing in a network environment, which can be based on InfiniBand architecture using a fat-tree or a similar topology. The system can provide an end node that is associated with a switch port on a leaf switch in a network fabric. Then, the system can perform routing for each of a plurality of ports on the end node, and ensure that the plurality of ports on the end node take mutually independent paths.
Abstract:
A system and method can support discovering and routing in a fabric with a plurality of switches. The system allows one or more switches in the fabric to be tagged with a switch role. Then, a subnet manager in the fabric can detect the switch role that is associated with the one or more switches. Furthermore, a routing algorithm can be applied on the fabric based on the detected switch role associated with the one or more switches.
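The tag, detect, and apply sequence can be sketched as follows. Everything concrete here is an assumption for illustration: the role names `"root"` and `"leaf"`, the dictionary representation of a switch, and the returned algorithm labels are hypothetical and do not come from the abstract or from any subnet manager implementation.

```python
from typing import Dict, Iterable, List, Mapping


def detect_roles(switches: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Group switch names by their tagged role; untagged switches are skipped."""
    roles: Dict[str, List[str]] = {}
    for switch in switches:
        role = switch.get("role")
        if role:
            roles.setdefault(role, []).append(switch["name"])
    return roles


def select_routing(roles: Mapping[str, List[str]]) -> str:
    """Choose a routing algorithm label from the detected roles.

    Assumption: a topology-aware algorithm is used only when both root
    and leaf switches have been tagged; otherwise a generic one is used.
    """
    if roles.get("root") and roles.get("leaf"):
        return "fat-tree"
    return "default"
```

For a fabric where `sw1` is tagged `root`, `sw2` is tagged `leaf`, and `sw3` is untagged, `detect_roles` yields one root and one leaf, and `select_routing` returns `"fat-tree"`; with no tags at all it returns `"default"`.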