Abstract:
System and method for supporting node role attributes in a high performance computing environment. In accordance with an embodiment, a node role attribute can comprise a vendor defined subnet management attribute. When a subnet manager attempts to discover a high performance computing environment, such as an InfiniBand subnet, or a switch topology, identifying the topology is quite complex when the subnet manager can only observe connectivity, without the context behind that connectivity (the roles of the different nodes). However, when a subnet has a node role attribute enabled, the subnet manager can map the interconnect more effectively: during the initial sweep it can discover not only the connectivity but also the role of each discovered node, thus leading to a more efficient interconnect discovery.
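As a rough illustration of a sweep that records roles alongside connectivity, the following minimal sketch walks a toy topology breadth-first. The `links` and `roles` dictionaries, the role names, and the `sweep` function are all hypothetical stand-ins; the abstract does not specify how the attribute is encoded or queried.

```python
from collections import deque

def sweep(start, links, roles):
    """Breadth-first sweep that records each discovered node's role.

    links: {node: [neighbour, ...]}  (hypothetical connectivity map)
    roles: {node: role}              (hypothetical node role attribute)
    Nodes that do not report a role are recorded as "unknown".
    """
    discovered = {start: roles.get(start, "unknown")}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in links.get(node, ()):
            if neighbour not in discovered:
                discovered[neighbour] = roles.get(neighbour, "unknown")
                queue.append(neighbour)
    return discovered

# Toy topology: one spine switch, two leaf switches, two hosts.
links = {
    "spine": ["leaf1", "leaf2"],
    "leaf1": ["spine", "hostA"],
    "leaf2": ["spine", "hostB"],
    "hostA": ["leaf1"],
    "hostB": ["leaf2"],
}
roles = {"spine": "spine-switch", "leaf1": "leaf-switch",
         "leaf2": "leaf-switch", "hostA": "compute-node"}

topology = sweep("spine", links, roles)
```

With roles attached to each node, a later classification pass does not need to infer which switches are leaves purely from link counts.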
Abstract:
Systems and methods are provided for supporting efficient reconfiguration of an interconnection network having a pre-existing routing. An exemplary method can provide a plurality of switches, a plurality of end nodes, and one or more subnet managers, including a master subnet manager. The method can calculate, via the master subnet manager, a first set of one or more leaf-switch to leaf-switch multipaths. The method can store this first set of one or more leaf-switch to leaf-switch multipaths at a metabase. The method can detect a reconfiguration triggering event, and call a new routing for the interconnection network. Finally, the method can reconfigure the network according to the new routing for the interconnection network.
Abstract:
Systems and methods for supporting unique multicast forwarding across multiple connected subnets in a high performance computing environment. In accordance with an embodiment, by enforcing that incoming (i.e., incoming on a router port of a subnet) multicast packets have SGIDs (source global identifiers) that correspond to a restricted set of source subnet numbers when entering the ingress router ports to a local subnet, it is possible to ensure that multicast packets sent from one subnet are never returned to the same subnet through a different set of connected router ports (i.e., avoid looping multicast packets).
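The ingress check described above can be sketched as a filter on the subnet prefix carried in a packet's SGID. This is only an illustrative sketch: it assumes the standard InfiniBand GID layout (upper 64 bits are the subnet prefix), and the prefix values, the `allowed` set, and the function names are hypothetical.

```python
from __future__ import annotations
from ipaddress import IPv6Address

def subnet_prefix(sgid: str) -> int:
    """Return the upper 64 bits (subnet prefix) of a 128-bit GID written in IPv6 notation."""
    return int(IPv6Address(sgid)) >> 64

def accept_multicast(sgid: str, allowed_source_prefixes: set[int]) -> bool:
    """Accept an incoming multicast packet only if its source subnet is in the restricted set."""
    return subnet_prefix(sgid) in allowed_source_prefixes

# Hypothetical subnet prefixes for illustration only.
LOCAL_SUBNET = 0xFEC0000000000001
REMOTE_SUBNET = 0xFEC0000000000002

# The ingress router port admits traffic from the remote subnet, never from the local one,
# so a packet that originated locally cannot re-enter through this port.
allowed = {REMOTE_SUBNET}

from_remote = accept_multicast("fec0:0:0:2::1", allowed)
looped_back = accept_multicast("fec0:0:0:1::1", allowed)
```

Here `from_remote` is true and `looped_back` is false, which is the loop-avoidance property the abstract describes.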
Abstract:
Systems and methods to provide a default multicast group (MCG) proxy for scalable forwarding of announcements and information request intercepting in a high performance computing environment. In accordance with an embodiment, in order to scale the protocols to cover an arbitrary number of nodes, a hierarchical scheme can be introduced in which the total system is divided into multiple domains, each represented by an MCG Proxy instance for the relevant protocols.
Abstract:
Systems and methods for supporting a dual-port virtual router in a high performance computing environment. In accordance with an embodiment, a dual-port router abstraction can provide a simple way for enabling subnet-to-subnet router functionality to be defined based on a switch hardware implementation. A virtual dual-port router can logically be connected outside a corresponding switch port. This virtual dual-port router can provide an InfiniBand specification compliant view to a standard management entity, such as a Subnet Manager. In accordance with an embodiment, a dual-port router model implies that different subnets can be connected in a way where each subnet fully controls the forwarding of packets as well as address mappings in the ingress path to the subnet.
Abstract:
A system and method for supporting load balancing in a multi-tenant cluster environment, in accordance with an embodiment. One or more tenants can be supported, each associated with a partition, which is in turn associated with one or more end nodes. The method can provide a plurality of switches, the plurality of switches comprising a plurality of leaf switches and at least one switch at another level, wherein each of the plurality of switches comprises at least one port. The method can assign each node a weight parameter, and based upon this parameter, the method can route the plurality of end nodes within the multi-tenant cluster environment, wherein the routing attempts to preserve partition isolation.
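One way to picture weight-based routing that also tries to keep partitions apart is a greedy assignment of end nodes to a leaf switch's upward ports. The sketch below is an assumption-laden illustration, not the claimed method: the `assign_up_ports` function, the tuple layout, and the heaviest-first greedy rule are all hypothetical.

```python
def assign_up_ports(nodes, num_ports):
    """Greedily map end nodes to upward ports of one leaf switch.

    nodes: list of (name, partition, weight) tuples (hypothetical layout).
    Heavier nodes are placed first. A port that so far carries only the
    node's own partition (or nothing) is preferred; if none exists, the
    least-loaded port overall is used, so isolation is best-effort.
    Returns {name: port_index}.
    """
    load = [0] * num_ports
    owners = [set() for _ in range(num_ports)]
    assignment = {}
    for name, partition, weight in sorted(nodes, key=lambda n: -n[2]):
        isolated = [p for p in range(num_ports) if owners[p] <= {partition}]
        candidates = isolated or list(range(num_ports))
        port = min(candidates, key=lambda p: (load[p], p))
        load[port] += weight
        owners[port].add(partition)
        assignment[name] = port
    return assignment

nodes = [("a", "P1", 4), ("b", "P1", 1), ("c", "P2", 3), ("d", "P2", 1)]
ports = assign_up_ports(nodes, num_ports=2)
```

In this toy run the two partitions end up on separate ports with accumulated weights of 5 and 4; with more partitions than ports, the fallback branch would share a port rather than fail.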
Abstract:
System and method for supporting proxy based multicast forwarding in a high performance computing environment. In accordance with an embodiment, a proxy based multicast forwarding system and method can be utilized. A proxy, either software, firmware, or hardware based, can be initialized and run within a local subnet domain, wherein the proxy is a member of at least one multicast group (MCG). The proxy can be configured to forward packets to other subnet domains in several different ways.
Abstract:
A system and method can alleviate congestion in a middleware machine environment with a plurality of switches in a fat-tree topology. The middleware machine environment can support a plurality of end nodes and allow for generating a virtual lane assignment for every pair of source end node and destination end node. Then, the packet flows from a source end node to different destination end nodes sharing a physical link can be distributed across different virtual lanes in order to avoid head-of-line (HOL) blocking.
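A per-pair virtual lane table of the kind described can be sketched with a simple round-robin rule. This is a minimal sketch under stated assumptions: it only considers the link leaving each source (which all of that source's flows share), and the `assign_virtual_lanes` function and node names are hypothetical; a real fat-tree scheme would also account for links shared deeper in the topology.

```python
def assign_virtual_lanes(end_nodes, num_vls):
    """Build a {(source, destination): virtual_lane} table.

    All flows from one source share that source's uplink, so its
    destinations are spread round-robin over the available lanes.
    """
    table = {}
    for src in end_nodes:
        destinations = [n for n in end_nodes if n != src]
        for index, dst in enumerate(destinations):
            table[(src, dst)] = index % num_vls
    return table

vl_table = assign_virtual_lanes(["n0", "n1", "n2", "n3"], num_vls=2)
```

With two lanes, traffic from `n0` to `n1` and from `n0` to `n2` lands on different lanes, so a congested flow toward one destination does not block the other at the head of the same queue.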