Abstract:
An example method of diagnosing remote sites of a distributed container orchestration system includes: receiving, at a management cluster, definition of a test suite custom resource; deploying, in response to the test suite custom resource, a first pod in the management cluster; deploying, by the first pod, a second pod in a server of a first remote site of the remote sites; checking, by the second pod, configuration of the server that includes an additional pod executing alongside the second pod, at least one virtual machine (VM) in which the second pod and the additional pod execute, a hypervisor configured to support the at least one VM, and a hardware platform on which the hypervisor executes; and returning test data from the second pod to the first pod, the test data including results of the step of checking the configuration of the server.
Abstract:
The disclosure provides an approach for managing and diagnosing middleboxes in a cloud computing system. In one embodiment, a network operations center, that is located remote to a virtualized cloud computing system and communicates with the cloud computing system via a wide area network, controls network middleboxes in the cloud computing system through a secure routing module inside a gateway of the cloud computing system. The secure routing module is configured to receive, from an authenticated management application and via a secure communication channel, packets intended for managing network middleboxes. In turn, the secure routing module establishes secure communication channels with the target middleboxes, translates the identified packets to protocols and/or application programming interfaces (APIs) of the target middleboxes, and transmits the translated packets to the target middleboxes.
Abstract:
A centralized namespace controller allocates addresses in a distributed cloud infrastructure on-demand. Upon receiving a request to allocate addresses for a network to be provisioned by a cloud computing system included in the distributed cloud infrastructure, the centralized namespace controller allocates a network address that is unique within the distributed cloud infrastructure. Further, the centralized namespace controller allocates a range of virtual network interface cards (NIC) addresses that are unique within the network. The centralized namespace controller then allocates addresses from the range of virtual NIC addresses on an as-requested basis—when a virtual NIC is being created by the first cloud computing system on the network. Advantageously, by centralizing the allocation of addresses and dedicating independent NIC address ranges to different cloud computing systems, the centralized namespace controller enables stretched L2 networks between cloud computing systems while preventing duplicated addresses on the stretched networks.
Abstract:
An example method of optimizing connectivity between data centers in a hybrid cloud system having a first data center managed by a first organization and a second data center managed by a second organization, the first organization being a tenant in the second data center. The method includes probing a wide area network (WAN) with test packets by varying an internet protocol (IP) flow tuple of the test packets across a set of IP flows. The method includes identifying a plurality of paths between a gateway of the first data center and another gateway of the second data center associated with the set of IP flows. The method further includes selecting an IP flow from the set of IP flows for an application executing in the first data center. The method further includes establishing a path-optimized connection between the gateway and the other gateway through the WAN having the selected IP flow for use by the application.
Abstract:
An example method of diagnosing remote sites of a distributed container orchestration system includes: receiving, at a management cluster, definition of a test suite custom resource; detecting, by a test controller agent in a cluster of the remote sites, a diagnosis object in the management cluster created in response to the test suite custom resource; deploying, by the test controller agent in response to the diagnosis object, a first pod in the cluster; deploying, by the first pod, a second pod in a server of a first remote site of the remote sites; checking, by the second pod, configuration of the server that includes an additional pod executing alongside the second pod, at least one virtual machine (VM) in which the second pod and the additional pod execute, a hypervisor configured to support the at least one VM, and a hardware platform on which the hypervisor executes; and returning test data from the second pod to the first pod, the test data including results of the step of checking the configuration of the server.
Abstract:
An example method of propagating fault domain topology information in a distributed container orchestration system includes: receiving, at control plane software executing in a data center, the fault domain topology, which includes tags for a protection group and fault domains for remote sites in communication with the data center; deploying, by a master server of the distributed container orchestration system that executes in the data center, a node pool comprising virtual machines (VMs) executing in servers of the remote sites, the VMs being nodes of the distributed container orchestration system in which containers execute; determining, by a controller of the master server, relationships among the VMs, the servers, the protection group, and the fault domains based on state of resources maintained by the master server; and providing, by the controller, labels to the servers for associating the tags of the protection group and the fault domains to the VMs.
Abstract:
An example method of upgrading remote sites of a distributed container orchestration system includes: deploying, by upgrade software executing in a data center remote from the remote sites, a second container orchestration (CO) control plane executing concurrently with a first CO control plane, the second CO control plane having a second version different than a first version of the first CO control plane, the first CO control plane initially managing all of the remote sites; upgrading, by the upgrade software, CO support software of a first portion of the remote sites; adding, by the upgrade software, the first portion of the remote sites to a second CO cluster managed by the second CO control plane; and removing, by the upgrade software, the first portion of the remote sites from a first CO cluster managed by the first CO control plane.
Abstract:
Techniques leveraging CPU flow affinity to increase throughput of a layer 2 (L2) extension network are disclosed. In one embodiment, an L2 concentrator appliance, which bridges a local area network (LAN) and a wide area network (WAN) in a stretched network, is configured such that multiple Internet Protocol Security (IPsec) tunnels are pinned to respective CPUs or cores, which each process traffic flows for one of the IPsec tunnels. Such parallelism can increase the throughput of the stretched network. Further, an L2 concentrator appliance that receives FOU packets is configured to distribute the received FOU packets across receive queues based a deeper inspection of inner headers of such packets.
Abstract:
Techniques for stateful connection optimization over stretched networks are disclosed. In one embodiment, traffic of virtual machines (VMs) that are live-migrated from a data center to a cloud is temporarily tromboned back to the data center to preserve active sessions. In such a case, a stretched network is created that includes a network in the data center and two stub networks in the cloud, one of which is route optimized such that traffic does not trombone back to the data center and the other which is not so optimized. A VM that is live migrated to the cloud is first attached to the unoptimized network so that traffic tromboning occurs. Thereafter, when the VM is powered off (e.g., during a reboot), in a maintenance mode, or in a quiet period, the VM is switched to the route optimized network.
Abstract:
A cloud computing system may include multiple cloud data centers. A gateway may establish connections between a cloud providers' multiple data centers using knowledge about the types of applications workloads executing within the cloud computing system, and may be further based on determines policies indicating priorities for routing traffic for the application workloads.