Abstract:
The disclosure provides an approach for preventing the failure of virtual computing instance transfers across data centers. In one embodiment, a flow control module collects performance information primarily from components in a local site, as opposed to components in a remote site, during the transfer of a virtual machine (VM) from the local site to the remote site. The performance information that is collected may include various performance metrics, each of which is considered a feature. The flow control module performs feature preparation by normalizing feature data and imputing missing feature data, if any. The flow control module then inputs the prepared feature data into machine learning model(s) which have been trained to predict whether a VM transfer will succeed or fail, given the input feature data. If the prediction is that the VM transfer will fail, then remediation actions may be taken, such as slowing down the VM transfer.
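The feature preparation and prediction steps described above can be illustrated with a minimal sketch. The abstract does not say which imputation, normalization, or model is used, so everything here is an assumption: column-mean imputation, min-max normalization, and a hypothetical logistic scorer (`predict_failure`, with made-up weights) standing in for the trained model.

```python
import math
from statistics import mean


def impute_missing(rows):
    """Replace None entries with the mean of the observed values in that column.

    Column-mean imputation is an assumed choice; the abstract only says that
    missing feature data is imputed.
    """
    columns = list(zip(*rows))
    fills = []
    for col in columns:
        observed = [v for v in col if v is not None]
        fills.append(mean(observed) if observed else 0.0)
    return [[fills[i] if v is None else v for i, v in enumerate(row)] for row in rows]


def min_max_normalize(rows):
    """Scale every column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [0.0 if hi == lo else (v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def predict_failure(features, weights, bias, threshold=0.5):
    """Hypothetical stand-in for the trained model: a logistic score vs. a threshold."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z)) >= threshold


# Each row is one sample of (hypothetical) metrics, e.g. throughput and buffer fill.
samples = [[100.0, 0.2], [None, 0.8], [300.0, None]]
prepared = min_max_normalize(impute_missing(samples))

# If the model predicts failure, a remediation such as slowing the transfer could follow.
if predict_failure(prepared[-1], weights=[2.0, 2.0], bias=-1.5):
    print("predicted failure: slow down the VM transfer")
```

In a real deployment the scorer would be replaced by whatever model was actually trained, and the normalization bounds would come from the training data rather than from the batch being scored.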
Abstract:
The disclosure herein describes an edge device of a network for distributed policy enforcement. During operation, the edge device receives an initial packet for an outgoing traffic flow and identifies a policy triggered by the initial packet. The edge device performs a reverse lookup to identify at least one intermediate node that the initial packet previously traversed, along with the traffic parameters associated with the initial packet at that intermediate node. The edge device translates the policy based on the traffic parameters at the intermediate node and forwards the translated policy to the intermediate node, enabling the intermediate node to apply the policy to the traffic flow.
Abstract:
Systems and techniques are described for monitoring network communications using a distributed firewall. One of the techniques includes receiving, at a driver executing in a guest operating system of a virtual machine, a request to open a network connection from a process associated with a user, wherein the driver performs operations comprising: obtaining identity information for the user; providing the identity information and data identifying the network connection to an identity module external to the driver; and receiving, by a distributed firewall, data associating the identity information with the data identifying the network connection from the identity module, wherein the distributed firewall performs operations comprising: receiving an outgoing packet from the virtual machine; determining that the identity information corresponds to the outgoing packet; and evaluating one or more routing rules based at least in part on the identity information.
Abstract:
The disclosure provides a method for diagnosing remote sites of a distributed container orchestration system. The method generally includes receiving a test suite custom resource defining an image to be used for a diagnosis of components of a workload cluster deployed at the remote sites, wherein the image comprises a diagnosis module and/or a user-provided plugin to be used for the diagnosis; identifying a failed component in the workload cluster; obtaining infrastructure information about the workload cluster; identifying the components of the workload cluster for diagnosis based on the failed component, the infrastructure information, and the test suite custom resource; identifying at least one diagnosis site of the remote sites where the components are running using the infrastructure information; and deploying a first pod at the at least one diagnosis site to execute the diagnosis of the components.
Abstract:
An example method of propagating fault domain topology information in a distributed container orchestration system includes: receiving, at control plane software executing in a data center, the fault domain topology, which includes tags for a protection group and fault domains for remote sites in communication with the data center; deploying, by a master server of the distributed container orchestration system that executes in the data center, a node pool comprising virtual machines (VMs) executing in servers of the remote sites, the VMs being nodes of the distributed container orchestration system in which containers execute; determining, by a controller of the master server, relationships among the VMs, the servers, the protection group, and the fault domains based on state of resources maintained by the master server; and providing, by the controller, labels to the servers for associating the tags of the protection group and the fault domains to the VMs.
Abstract:
The disclosure provides a method for diagnosing a data plane of a network that spans a first data center and a second data center, the second data center being remote from the first. The method comprises: accessing a secure connection between the first data center and the second data center; modifying, by a first performance controller, firewall settings of the first data center from a first setting to a second setting; opening on the second data center an instance of a performance tool; opening on the first data center a client of the instance of the performance tool; sending data packets over the data plane of the network; receiving the data packets; generating metrics associated with the data packets; and modifying the firewall settings of the first data center from the second setting back to the first setting.
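The change-then-restore pattern for the firewall settings can be sketched as a context manager that always puts the first setting back, even if the test fails partway through. This is only an illustration: the firewall is modeled as a plain dictionary, and the names `temporary_firewall_setting`, `packet_metrics`, and `perf_port` are hypothetical, as is the choice of loss ratio and mean latency as the metrics.

```python
from contextlib import contextmanager
from statistics import mean


@contextmanager
def temporary_firewall_setting(firewall, key, second_setting):
    """Switch one firewall setting for the duration of a test, then restore it."""
    first_setting = firewall[key]
    firewall[key] = second_setting
    try:
        yield firewall
    finally:
        # Restore the first setting whether or not the diagnosis raised an error.
        firewall[key] = first_setting


def packet_metrics(sent_count, latencies_ms):
    """Summarize a test run from the number sent and the latencies of packets received."""
    received = len(latencies_ms)
    return {
        "sent": sent_count,
        "received": received,
        "loss_ratio": (sent_count - received) / sent_count if sent_count else 0.0,
        "mean_latency_ms": mean(latencies_ms) if latencies_ms else None,
    }


firewall = {"perf_port": "closed"}
with temporary_firewall_setting(firewall, "perf_port", "open"):
    # A real run would start the performance tool instance and client here;
    # the latencies below are placeholder values for illustration only.
    metrics = packet_metrics(sent_count=4, latencies_ms=[10.0, 20.0, 30.0])
```

Tying the restore step to a `finally` clause keeps the first data center from being left in the more permissive second setting after a failed run.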
Abstract:
A method of transferring data between local and remote computing systems includes the step of transferring data between the local and remote computing systems via a local buffer in the local computing system and a series of steps carried out during transferring of data from the local to the remote computing system. The steps include receiving a statistic from the remote computing system, computing an average transfer rate of the data transfer between the local and remote computing systems based on the statistic, determining whether or not a throttle condition is in effect based on the computed average transfer rate, and upon determining that the throttle condition is in effect, throttling the transferring of data into the local buffer.
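The throttling decision described above can be sketched as follows. The abstract does not define the statistic or the throttle condition, so this sketch assumes the remote system reports bytes received over an interval, that the average is taken over a sliding window, and that the condition is "average rate below a minimum" (the remote side is draining slowly, so writes into the local buffer should slow down). The class name `TransferThrottle` and its parameters are hypothetical.

```python
from collections import deque


class TransferThrottle:
    """Track remote transfer statistics and decide when to throttle local buffering."""

    def __init__(self, min_rate, window=3):
        self.min_rate = min_rate              # assumed threshold, in bytes per second
        self.samples = deque(maxlen=window)   # most recent per-interval rates

    def record(self, bytes_received, seconds):
        """Record one statistic reported by the remote computing system."""
        self.samples.append(bytes_received / seconds)

    def average_rate(self):
        """Average transfer rate over the window, or None before any statistic arrives."""
        if not self.samples:
            return None
        return sum(self.samples) / len(self.samples)

    def should_throttle(self):
        """Assumed throttle condition: the average rate has fallen below min_rate."""
        avg = self.average_rate()
        return avg is not None and avg < self.min_rate


throttle = TransferThrottle(min_rate=100.0, window=2)
throttle.record(bytes_received=400, seconds=2)   # 200 B/s
throttle.record(bytes_received=50, seconds=1)    # window average is now 125 B/s
if throttle.should_throttle():
    print("throttle writes into the local buffer")
```

A windowed average reacts to recent slowdowns without being dominated by a single slow interval; a cumulative average or an exponentially weighted one would be equally plausible readings of the abstract.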
Abstract:
Techniques for stateful connection optimization over stretched networks are disclosed. Such stretched networks may extend across both a data center and a cloud. In one embodiment, configuration changes are made to cloud layer 2 (L2) concentrators used by extended networks and a cloud router such that the L2 concentrators block packets with the cloud router's source MAC address and block address resolution protocol (ARP) requests for a gateway IP address from/to cloud networks that are part of the extended networks. Further, the cloud router is configured with the same gateway IP address as that of a default gateway router in the data center and responds to ARP requests for the gateway IP address with its own MAC address. In addition, specific prefix routes (e.g., /32 routes) for virtual computing instances on route optimized networks in the cloud are injected into the cloud router and propagated to a data center router.
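The two blocking rules applied at the L2 concentrators can be expressed as a simple predicate over frames. This is a simplified sketch rather than the disclosed implementation: the `Frame` type, its fields, and the example MAC and IP values are hypothetical, and a real concentrator would parse these fields from the Ethernet and ARP headers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Minimal, hypothetical view of a frame crossing the L2 concentrator."""
    src_mac: str
    is_arp_request: bool = False
    arp_target_ip: Optional[str] = None


def should_block(frame, cloud_router_mac, gateway_ip):
    """Return True if the frame matches either blocking rule.

    Rule 1: the source MAC is the cloud router's MAC.
    Rule 2: the frame is an ARP request for the shared gateway IP address.
    """
    if frame.src_mac.lower() == cloud_router_mac.lower():
        return True
    return frame.is_arp_request and frame.arp_target_ip == gateway_ip


ROUTER_MAC = "02:00:00:00:00:01"   # example locally administered MAC
GATEWAY_IP = "192.0.2.1"           # example address from a documentation range

frames = [
    Frame(src_mac="02:00:00:00:00:01"),
    Frame(src_mac="02:00:00:00:00:aa", is_arp_request=True, arp_target_ip="192.0.2.1"),
    Frame(src_mac="02:00:00:00:00:aa", is_arp_request=True, arp_target_ip="192.0.2.50"),
]
forwarded = [f for f in frames if not should_block(f, ROUTER_MAC, GATEWAY_IP)]
```

Blocking both kinds of frame keeps the cloud router's MAC and its answers for the gateway IP local to the cloud side, so the data center's default gateway and the cloud router can share one gateway IP address without conflicting.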
Abstract:
A computer-implemented method, medium, and system for implementing a pluggable diagnostic tool for Telco radio access network (RAN) troubleshooting are disclosed. In one computer-implemented method, one or more containerized network function (CNF) instances are generated in a container orchestration platform by a test system and by using a telecommunication cloud automation (TCA) platform executed in the container orchestration platform, where the test system is onboarded to the TCA platform, and the one or more CNF instances are associated with 5G RAN. A customer resources (CR) file is received by the test system, where the CR file defines multiple test cases associated with validation of the TCA platform. The CR file is transmitted to a cluster of nodes in the container orchestration platform. The validation of the TCA platform is executed at the cluster of nodes based on the one or more CNF instances and the CR file.