Abstract:
Systems and methods provide for optimizing utilization of an Address Translation Cache (ATC). A network interface controller (NIC) can write information reserving one or more cache lines in a first level of the ATC to a second level of the ATC. The NIC can receive a request for a direct memory access (DMA) to an untranslated address in memory of a host computing system. The NIC can determine that the untranslated address is not cached in the first level of the ATC. The NIC can identify a selected cache line in the first level of the ATC to evict using the request and the second level of the ATC. The NIC can receive a translated address for the untranslated address. The NIC can cache the untranslated address in the selected cache line. The NIC can perform the DMA using the translated address.
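As a rough illustration of the flow described above, the following C sketch models a first-level ATC whose victim selection on a miss consults reservation information held in a second level; the structures and helper names (atc_l1_line, select_victim, request_translation) are assumptions made for illustration, not the claimed implementation.

/* Minimal sketch of a two-level ATC with reserved first-level lines.
 * All names and structures are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>

#define L1_LINES 8

struct atc_l1_line {
    bool     valid;
    uint64_t untranslated;   /* I/O virtual address    */
    uint64_t translated;     /* host physical address  */
};

/* Second level: per-line reservation info written by the NIC,
 * consulted when choosing a victim on an L1 miss. */
struct atc_l2_entry {
    bool reserved;           /* line is pinned and should not be evicted */
};

static struct atc_l1_line l1[L1_LINES];
static struct atc_l2_entry l2[L1_LINES];

/* Pick an L1 line to evict, skipping lines the L2 marks as reserved. */
static int select_victim(void)
{
    for (int i = 0; i < L1_LINES; i++)
        if (!l2[i].reserved)
            return i;
    return 0;                /* fallback: evict line 0 */
}

/* Hypothetical translation request to the host's address translation service. */
static uint64_t request_translation(uint64_t untranslated)
{
    return untranslated | 0x1000000000ULL;   /* stand-in mapping */
}

uint64_t atc_lookup_or_fill(uint64_t untranslated)
{
    for (int i = 0; i < L1_LINES; i++)
        if (l1[i].valid && l1[i].untranslated == untranslated)
            return l1[i].translated;          /* L1 hit */

    int victim = select_victim();             /* L1 miss: consult L2 */
    l1[victim].valid = true;
    l1[victim].untranslated = untranslated;
    l1[victim].translated = request_translation(untranslated);
    return l1[victim].translated;             /* DMA uses this address */
}
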
Abstract:
A method is provided by which a network adapter device receives a packet sent over a network from a peer, the packet including an enqueue timestamp indicating when the packet was enqueued at the network adapter device. The network adapter device parses a header of the packet to detect whether the header includes bits indicating that the peer is experiencing congestion, and obtains packet metadata of the packet and the enqueue timestamp of the packet. The network adapter device compares the packet metadata with information in a flow table to identify an entry in the flow table corresponding to a flow that the packet metadata matches. The network adapter device sets a timer associated with the flow, the timer for use in scheduling transmission of a next packet provided by the host to be sent to the peer.
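The following C sketch illustrates one way the receive path described above could look: congestion bits and the enqueue timestamp are taken from the packet, the flow table is searched for a matching entry, and a timer for the flow's next transmission is set. All field names, gap values, and helpers are illustrative assumptions.

/* Illustrative sketch: check congestion bits, match the packet to a
 * flow-table entry, and arm a pacing timer for that flow. */
#include <stdbool.h>
#include <stdint.h>

struct packet_meta {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    bool     ecn_congestion;     /* congestion bits seen in the header */
    uint64_t enqueue_ts_ns;      /* timestamp taken on enqueue         */
};

struct flow_entry {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint64_t next_tx_time_ns;    /* earliest time to send the next packet */
};

#define FLOW_TABLE_SIZE 64
static struct flow_entry flow_table[FLOW_TABLE_SIZE];

/* The flow entry describes the outgoing direction, so the received
 * packet's addresses are matched in reverse. */
static bool flow_matches(const struct flow_entry *f, const struct packet_meta *m)
{
    return f->src_ip == m->dst_ip && f->dst_ip == m->src_ip &&
           f->src_port == m->dst_port && f->dst_port == m->src_port;
}

/* On receipt: if the peer signalled congestion, push the flow's next
 * transmission further out; otherwise keep the nominal pacing gap. */
void handle_rx(const struct packet_meta *m)
{
    const uint64_t base_gap_ns = 10000;      /* nominal inter-packet gap  */
    const uint64_t backoff_ns  = 50000;      /* extra delay on congestion */

    for (int i = 0; i < FLOW_TABLE_SIZE; i++) {
        if (!flow_matches(&flow_table[i], m))
            continue;
        uint64_t gap = base_gap_ns + (m->ecn_congestion ? backoff_ns : 0);
        flow_table[i].next_tx_time_ns = m->enqueue_ts_ns + gap;
        break;
    }
}
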
Abstract:
An example method for transformation of Peripheral Component Interconnect Express (PCIe) compliant virtual devices in a server in a network environment is provided and includes receiving, during runtime of the server, a request to change a first configuration of a PCIe compliant virtual device to a different second configuration, identifying a bridge on a PCIe topology below which the virtual device is located, issuing a simulated secondary bus reset to the bridge, the virtual device being reconfigured according to the change in configuration after the simulated secondary bus reset is issued, re-enumerating below the bridge after the change in configuration completes without rebooting the server, and updating the PCIe topology with the virtual device in the second configuration. A virtual interface card adapter traps the simulated secondary bus reset, removes the virtual device from the PCIe topology, and reconfigures the virtual device from the first configuration to the second configuration.
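A minimal C sketch of the control flow described above, assuming hypothetical bridge and virtual-device structures: the adapter traps the simulated secondary bus reset, reconfigures the device while it is out of the topology, and the bridge's subtree is then re-enumerated without a server reboot.

/* Sketch of the runtime reconfiguration flow; structures and helpers
 * are hypothetical stand-ins for the adapter firmware and host PCIe stack. */
#include <stdio.h>

struct vdev {
    int config;                  /* current configuration id */
};

struct bridge {
    struct vdev *child;          /* virtual device below this bridge */
};

/* The VIC adapter "traps" the simulated secondary bus reset: it removes
 * the device from the topology and applies the new configuration. */
static void adapter_trap_sbr(struct bridge *br, int new_config)
{
    printf("removing vdev (config %d) from topology\n", br->child->config);
    br->child->config = new_config;      /* reconfigure while detached */
}

static void reenumerate_below(struct bridge *br)
{
    printf("re-enumerated vdev in config %d (no server reboot)\n",
           br->child->config);
}

void change_vdev_config(struct bridge *br, int new_config)
{
    adapter_trap_sbr(br, new_config);    /* simulated secondary bus reset */
    reenumerate_below(br);               /* update the PCIe topology      */
}
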
Abstract:
An example method for facilitating policy-driven storage in a microserver computing environment is provided and includes receiving, at an input/output (I/O) adapter in a microserver chassis having a plurality of compute nodes and a shared storage resource, policy contexts prescribing storage access parameters of respective compute nodes, and enforcing the respective policy contexts on I/O operations by the compute nodes, such that a particular I/O operation by any compute node is not executed if the respective policy context does not allow it. The method further includes allocating tokens to command descriptors associated with I/O operations for accessing the shared storage resource, identifying a violation of any policy context of any compute node based on availability of the tokens, and throttling I/O operations by other compute nodes until the violation disappears.
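The token-based enforcement can be sketched as follows in C; the token budgets, policy fields, and the start_io/finish_io helpers are illustrative assumptions rather than the adapter's actual interface.

/* Minimal sketch of token-based enforcement of per-node storage policy. */
#include <stdbool.h>

#define NUM_NODES 4

struct policy_ctx {
    int max_tokens;              /* I/O budget allowed by the policy  */
};

struct node_state {
    struct policy_ctx policy;
    int tokens_in_use;           /* tokens held by outstanding I/Os   */
    bool throttled;
};

static struct node_state nodes[NUM_NODES];

/* Try to start an I/O for a node: allocate a token to its command
 * descriptor only if its policy context allows it. */
bool start_io(int node)
{
    struct node_state *n = &nodes[node];

    if (n->throttled)
        return false;            /* held back while a violation is active */

    if (n->tokens_in_use >= n->policy.max_tokens) {
        /* Violation detected: throttle the other nodes until it clears. */
        for (int i = 0; i < NUM_NODES; i++)
            if (i != node)
                nodes[i].throttled = true;
        return false;            /* the I/O is not executed */
    }
    n->tokens_in_use++;
    return true;
}

/* Complete an I/O: release the token and lift throttling once the
 * violating node is back within its budget. */
void finish_io(int node)
{
    struct node_state *n = &nodes[node];
    if (n->tokens_in_use > 0)
        n->tokens_in_use--;
    if (n->tokens_in_use < n->policy.max_tokens)
        for (int i = 0; i < NUM_NODES; i++)
            nodes[i].throttled = false;
}
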
Abstract:
A method is provided in one example embodiment and includes receiving, by a network element, a request from a network device connected to the network element to update a shared resource maintained by the network element; subsequent to the receipt, identifying a Base Address Register Resource Table (“BRT”) element assigned to a Peripheral Component Interconnect (“PCI”) adapter of the network element associated with the network device, wherein the BRT element points to the shared resource; changing an attribute of the identified BRT element from read-only to read/write to enable the identified BRT element to be written by the network device; and notifying the network device that the attribute of the identified BRT element has been changed, thereby enabling the network device to update the shared resource via a Base Address Register (“BAR”) comprising the identified BRT element.
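A small C sketch of the BRT attribute change follows; the table layout and the notification hook (brt_element, notify_device) are assumptions made for illustration, not taken from the source.

/* Sketch of flipping a BRT element from read-only to read/write and
 * notifying the requesting device. */
#include <stdio.h>

enum brt_attr { BRT_READ_ONLY, BRT_READ_WRITE };

struct brt_element {
    int           pci_adapter_id;    /* adapter the element is assigned to */
    void         *shared_resource;   /* resource the element points to     */
    enum brt_attr attr;
};

/* Hypothetical notification to the requesting network device. */
static void notify_device(int device_id, const struct brt_element *brt)
{
    printf("device %d: BRT element for adapter %d is now writable\n",
           device_id, brt->pci_adapter_id);
}

/* Handle an update request: find the BRT element for the adapter
 * associated with the device, flip it to read/write, and notify. */
void handle_update_request(struct brt_element *brt_table, int n,
                           int adapter_id, int device_id)
{
    for (int i = 0; i < n; i++) {
        if (brt_table[i].pci_adapter_id != adapter_id)
            continue;
        brt_table[i].attr = BRT_READ_WRITE;   /* was BRT_READ_ONLY */
        notify_device(device_id, &brt_table[i]);
        return;
    }
}
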
Abstract:
An example method for adaptively coalescing remote direct memory access (RDMA) acknowledgements is provided. The method includes determining one or more input/output (I/O) characteristics of RDMA packets of a plurality of queue pairs (QPs) on a per-QP basis, each QP identifying a respective RDMA connection between a respective first compute node and a respective second compute node. The method further includes determining an acknowledgement frequency for providing acknowledgements of the RDMA packets on a per-QP basis (i.e., a respective acknowledgement frequency is set for each QP) based on the determined one or more I/O characteristics for each QP.
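One possible shape of the per-QP coalescing logic is sketched below in C; the chosen I/O characteristics (packet rate, payload size) and the thresholds are assumptions used only to make the idea concrete.

/* Illustrative per-QP acknowledgement coalescing: derive an ack
 * frequency from observed I/O characteristics of the RDMA connection. */
#include <stdint.h>

struct qp_stats {
    uint64_t pkts_per_sec;       /* observed RDMA packet rate        */
    uint32_t avg_payload_bytes;  /* observed average payload size    */
    uint32_t ack_every_n;        /* send one ACK per N packets       */
    uint32_t unacked;            /* packets received since last ACK  */
};

/* Re-evaluate the coalescing factor for one QP: high-rate, small-message
 * flows get heavier coalescing; low-rate flows are acked promptly. */
void update_ack_frequency(struct qp_stats *qp)
{
    if (qp->pkts_per_sec > 1000000 && qp->avg_payload_bytes < 512)
        qp->ack_every_n = 16;
    else if (qp->pkts_per_sec > 100000)
        qp->ack_every_n = 4;
    else
        qp->ack_every_n = 1;     /* acknowledge every packet */
}

/* Called per received packet; returns nonzero when an ACK should go out. */
int should_ack(struct qp_stats *qp)
{
    if (++qp->unacked >= qp->ack_every_n) {
        qp->unacked = 0;
        return 1;
    }
    return 0;
}
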
Abstract:
An example method for flexible remote direct memory access resource configuration in a network environment is provided and includes determining whether sufficient remote direct memory access (RDMA) resources are available in a network environment to satisfy a request for RDMA resources, inserting the requested RDMA resources into a network profile, associating the network profile with a network interface endpoint in the network, and communicating the network profile over the network to a virtual interface card (VIC) adapter that processes RDMA packets, the VIC adapter configuring the requested RDMA resources in the VIC adapter's hardware and the requested RDMA resources being mapped to a host memory for use by the network interface endpoint. In specific embodiments, the VIC adapter allocates and identifies a region in local memory for managing the requested RDMA resources and reserved for the network interface endpoint.
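The provisioning flow can be sketched as follows in C, with a hypothetical network_profile structure and resource limits standing in for the VIC adapter's real interfaces.

/* Sketch of carrying an RDMA resource request in a network profile and
 * applying it on the adapter; field names and sizes are illustrative. */
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct rdma_resources {
    uint32_t queue_pairs;
    uint32_t memory_regions;
    uint32_t completion_queues;
};

struct network_profile {
    int                   endpoint_id;     /* network interface endpoint */
    struct rdma_resources rdma;
};

/* Hypothetical availability check against the adapter's free pool. */
static bool resources_available(const struct rdma_resources *req)
{
    return req->queue_pairs <= 4096 && req->memory_regions <= 8192;
}

/* Adapter side: reserve a region of local memory for the endpoint's
 * RDMA resources; in hardware this region would be mapped to host memory. */
static void *adapter_configure(const struct network_profile *prof)
{
    size_t bytes = (size_t)prof->rdma.queue_pairs * 256 +
                   (size_t)prof->rdma.memory_regions * 64;
    return malloc(bytes);        /* stands in for adapter-local memory */
}

bool provision_rdma(struct network_profile *prof,
                    const struct rdma_resources *req)
{
    if (!resources_available(req))
        return false;
    prof->rdma = *req;                       /* insert into the profile  */
    return adapter_configure(prof) != NULL;  /* configure on the adapter */
}
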
Abstract:
Techniques for sending Compute Express Link (CXL) packets over Ethernet (CXL-E) in a composable data center that may include disaggregated, composable servers. The techniques may include receiving, from a first server device, a request to bind the first server device with a multiple logical device (MLD) appliance. Based at least in part on the request, a first CXL-E connection may be established for the first server device to export a computing resource to the MLD appliance. The techniques may also include receiving, from the MLD appliance, an indication that the computing resource is available, and receiving, from a second server device, a second request for the computing resource. Based at least in part on the second request, a second CXL-E connection may be established for the second server device to consume or otherwise utilize the computing resource of the first server device via the MLD appliance.
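A C sketch of the two bindings described above, with illustrative cxl_e_connection and mld_appliance structures standing in for the actual CXL-E control messages.

/* Sketch: one server exports a computing resource to the MLD appliance,
 * a second server then binds to consume it. Names are assumptions. */
#include <stdbool.h>

struct cxl_e_connection {
    int  server_id;
    int  appliance_id;
    bool exports_resource;       /* true: producer, false: consumer */
};

struct mld_appliance {
    int  id;
    bool resource_available;     /* set once the export completes   */
};

/* First request: bind the first server so it can export a resource. */
struct cxl_e_connection bind_exporter(int server_id, struct mld_appliance *mld)
{
    struct cxl_e_connection c = { server_id, mld->id, true };
    mld->resource_available = true;          /* appliance advertises it */
    return c;
}

/* Second request: bind the second server to consume the advertised resource. */
bool bind_consumer(int server_id, struct mld_appliance *mld,
                   struct cxl_e_connection *out)
{
    if (!mld->resource_available)
        return false;
    out->server_id = server_id;
    out->appliance_id = mld->id;
    out->exports_resource = false;
    return true;
}
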
Abstract:
In one example, at least one peripheral interconnect switch obtains, from a first endpoint device, a message initiating a direct memory access data transfer between the first endpoint device and a second endpoint device. The message indicates an address assigned to the second endpoint device by a host device as a destination of the message. Based on the address assigned to the second endpoint device by the host device, the at least one peripheral interconnect switch identifies an address assigned to the second endpoint device by the at least one peripheral interconnect switch. In response to identifying the address assigned to the second endpoint device by the at least one peripheral interconnect switch, the at least one peripheral interconnect switch provides the message to the second endpoint device.
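The switch-side lookup can be sketched in C as a simple mapping from host-assigned to switch-assigned addresses; the table layout and the route_p2p helper are assumptions for illustration.

/* Sketch: translate the host-assigned address of the target endpoint
 * into the switch-assigned address, then forward the message there. */
#include <stdint.h>
#include <stddef.h>

struct addr_map_entry {
    uint64_t host_assigned;      /* address the host gave the endpoint   */
    uint64_t switch_assigned;    /* address the switch gave the endpoint */
    int      endpoint_port;      /* downstream port of the endpoint      */
};

struct p2p_message {
    uint64_t dest_addr;          /* host-assigned address of the target  */
    const void *payload;
    size_t   len;
};

/* Find the switch-assigned address for the destination and return the
 * port to forward the message to (-1 if no mapping is known). */
int route_p2p(const struct addr_map_entry *map, int n,
              struct p2p_message *msg)
{
    for (int i = 0; i < n; i++) {
        if (map[i].host_assigned == msg->dest_addr) {
            msg->dest_addr = map[i].switch_assigned;   /* retarget */
            return map[i].endpoint_port;               /* forward  */
        }
    }
    return -1;
}
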