Abstract:
Presented herein are techniques that enable existing hardware input/output (I/O) resources, such as the hardware queues (queue control registers) of a network interface card, to be shared among different hosts (i.e., each queue mapped to many hosts) by logically segregating the hardware I/O resources using assignable interfaces, each associated with a distinct Process Address Space Identifier (PASID). That is, different assignable interfaces are created and associated with different PASIDs, and these assignable interfaces each correspond to a different host (i.e., there is a mapping between a host, an assignable interface, a PASID, and a partition of a hardware queue). The result is that each host can use its assignable interface to directly access the corresponding hardware queue partition.
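The host / assignable interface / PASID / queue-partition mapping described above can be illustrated with a minimal sketch. The even split of the queue, the host names, and the names `AssignableInterface`, `partition_queue`, and `resolve` are assumptions made for illustration; the abstract does not specify how partitions are sized or addressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssignableInterface:
    host: str          # hypothetical host name
    pasid: int         # PASID carried by this host's accesses
    first_entry: int   # first queue entry of the host's partition
    num_entries: int   # number of entries in the partition


def partition_queue(hosts, queue_depth, first_pasid=1):
    """Split one hardware queue evenly among hosts (assumed policy)."""
    share = queue_depth // len(hosts)
    if share == 0:
        raise ValueError("more hosts than queue entries")
    return {
        first_pasid + i: AssignableInterface(host, first_pasid + i, i * share, share)
        for i, host in enumerate(hosts)
    }


def resolve(interfaces, pasid, offset):
    """Translate a (PASID, offset) access into an absolute queue entry."""
    ai = interfaces[pasid]  # an unknown PASID raises KeyError: access refused
    if not 0 <= offset < ai.num_entries:
        raise IndexError("offset outside the partition owned by this PASID")
    return ai.first_entry + offset


interfaces = partition_queue(["host-a", "host-b", "host-c", "host-d"], queue_depth=1024)
```

In this sketch a host can only reach entries inside its own partition, because every access is resolved through the interface selected by its PASID.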
Abstract:
An example method for facilitating low latency remote direct memory access (RDMA) for microservers is provided and includes generating queue pairs (QPs) in a memory of an input/output (I/O) adapter of a microserver chassis having a plurality of compute nodes executing thereon, the QPs being associated with an RDMA connection between a first compute node and a second compute node in the microserver chassis, setting a flag in the QPs to indicate that the RDMA connection is local to the microserver chassis, and performing a loopback of RDMA packets within the I/O adapter from one memory region in the I/O adapter associated with the first compute node of the RDMA connection to another memory region in the I/O adapter associated with the second compute node of the RDMA connection.
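As a rough illustration of the loopback decision, the sketch below models an adapter that checks a per-QP flag before choosing between an internal copy and the external link. The class names, the list-based memory regions, and the `wire` list standing in for the network are all hypothetical; this is not an actual RDMA API.

```python
from dataclasses import dataclass


@dataclass
class QueuePair:
    qp_num: int
    node: str            # compute node that owns this QP
    peer_qp_num: int
    local: bool = False  # flag: the connection stays inside the chassis


class IOAdapter:
    def __init__(self):
        self.qps = {}
        self.regions = {}  # node -> list standing in for its memory region
        self.wire = []     # packets that would leave the chassis

    def connect(self, node_a, qp_a, node_b, qp_b, same_chassis):
        """Create both QPs of a connection and set the local flag."""
        self.qps[qp_a] = QueuePair(qp_a, node_a, qp_b, same_chassis)
        self.qps[qp_b] = QueuePair(qp_b, node_b, qp_a, same_chassis)
        self.regions.setdefault(node_a, [])
        self.regions.setdefault(node_b, [])

    def send(self, qp_num, payload):
        qp = self.qps[qp_num]
        if qp.local:
            # Loopback: copy straight into the peer node's region.
            peer = self.qps[qp.peer_qp_num]
            self.regions[peer.node].append(payload)
            return "loopback"
        self.wire.append((qp.peer_qp_num, payload))
        return "wire"
```

A connection created with `same_chassis=True` never touches `wire`, which is the property the flag is meant to provide.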
Abstract:
An example method for facilitating multi-level paging and address translation in a network environment is provided and includes receiving a request for memory in a physical memory of a network element, associating the request with a first virtual address space, mapping a memory region located in the physical memory to a first window in the first virtual address space, the memory region being also mapped to a second window in a different, second virtual address space, remapping the first window in the first virtual address space to the second window in the second virtual address space, and responding to the request with addresses of the second window in the second virtual address space.
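The window remapping step can be sketched as an offset-preserving translation between two windows that back the same physical region. The equal window sizes, the address values, and the names `Window`, `remap`, and `handle_request` are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    space: str  # label of the virtual address space (hypothetical)
    base: int
    size: int

    def contains(self, addr):
        return self.base <= addr < self.base + self.size


def remap(addr, first, second):
    """Map an address in the first window to the same offset in the second."""
    if first.size != second.size:
        raise ValueError("windows must cover the same memory region")
    if not first.contains(addr):
        raise ValueError("address is outside the first window")
    return second.base + (addr - first.base)


def handle_request(first, second):
    """Answer a memory request with the address range of the second window."""
    return (second.space, second.base, second.base + second.size)


first = Window("vas1", 0x4000_0000, 0x1000)
second = Window("vas2", 0x9000_0000, 0x1000)
```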
Abstract:
An example method for transformation of Peripheral Component Interconnect Express (PCIe) compliant virtual devices in a server in a network environment is provided and includes receiving, during runtime of the server, a request to change a first configuration of a PCIe compliant virtual device to a different second configuration, identifying a bridge on a PCIe topology below which the virtual device is located, issuing a simulated secondary bus reset to the bridge, the virtual device being reconfigured according to the change in configuration after the simulated secondary bus reset is issued, re-enumerating below the bridge after the change in configuration completes without rebooting the server, and updating the PCI topology with the virtual device in the second configuration. A virtual interface card adapter traps the simulated secondary bus reset, removes the virtual device from the PCI topology, and reconfigures the virtual device from the first configuration to the second configuration.
Abstract:
Techniques for sending Compute Express Link (CXL) packets over Ethernet (CXL-E) in a composable data center that may include disaggregated, composable servers. The techniques may include receiving, from a first server device, a request to bind the first server device with a multiple logical device (MLD) appliance. Based at least in part on the request, a first CXL-E connection may be established for the first server device to export a computing resource to the MLD appliance. The techniques may also include receiving, from the MLD appliance, an indication that the computing resource is available, and receiving, from a second server device, a second request for the computing resource. Based at least in part on the second request, a second CXL-E connection may be established for the second server device to consume or otherwise utilize the computing resource of the first server device via the MLD appliance.
Abstract:
A method is provided by which a network adapter device receives a packet sent over a network from a peer device, the packet including an enqueue timestamp indicating when the packet was enqueued at the network adapter device. The network adapter device parses a header of the packet to detect whether the header includes bits indicating that the peer device is experiencing congestion, and obtains packet metadata of the packet and the enqueue timestamp of the packet. The network adapter device compares the packet metadata with information in a flow table to identify the flow table entry for the flow that the packet metadata matches. The network adapter device sets a timer associated with the flow, the timer for use in scheduling transmission of a next packet provided by the host to be sent to the peer device.
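A minimal sketch of the receive path described above follows. The congestion bit mask, the flow key layout, the doubling backoff, and the names `FlowEntry` and `on_receive` are assumptions; the abstract does not say which header bits are used or how the timer value is computed.

```python
from dataclasses import dataclass

CONGESTION_BITS = 0b11  # assumed header bits that signal peer congestion


@dataclass
class FlowEntry:
    base_gap_ns: int     # normal spacing between transmitted packets
    next_tx_ns: int = 0  # timer: earliest time the next packet may be sent


def on_receive(flow_table, flow_key, header_flags, enqueue_ts_ns, backoff=2):
    """Update the flow's transmit timer from a received packet."""
    entry = flow_table.get(flow_key)
    if entry is None:
        return None  # no matching flow, nothing to schedule
    congested = (header_flags & CONGESTION_BITS) == CONGESTION_BITS
    gap = entry.base_gap_ns * backoff if congested else entry.base_gap_ns
    entry.next_tx_ns = enqueue_ts_ns + gap
    return entry.next_tx_ns


flow_key = ("10.0.0.1", "10.0.0.2", 4791)  # hypothetical packet metadata
flow_table = {flow_key: FlowEntry(base_gap_ns=1000)}
```

The timer is anchored on the enqueue timestamp rather than on the processing time, so queueing delay inside the adapter does not stretch the pacing interval in this sketch.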
Abstract:
Systems and methods provide for optimizing utilization of an Address Translation Cache (ATC). A network interface controller (NIC) can write information reserving one or more cache lines in a first level of the ATC to a second level of the ATC. The NIC can receive a request for a direct memory access (DMA) to an untranslated address in memory of a host computing system. The NIC can determine that the untranslated address is not cached in the first level of the ATC. The NIC can identify a selected cache line in the first level of the ATC to evict using the request and the second level of the ATC. The NIC can receive a translated address for the untranslated address. The NIC can cache the untranslated address in the selected cache line. The NIC can perform the DMA using the translated address.
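One way to picture the two-level arrangement is sketched below: a second-level table records which first-level lines are reserved for each requester, and a miss evicts only within the requester's reserved lines. The round-robin victim choice, the requester names, and the class `TwoLevelATC` are assumptions; the abstract does not state the eviction policy or the reservation format.

```python
class TwoLevelATC:
    def __init__(self, num_lines):
        self.lines = [None] * num_lines  # level 1: (untranslated, translated)
        self.reserved = {}               # level 2 stand-in: requester -> line indices
        self.cursor = {}                 # requester -> round-robin position

    def reserve(self, requester, line_indices):
        """Record in level 2 which level-1 lines this requester may use."""
        self.reserved[requester] = list(line_indices)
        self.cursor[requester] = 0

    def lookup(self, untranslated):
        for entry in self.lines:
            if entry is not None and entry[0] == untranslated:
                return entry[1]
        return None

    def select_victim(self, requester):
        """Pick the line to evict from the requester's reserved lines."""
        owned = self.reserved[requester]
        pos = self.cursor[requester]
        self.cursor[requester] = (pos + 1) % len(owned)
        return owned[pos]

    def dma(self, requester, untranslated, translate):
        """Return the translated address, filling level 1 on a miss."""
        translated = self.lookup(untranslated)
        if translated is None:
            line = self.select_victim(requester)
            translated = translate(untranslated)  # stands in for the translation request
            self.lines[line] = (untranslated, translated)
        return translated
```

Because victims come only from the requester's own reservation, one busy requester cannot evict another requester's translations in this model.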
Abstract:
An example method for facilitating policy-driven storage in a microserver computing environment is provided and includes receiving, at an input/output (I/O) adapter in a microserver chassis having a plurality of compute nodes and a shared storage resource, policy contexts prescribing storage access parameters of respective compute nodes, and enforcing the respective policy contexts on I/O operations by the compute nodes, such that a particular I/O operation by any compute node is not executed if the respective policy context does not allow it. The method further includes allocating tokens to command descriptors associated with I/O operations for accessing the shared storage resource, identifying a violation of any policy context of any compute node based on availability of the tokens, and throttling I/O operations by other compute nodes until the violation disappears.
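The token mechanism can be sketched as a shared pool with per-node guarantees. Treating a violation as "a node below its guaranteed share cannot obtain a token" is an assumption, as are the names `PolicyContext` and `TokenPool` and the operation labels; the abstract does not define the violation test.

```python
from dataclasses import dataclass


@dataclass
class PolicyContext:
    allowed_ops: frozenset  # operations this node may issue
    guaranteed_tokens: int  # share the node should always be able to reach


class TokenPool:
    def __init__(self, total, policies):
        self.free = total
        self.policies = policies
        self.held = {node: 0 for node in policies}
        self.starved = set()  # nodes whose policy is currently violated

    def submit(self, node, op):
        policy = self.policies[node]
        if op not in policy.allowed_ops:
            return "rejected"  # policy context forbids the operation
        below = self.held[node] < policy.guaranteed_tokens
        if self.starved - {node} and not below:
            return "throttled"  # yield to nodes whose policy is violated
        if self.free == 0:
            if below:
                self.starved.add(node)  # violation: guaranteed share unavailable
            return "throttled"
        self.free -= 1
        self.held[node] += 1
        self.starved.discard(node)
        return "issued"

    def complete(self, node):
        """Return the token held by a finished command descriptor."""
        self.held[node] -= 1
        self.free += 1
```

In this model a node that exceeds its guarantee keeps running only while no other node is starved, which mirrors throttling the other compute nodes until the violation disappears.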