Abstract:
In a data network congestion control in a virtualized environment is enforced in packet flows to and from virtual machines in a host. A hypervisor and network interface hardware in the host are trusted components. Enforcement comprises estimating congestion states in the data network attributable to respective packet flows, recognizing a new packet that belongs to one of the data packet flows, and using one or more of the trusted components and to make a determination based on the congestion states that the new packet belongs to a congestion-producing packet flow. A congestion-control policy is applied by one or more of the trusted components to the new packet responsively to the determination.
Abstract:
A Network Interface (NI) includes a host interface, which is configured to receive from a host processor of a node one or more cross-channel work requests that are derived from an operation to be executed by the node. The NI includes a plurality of work queues for carrying out transport channels to one or more peer nodes over a network. The NI further includes control circuitry, which is configured to accept the cross-channel work requests via the host interface, and to execute the cross-channel work requests using the work queues by controlling an advance of at least a given work queue according to an advancing condition, which depends on a completion status of one or more other work queues, so as to carry out the operation.
Abstract:
A network interface device includes a host interface for connection to a host processor having a memory. A network interface is configured to transmit and receive data packets over a data network, which supports multiple tenant networks overlaid on the data network. Processing circuitry is configured to receive, via the host interface, a work item submitted by a virtual machine running on the host processor, and to identify, responsively to the work item, a tenant network over which the virtual machine is authorized to communicate, wherein the work item specifies a message to be sent to a tenant destination address. The processing circuitry generates, in response to the work item, a data packet containing an encapsulation header that is associated with the tenant network, and to transmit the data packet over the data network to at least one data network address corresponding to the specified tenant destination address.
Abstract:
A method for memory access includes maintaining in a host memory, under control of a host operating system running on a central processing unit (CPU), respective address translation tables for multiple processes executed by the CPU. Upon receiving, in a peripheral device, a work item that is associated with a given process, having a respective address translation table in the host memory, and specifies a virtual memory address, the peripheral device translates the virtual memory address into a physical memory address by accessing the respective address translation table of the given process in the host memory. The work item is executed in the peripheral device by accessing data at the physical memory address in the host memory.
Abstract:
Computing apparatus includes a host computer, including multiple non-uniform memory access (NUMA) nodes, including at least first and second NUMA nodes, which include first and second local memories and first and second host bus interfaces for connection to first and second peripheral component buses, respectively. A network interface controller (NIC) is to receive a definition of a memory region extending over respective first and second parts of the first and second local memories and to receive a memory mapping with respect to the memory region that is applicable to both the first and second local memories, and to apply the memory mapping in writing data to the memory region via first and second NIC bus interfaces in a sequence of direct memory access (DMA) transactions to the respective first and second parts of the first and second local memories in response to packets received through a network port.
Abstract:
A device and method may alter the transmission rate of data sent across a computer network based on a time to receive an acknowledgement in response to a packet sent over the network. An embodiment may transmit packets across the computer network according to a rate R, where R is determined based at least on a number of bytes to be sent during a window (cwnd) divided by a duration of time (RTT); and modify RTT based on a current round trip time of a packet sent over the network (e.g. based on a time to receive an acknowledgement in response to a packet sent over the network).
Abstract:
A method for data transfer includes transmitting a sequence of data packets from a first computer over a network to a second computer in a single RDMA data transfer transaction. Upon receipt of a second packet in the sequence without previously having received the first packet, the second computer sends a NAK packet over the network to the first computer, indicating that the first packet was not received. A retransmission mode is selected responsively to the type of the transaction, such that when the transaction is of a first type, the first packet is retransmitted from the first computer to the second computer in response to the NAK packet without retransmitting the second packet, and when the transaction is of a second type, both the first and second packets are retransmitted from the first computer to the second computer in response to the NAK packet.
Abstract:
A communication apparatus includes a host interface, connected to a peripheral component bus so as to communicate with a CPU and a memory of a host computer. A network interface is connected to a network. Packet processing circuitry is configured to receive from a first interface a data packet including a set of one or more headers that include header fields having respective values, to identify, responsively to at least one of the header fields, a corresponding entry in a header modification table that specifies a header modification operation, to modify the set of headers in accordance with the header modification operation, to check whether the entry specifies an additional header modification operation, to output the modified set of headers if the entry does not specify an additional header modification operation, and, if the entry specifies an additional header modification operation, to feed-back the modified set of headers.
Abstract:
A computing system includes at least one peripheral bus, a peripheral device connected to the at least one peripheral bus, at least one memory, and first and second system components. The first system component is (i) associated with a first address space in the at least one memory and (ii) connected to the peripheral device via the at least one peripheral bus. The second system component is (i) associated with a second address space in the at least one memory and (ii) connected to the peripheral device via the at least one peripheral bus. The first system component is arranged to cause the peripheral device to access the second address space that is associated with the second system component.
Abstract:
A network adapter includes circuitry and one or more ports. The ports connect to a communication network including multiple network elements. The circuitry accesses outbound messages that are pending to be sent over the communication network to multiple remote nodes via the ports. At least some of the outbound messages request the remote nodes to send respective amounts of data back to the network adapter. Based on the amounts of data requested by the outbound messages, the circuitry forecasts a bandwidth of inbound response traffic, which is expected to traverse a selected network element in response to the outbound messages toward the network adapter, determines a schedule for transmitting the outbound messages to the remote nodes so that the forecasted bandwidth meets a bandwidth supported by the selected network element, and transmits the outbound messages to the remote nodes in accordance with the determined schedule.