Abstract:
A method for congestion control includes receiving at a destination computer a packet transmitted on a given flow, in accordance with a predefined transport protocol, through a network by a transmitting network interface controller (NIC) of a source computer, and marked by an element in the network with a forward congestion notification. Upon receiving the marked packet in a receiving NIC of the destination computer, a congestion notification packet (CNP) indicating a flow to be throttled is immediately queued for transmission from the receiving NIC through the network to the source computer. Upon receiving the CNP in the transmitting NIC, transmission of further packets on at least the flow indicated by the CNP from the transmitting NIC to the network is immediately throttled, and an indication of the given flow is passed from the transmitting NIC to a protocol processing software stack running on the source computer.
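As a rough illustration of the transmit-side reaction described above, the following minimal C sketch models a per-flow rate entry and a callback into the host protocol stack. The structure and function names, the rate-halving policy, and the callback are illustrative assumptions, not details taken from the abstract.

    /* Minimal sketch of the transmit-side reaction to a CNP, assuming a
     * per-flow rate entry in the NIC and a callback into the host stack.
     * All names and the rate-halving policy are illustrative. */
    #include <stdio.h>

    typedef struct {
        unsigned id;          /* flow identifier carried in the CNP */
        double   rate_mbps;   /* current transmit rate for the flow */
    } flow_t;

    /* The host protocol stack is told which flow was throttled so it can
     * adjust its own congestion state (the software-side reaction). */
    static void notify_stack(unsigned flow_id) {
        printf("stack: flow %u reported as congested\n", flow_id);
    }

    /* Hardware-side reaction: throttle immediately, then inform software. */
    static void on_cnp_received(flow_t *f) {
        f->rate_mbps *= 0.5;   /* e.g. multiplicative rate decrease */
        notify_stack(f->id);   /* pass the flow indication upward   */
    }

    int main(void) {
        flow_t f = { .id = 7, .rate_mbps = 100000.0 };
        on_cnp_received(&f);
        printf("nic: flow %u now limited to %.0f Mb/s\n", f.id, f.rate_mbps);
        return 0;
    }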
Abstract:
A method includes communicating between at least first and second devices over a bus in accordance with a bus address space, including providing direct access over the bus to a local address space of the first device by mapping at least some of the addresses of the local address space to the bus address space. In response to an indication, by the first device or the second device, that the second device needs to access a local address in the local address space that is not currently mapped to the bus address space, the local address is mapped to the bus address space, and the local address is accessed directly, by the second device, using the mapping.
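A minimal C sketch of the on-demand mapping step follows, assuming a small table of bus-visible windows into the device-local address space. The window table, page size, and function names are assumptions made for illustration.

    /* Sketch of exposing a device-local page over the bus on demand.
     * The window table, page size, and names are illustrative. */
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE     4096u
    #define NWINDOWS 4

    typedef struct { uint64_t local_base; int valid; } window_t;
    static window_t windows[NWINDOWS];
    static uint8_t  local_mem[NWINDOWS * PAGE];  /* stand-in for device memory */

    /* Map the page containing 'local' into a free bus window; return its index. */
    static int map_local(uint64_t local) {
        for (int i = 0; i < NWINDOWS; i++)
            if (!windows[i].valid) {
                windows[i] = (window_t){ local & ~(uint64_t)(PAGE - 1), 1 };
                return i;
            }
        return -1;  /* no free window; a real design would evict one */
    }

    /* Direct access by the second device through the established mapping. */
    static uint8_t bus_read(int win, uint64_t local) {
        uint64_t off = windows[win].local_base + (local & (PAGE - 1));
        return local_mem[off % sizeof local_mem];
    }

    int main(void) {
        uint64_t addr = 0x1234;      /* local address not yet mapped      */
        int win = map_local(addr);   /* map it in response to the request */
        if (win < 0)
            return 1;
        printf("window %d, value %u\n", win, (unsigned)bus_read(win, addr));
        return 0;
    }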
Abstract:
In a fabric of network elements, one network element stores in its memory an object pool to be accessed. A request for atomic access to the object pool by another network element is carried out by transmitting the request through the fabric to the one network element, performing a remote direct memory access to a designated member of the object pool, atomically executing the request, and returning a result of the execution of the request through the fabric to the other network element.
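The following C sketch illustrates only the atomic-execution step at the element that owns the pool, with the fabric transport collapsed into a direct function call. The pool layout and the choice of a fetch-and-add operation are assumptions chosen for illustration.

    /* Sketch of an atomic operation on an object pool held by one network
     * element. The fabric is reduced to a local call; pool size and the
     * fetch-and-add operation are illustrative. */
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>

    #define POOL_SIZE 8

    /* Object pool stored in the memory of the responding network element. */
    static _Atomic uint64_t pool[POOL_SIZE];

    /* Executed at the element that owns the pool: atomically apply the
     * requested operation to the designated member and return the result. */
    static uint64_t atomic_fetch_add_member(unsigned idx, uint64_t delta) {
        return atomic_fetch_add(&pool[idx % POOL_SIZE], delta);
    }

    /* Requesting element: in a real system the request would travel through
     * the fabric as a remote direct memory access; here it is a local call. */
    int main(void) {
        uint64_t old = atomic_fetch_add_member(3, 5);
        printf("old value %llu, new value %llu\n",
               (unsigned long long)old,
               (unsigned long long)atomic_load(&pool[3]));
        return 0;
    }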
Abstract:
A method provides network access to remote memory directly from a local instruction stream using conventional loads and stores. In cases where network I/O access (a network phase) cannot overlap a compute phase, a direct network access from the instruction stream greatly decreases latency in CPU processing. The network is treated as yet another memory that can be directly read from, or written to, by the CPU, so that network access can be done directly from the instruction stream using regular loads and stores. Example scenarios where synchronous network access can be beneficial are SHMEM (symmetric hierarchical memory access) usages, where the program directly reads and writes remote memory, and scenarios where part of system memory (for example DDR) can reside over a network and be made accessible on demand to different CPUs.
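A minimal C sketch of the programming model follows: the program issues ordinary loads and stores against a mapped region, and an underlying layer (not shown) is assumed to turn them into network reads and writes. Here the remote memory is simulated by a local buffer, so only the access pattern is illustrated.

    /* Sketch of treating remote memory as ordinary memory: plain loads and
     * stores to a mapped region. The mapping step and the network backing
     * are assumptions; a local buffer stands in for the remote memory. */
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t remote_simulated[1024];   /* stands in for memory on a peer */

    /* In a real system this would return a CPU mapping backed by the network
     * (e.g. a window exposed by the NIC); here it just returns the buffer. */
    static uint64_t *map_remote(void) { return remote_simulated; }

    int main(void) {
        uint64_t *remote = map_remote();
        remote[10] = 42;            /* store would go out over the network */
        uint64_t v = remote[10];    /* load would come back the same way   */
        printf("read back %llu\n", (unsigned long long)v);
        return 0;
    }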
Abstract:
A method for data transfer includes receiving in a data transfer operation data to be written by a peripheral device to a specified virtual address in a random access memory (RAM) of a host computer. Upon receiving the data, it is detected that a page that contains the specified virtual address is marked as not present in a page table of the host computer. The peripheral device receives a notification that the page is not present, along with an estimate of the length of time that will be required to make the page available, and selects a mode for handling the data transfer operation depending upon the estimate.
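The mode-selection step can be sketched in C as below; the threshold and the two mode names are illustrative assumptions, not values from the abstract.

    /* Sketch of the mode decision: the device receives a "page not present"
     * notice plus an estimated time to make the page available, and picks a
     * handling mode. Threshold and mode names are illustrative. */
    #include <stdio.h>

    typedef enum { MODE_BUFFER_AND_WAIT, MODE_DROP_AND_RETRY } xfer_mode_t;

    /* Short stalls: hold the data on the device until the page is available.
     * Long stalls: back off and let the transfer be re-issued later. */
    static xfer_mode_t select_mode(unsigned est_usec) {
        return est_usec < 100 ? MODE_BUFFER_AND_WAIT : MODE_DROP_AND_RETRY;
    }

    int main(void) {
        printf("estimate 40us  -> mode %d\n", select_mode(40));
        printf("estimate 900us -> mode %d\n", select_mode(900));
        return 0;
    }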
Abstract:
A network interface includes a host interface for communicating with a node, and circuitry which is configured to communicate with one or more other nodes over a communication network so as to carry out, jointly with the one or more other nodes, a redundant storage operation that includes a redundancy calculation, including performing the redundancy calculation on behalf of the node.
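As one common form of redundancy calculation, the sketch below computes XOR parity across equal-sized blocks in C; the block size and the choice of XOR parity (as in RAID-5-style schemes) are assumptions for illustration.

    /* Sketch of a redundancy calculation performed on behalf of the node,
     * assuming simple XOR parity across equal-sized blocks. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK 16

    /* Compute parity = block0 ^ block1 ^ ... ^ block(n-1). */
    static void parity(uint8_t blocks[][BLOCK], int n, uint8_t out[BLOCK]) {
        memset(out, 0, BLOCK);
        for (int b = 0; b < n; b++)
            for (int i = 0; i < BLOCK; i++)
                out[i] ^= blocks[b][i];
    }

    int main(void) {
        uint8_t data[2][BLOCK] = { "stripe-block-00", "stripe-block-01" };
        uint8_t p[BLOCK];
        parity(data, 2, p);
        printf("first parity byte: 0x%02x\n", (unsigned)p[0]);
        return 0;
    }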
Abstract:
A method for communication includes posting, by a software process, a set of buffers in a memory of a host processor and creating in the memory a list of labels associated respectively with the buffers. The software process pushes a first part of the list to a network interface controller (NIC), while retaining a second part of the list in the memory under control of the software process. Upon receiving a message containing a label, sent over a network, the NIC compares the label to the labels in the first part of the list and, upon finding a match to the label, writes data conveyed by the message to a buffer in the memory. Upon a failure to find the match in the first part of the list, the NIC passes the message to the software process for handling using the second part of the list.
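The split matching flow can be sketched in C as follows, with the NIC-resident and software-resident parts of the label list modeled as two small arrays; all names and sizes are illustrative.

    /* Sketch of split label matching: the NIC holds the first part of the
     * list, software keeps the rest. Names and sizes are illustrative. */
    #include <stdio.h>

    #define NIC_PART 2
    #define SW_PART  2

    static int nic_labels[NIC_PART] = { 10, 11 };
    static int sw_labels[SW_PART]   = { 12, 13 };

    /* Returns 1 if the NIC matched and delivered the data itself,
     * 0 if the message was passed up to software for matching. */
    static int handle_message(int label) {
        for (int i = 0; i < NIC_PART; i++)
            if (nic_labels[i] == label) {
                printf("nic: matched label %d, writing to buffer %d\n", label, i);
                return 1;
            }
        printf("nic: no match for label %d, passing to software\n", label);
        for (int i = 0; i < SW_PART; i++)
            if (sw_labels[i] == label)
                printf("sw: matched label %d in retained part of list\n", label);
        return 0;
    }

    int main(void) {
        handle_message(11);   /* matched in the NIC-held part of the list */
        handle_message(13);   /* falls back to the software-held part     */
        return 0;
    }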
Abstract:
A network interface device includes a host interface for connection to a host processor having a memory. A network interface is configured to transmit and receive data packets over a data network, which supports multiple tenant networks overlaid on the data network. Processing circuitry is configured to receive, via the host interface, a work item submitted by a virtual machine running on the host processor, and to identify, responsively to the work item, a tenant network over which the virtual machine is authorized to communicate, wherein the work item specifies a message to be sent to a tenant destination address. The processing circuitry generates, in response to the work item, a data packet containing an encapsulation header that is associated with the tenant network, and transmits the data packet over the data network to at least one data network address corresponding to the specified tenant destination address.
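A minimal C sketch of the encapsulation step follows, assuming a VXLAN-style tenant identifier and simple lookup tables for the VM-to-tenant authorization and the overlay-to-underlay address mapping; these tables and field names are illustrative.

    /* Sketch of encapsulating a VM's message for its authorized tenant
     * network. The VNI field and lookup tables are assumptions. */
    #include <stdint.h>
    #include <stdio.h>

    typedef struct { unsigned vm_id; uint32_t tenant_dst; } work_item_t;
    typedef struct { uint32_t vni; uint32_t outer_dst; } encap_hdr_t;

    /* Per-VM authorization: which overlay (tenant) network it may send on. */
    static uint32_t tenant_of_vm(unsigned vm_id) { return 0x1000 + vm_id; }

    /* Overlay-to-underlay mapping for the tenant destination address. */
    static uint32_t underlay_addr(uint32_t tenant_dst) { return tenant_dst ^ 0x0A000000; }

    /* Build the encapsulation header for the packet generated from the work item. */
    static encap_hdr_t build_encap(const work_item_t *wi) {
        encap_hdr_t h = { tenant_of_vm(wi->vm_id), underlay_addr(wi->tenant_dst) };
        return h;
    }

    int main(void) {
        work_item_t wi = { .vm_id = 3, .tenant_dst = 0x0A000102 };
        encap_hdr_t h = build_encap(&wi);
        printf("vni=0x%x outer_dst=0x%x\n", (unsigned)h.vni, (unsigned)h.outer_dst);
        return 0;
    }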
Abstract:
A method for communication includes receiving at a receiving node over a network from a sending node a succession of data packets belonging to a sequence of transactions, including at least one or more first packets belonging to a first transaction and one or more second packets belonging to a second transaction executed by the sending node after the first transaction, wherein at least one of the second packets is received at the receiving node before at least one of the first packets. At the receiving node, upon receipt of the data packets, data are written from the data packets in the succession to respective locations in a buffer. Execution of the second transaction at the receiving node is delayed until all of the first packets have been received and the first transaction has been executed at the receiving node.
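The ordering rule can be sketched in C as below: packet data is accepted into the buffer as it arrives, but a transaction executes only after every earlier transaction has been fully received and executed. The transaction bookkeeping is an illustrative assumption.

    /* Sketch of deferred, in-order transaction execution over out-of-order
     * packet arrival. Packet counts and bookkeeping are illustrative. */
    #include <stdio.h>

    #define NTXN 2

    typedef struct { int expected; int received; int executed; } txn_t;
    static txn_t txns[NTXN] = { { .expected = 2 }, { .expected = 1 } };

    /* Execute any transaction whose predecessors have all been executed. */
    static void try_execute(void) {
        for (int t = 0; t < NTXN; t++) {
            if (txns[t].received < txns[t].expected)
                break;               /* an earlier transaction is incomplete */
            if (!txns[t].executed) {
                txns[t].executed = 1;
                printf("executed transaction %d\n", t);
            }
        }
    }

    /* Called per arriving packet: data lands in the buffer immediately,
     * execution is deferred to preserve transaction order. */
    static void on_packet(int txn) {
        txns[txn].received++;
        printf("buffered packet for transaction %d\n", txn);
        try_execute();
    }

    int main(void) {
        on_packet(1);   /* second-transaction packet arrives first    */
        on_packet(0);   /* first transaction, packet 1 of 2           */
        on_packet(0);   /* first transaction complete: both execute   */
        return 0;
    }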