Abstract:
Methods, apparatus, and systems for facilitating explicit flow control for RDMA transfers using implicit memory registration. To set up an RDMA data transfer, a source RNIC sends a request to allocate a destination buffer at a destination RNIC using implicit memory registration. Under implicit memory registration, the page or pages to be registered are not explicitly identified by the source RNIC, and may correspond to pages that are paged out to virtual memory. As a result, registration of such pages results in page faults, leading to a page fault delay before registration and pinning of the pages are completed. In response to detection of a page fault, the destination RNIC returns an acknowledgment indicating that a page fault delay is occurring. In response to receiving the acknowledgment, the source RNIC temporarily stops sending packets, and does not retransmit packets for which ACKs are not received prior to retransmission timeout expiration.
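The sender-side behaviour described above can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: the class name `SourceRnic`, the ACK labels `"PAGE_FAULT_DELAY"` and `"BUFFER_READY"`, and the idea that a second ACK ends the pause are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SourceRnic:
    """Hypothetical model of the sending RNIC; not a real RNIC API."""
    paused: bool = False
    sent: list = field(default_factory=list)

    def on_ack(self, ack_type: str) -> None:
        # Both labels are invented for this sketch.
        if ack_type == "PAGE_FAULT_DELAY":
            self.paused = True       # destination is still faulting pages in
        elif ack_type == "BUFFER_READY":
            self.paused = False      # assumed signal that pinning completed

    def try_send(self, seq: int) -> bool:
        """Send a new packet unless the flow is paused."""
        if self.paused:
            return False
        self.sent.append(seq)
        return True

    def on_retransmit_timeout(self, seq: int) -> bool:
        """An expired timer does not trigger a retransmit while paused."""
        if self.paused:
            return False
        self.sent.append(seq)
        return True
```

The point of the pause flag is that both new sends and timeout-driven retransmissions are suppressed by the same condition, so the destination is not flooded while it resolves the page fault.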
Abstract:
A method and device for local area network (LAN) emulation over an InfiniBand (IB) fabric. An IB LAN driver at a first node on an IB fabric receives the port and associated local identifier (LID) of one or more remote peer nodes on the IB fabric. An IEEE 802.3 Ethernet MAC address with one LID embedded is generated. The embedded LID is for one or more remote peer nodes. The IB LAN driver sends the Ethernet MAC address to an Address Resolution Protocol (ARP). A logical address of a remote peer node is generated by a network protocol. The logical address is mapped to an Ethernet MAC address. The IB LAN driver sends the Ethernet MAC address onto the IB fabric to the one or more remote peer nodes. The remote peer nodes appear to reside on an Ethernet network to the network protocol.
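One way to picture a MAC address that carries a LID is shown below. The abstract does not give the byte layout, so this sketch assumes a locally administered address (first octet `0x02`) with the 16-bit LID in the last two octets; the function names `lid_to_mac` and `mac_to_lid` are hypothetical.

```python
def lid_to_mac(lid: int) -> str:
    """Build a 6-octet MAC string carrying a 16-bit LID (assumed layout)."""
    if not 0 <= lid <= 0xFFFF:
        raise ValueError("LID must fit in 16 bits")
    # 0x02 sets the locally administered bit; the zero padding is arbitrary.
    octets = [0x02, 0x00, 0x00, 0x00, (lid >> 8) & 0xFF, lid & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)


def mac_to_lid(mac: str) -> int:
    """Recover the LID from a MAC produced by lid_to_mac."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected six octets")
    return (octets[4] << 8) | octets[5]
```

With a reversible mapping like this, the driver could resolve a destination MAC back to a fabric LID without a separate lookup table, which is one plausible reason to embed the LID in the address.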
Abstract:
An apparatus and method for efficient input/output processing without the use of interrupts is described. The apparatus includes a plurality of descriptors, where each descriptor includes a completion indicator and data associated with an input/output request. The plurality of descriptors includes a head descriptor and a tail descriptor. The apparatus further includes a plurality of address holders associated with an input/output processor, and each of the plurality of address holders is uniquely affiliated with one of the plurality of descriptors. The apparatus further includes a polling mechanism for evaluating the completion indicator of the head descriptor and a completion processor for interfacing with the head descriptor. Finally, the apparatus includes connectors between the tail descriptor and address holder and between the input/output processor and the head descriptor.
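The polling idea can be sketched with a simple queue of descriptors. This is an illustrative model only: `Descriptor`, `DescriptorQueue`, `post`, and `poll` are hypothetical names, and the address holders and connectors from the abstract are not modelled.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    data: bytes
    done: bool = False   # completion indicator, set by the I/O processor


class DescriptorQueue:
    """Hypothetical queue: requests join at the tail, completions leave at the head."""

    def __init__(self) -> None:
        self._queue = deque()

    def post(self, data: bytes) -> Descriptor:
        desc = Descriptor(data)
        self._queue.append(desc)          # new tail descriptor
        return desc

    def poll(self) -> list:
        """Check only the head descriptor; drain while it is complete."""
        completed = []
        while self._queue and self._queue[0].done:
            completed.append(self._queue.popleft().data)
        return completed
```

Because only the head descriptor is inspected, a poll costs one flag check when nothing has completed, and completions are delivered in posting order without any interrupt.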
Abstract:
A channel-based network is provided that allows one or more hosts to communicate with one or more remote fabric-attached I/O units. A split-model network driver includes a host module driver and an I/O unit module driver. The host module driver and the I/O unit module driver each include a messaging layer that allows the hosts and I/O units to communicate over the switched fabric using a push-push messaging protocol. For a host to send data, the host either initiates an RDMA write to a pre-registered buffer or initiates a message Send to a pre-posted buffer on the target. For the RDMA case, the initiator has to send the target some form of transfer indication specifying where the data has been written. This notification can be done with either a separate message or, more preferably, with immediate data that is included with the RDMA write.
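The RDMA-write-with-immediate case can be sketched as follows. The abstract does not say what the immediate data contains, so this example assumes a 32-bit value packing a 16-bit offset and a 16-bit length; `Target`, `pack_imm`, and `unpack_imm` are hypothetical names, and no real verbs API is used.

```python
def pack_imm(offset: int, length: int) -> int:
    """Pack offset and length into one 32-bit value (assumed encoding)."""
    if not (0 <= offset <= 0xFFFF and 0 <= length <= 0xFFFF):
        raise ValueError("offset and length must each fit in 16 bits")
    return (offset << 16) | length


def unpack_imm(imm: int) -> tuple:
    return imm >> 16, imm & 0xFFFF


class Target:
    """Hypothetical target holding one pre-registered buffer."""

    def __init__(self, size: int) -> None:
        self.buffer = bytearray(size)
        self.notifications = []

    def rdma_write_with_imm(self, offset: int, payload: bytes) -> None:
        if offset + len(payload) > len(self.buffer):
            raise ValueError("write exceeds registered buffer")
        self.buffer[offset:offset + len(payload)] = payload
        # The immediate value tells the target where the data landed.
        self.notifications.append(pack_imm(offset, len(payload)))

    def read_notified(self) -> bytes:
        offset, length = unpack_imm(self.notifications.pop(0))
        return bytes(self.buffer[offset:offset + length])
```

Carrying the location in the same operation as the write avoids the extra round of a separate notification message, which matches the preference stated in the abstract.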
Abstract:
Methods, apparatus, and software for optimizing network data flows within constrained systems. The methods enable data to be transferred between PCIe cards in multi-socket server platforms, each platform including a local socket having an InfiniBand (IB) HCA and a remote socket. Data to be transmitted outbound from a platform is transferred from a PCIe card to the platform's IB HCA via a proxied datapath. Data received at a platform may employ a direct PCIe peer-to-peer (P2P) transfer if the destined PCIe card is installed in the local socket or a proxied datapath if the destined PCIe card is installed in a remote socket. Outbound transfers from a PCIe card in a local socket to the platform's IB HCA may selectively be transferred using either a proxied datapath for larger data transfers or a direct P2P datapath for smaller data transfers. The software is configured to support each of local-local, remote-local, local-remote, and remote-remote data transfers in a manner that is transparent to the software applications generating and receiving the data.
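The datapath selection rules above can be summarised in a short sketch. The abstract does not state the size cut-off between "smaller" and "larger" transfers, so `SIZE_THRESHOLD` is a made-up value, and the names `Path`, `inbound_path`, and `outbound_path` are hypothetical.

```python
from enum import Enum


class Path(Enum):
    DIRECT_P2P = "direct_p2p"
    PROXIED = "proxied"


SIZE_THRESHOLD = 64 * 1024  # hypothetical cut-off in bytes


def inbound_path(card_on_local_socket: bool) -> Path:
    """Received data: direct P2P to a local-socket card, otherwise proxied."""
    return Path.DIRECT_P2P if card_on_local_socket else Path.PROXIED


def outbound_path(card_on_local_socket: bool, nbytes: int) -> Path:
    """Outbound data: remote-socket cards are proxied; local-socket cards
    use direct P2P for small transfers and the proxy for large ones."""
    if not card_on_local_socket:
        return Path.PROXIED
    return Path.PROXIED if nbytes >= SIZE_THRESHOLD else Path.DIRECT_P2P
```

Keeping the choice inside two small functions reflects the abstract's claim that the selection is transparent to applications: callers supply only the card location and transfer size.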
Abstract:
In an example embodiment, a method of reading data from a remote device transfers data directly from the remote memory of the remote device to the local memory of the local device. A message is sent from the local device to the remote device which includes a transport header indicating the message type of the message. The remote device processes the message to determine whether or not the transport header identifies the message as a type of remote Direct Memory Access (rDMA) read operation. If it does, the remote device performs an rDMA write operation to the local device in accordance with data elements included in the message.
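The read-as-remote-write idea can be sketched as a message handler on the remote device. The abstract does not list the data elements carried in the message, so the fields of `Message` (`remote_offset`, `length`, `local_offset`) and the type label `RDMA_READ` are assumptions, and plain byte arrays stand in for the two memories.

```python
from dataclasses import dataclass

RDMA_READ = "rdma_read"  # hypothetical message-type label


@dataclass
class Message:
    msg_type: str        # taken from the transport header
    remote_offset: int   # where to read in the remote device's memory
    length: int          # number of bytes requested
    local_offset: int    # where to place the data in the requester's memory


def handle_message(msg: Message, remote_memory: bytes,
                   local_memory: bytearray) -> bool:
    """Run on the remote device: serve a read request by writing the data back."""
    if msg.msg_type != RDMA_READ:
        return False     # other message types are not handled in this sketch
    chunk = remote_memory[msg.remote_offset:msg.remote_offset + msg.length]
    if msg.local_offset + len(chunk) > len(local_memory):
        raise ValueError("write exceeds local memory")
    # The "read" is completed by a write into the requester's memory.
    local_memory[msg.local_offset:msg.local_offset + len(chunk)] = chunk
    return True
```

In this model the requester never pulls data itself; it only sends the request and later finds the bytes in its own memory.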