Abstract:
Examples described herein relate to technologies for reliable packet transmission. In some examples, a network interface includes circuitry to: receive a request to transmit a packet to a destination device, select a path for the packet, provide a path identifier identifying one of multiple paths from the network interface to a destination and Path Sequence Number (PSN) for the packet, wherein the PSN is to identify a packet transmission order over the selected path, include the PSN in the packet, and transmit the packet. In some examples, if the packet is a re-transmit of a previously transmitted packet, the circuitry is to: select a path for the re-transmit packet, and set a PSN of the re-transmit packet that is a current packet transmission number for the selected path for the re-transmit packet. In some examples, a network interface includes circuitry to process a received packet to at least determine a Path Sequence Number (PSN) for the received packet, wherein the PSN is to provide an order of packet transmissions for a path associated with the received packet, process a second received packet to at least determine its PSN, and based on the PSN of the second received packet not being a next sequential value after the PSN of the received packet, cause transmission of a re-transmit request to a sender of the packet and the second packet.
Abstract:
Examples described herein relate to a first graphics processing unit (GPU) with at least one integrated communications system, wherein the at least one integrated communications system is to apply a reliability protocol to communicate with a second at least one integrated communications system associated with a second GPU to copy data from a first memory region to a second memory region and wherein the first memory region is associated with the first GPU and the second memory region is associated with the second GPU.
Abstract:
Method, apparatus, and systems for implementing Quality of Service (QoS) within high performance fabrics. A multi-level QoS scheme is implemented including virtual fabrics, Traffic Classes, Service Levels (SLs), Service Channels (SCs) and Virtual Lanes (VLs). SLs are implemented for Layer 4 (Transport Layer) end-to-end transfer of fabric packets, while SCs are used to differentiate fabric packets at the Link Layer. Fabric packets are divided into flits, with fabric packet data transmitted via fabric links as flits streams. Fabric switch input ports and device receive ports detect SC IDs for received fabric packets and implement SC-to-VL mappings to determine VL buffers to buffer fabric packet flits in. An SL may have multiple SCs, and SC-to-SC mapping may be implemented to change the SC for a fabric packet as it is forwarded through the fabric, while maintaining its SL. A Traffic Class may include multiple SLs, enabling request and response traffic for an application to employ separate SLs.
Abstract:
Examples described herein relate to receiving, at a network interface, an allocation of a first group of one or more buffers to store data to be processed by a Message Passing Interface (MPI) and based on a received packet including an indicator that permits the network interface to select a buffer for the received packet and store the received packet in the selected buffer, the network interface storing a portion of the received packet in a buffer of the first group of the one or more buffers. The indicator can permit the network interface to select a buffer for the received packet and store the received packet in the selected buffer irrespective of a tag and sender associated with the received packet. In some examples, based on a received packet including an indicator that does not permit storage of the received packet in a buffer irrespective of a tag and source associated with the second received packet, the network interface is to store a portion of the second received packet in a buffer of the second group of one or more buffers, wherein the buffer of the second group of one or more buffers corresponds to a tag and source associated with the second received packet.
Abstract:
Method, apparatus, and systems for implementing flexible credit exchange within high performance fabrics. Available buffer space in a receive buffer on a receive-side of a link is managed and tracked at the transmit-side of the link using credits. Peer link interfaces coupled via a link are provided with receive buffer configuration information that specifies how the receive buffer space in each peer is partitioned and space allocated for each buffer, including a plurality of virtual lane (VL) buffers. Credits are used for tracking buffer space consumption and in credits are returned from the receive-side indicating freed buffer space. The peer link interfaces exchange credit organization information to inform the other peer of how much space each credit represents. In connection with data transfer over the link, the transmit-side de-allocates credits based on an amount of buffer space to be consumed in applicable buffers in the receive buffer. Upon space being freed in the receive buffer, the receive-side returns credit ACKnowledgements (ACKs) identifying a VL for which space has been freed.
Abstract:
Method, apparatus, and systems for implementing Quality of Service (QoS) within high performance fabrics. A multi-level QoS scheme is implemented including virtual fabrics, Traffic Classes, Service Levels (SLs), Service Channels (SCs) and Virtual Lanes (VLs). SLs are implemented for Layer 4 (Transport Layer) end-to-end transfer of fabric packets, while SCs are used to differentiate fabric packets at the Link Layer. Fabric packets are divided into flits, with fabric packet data transmitted via fabric links as flits streams. Fabric switch input ports and device receive ports detect SC IDs for received fabric packets and implement SC-to-VL mappings to determine VL buffers to buffer fabric packet flits in. An SL may have multiple SCs, and SC-to-SC mapping may be implemented to change the SC for a fabric packet as it is forwarded through the fabric, while maintaining its SL. A Traffic Class may include multiple SLs, enabling request and response traffic for an application to employ separate SLs.
Abstract:
Method, apparatus, and systems for reliably transferring Ethernet packet data over a link layer and facilitating fabric-to-Ethernet and Ethernet-to-fabric gateway operations at matching wire speed and packet data rate. Ethernet header and payload data is extracted from Ethernet frames received at the gateway and encapsulated in fabric packets to be forwarded to a fabric endpoint hosting an entity to which the Ethernet packet is addressed. The fabric packets are divided into flits, which are bundled in groups to form link packets that are transferred over the fabric at the Link layer using a reliable transmission scheme employing implicit ACKnowledgements. At the endpoint, the fabric packet is regenerated, and the Ethernet packet data is de-encapsulated. The Ethernet frames received from and transmitted to an Ethernet network are encoded using 64b/66b encoding, having an overhead-to-data bit ratio of 1:32. Meanwhile, the link packets have the same ratio, including one overhead bit per flit and a 14-bit CRC plus a 2-bit credit return field or sideband used for credit-based flow control.
Abstract:
Methods, apparatus, and systems for implementing hierarchical and lossless packet preemption and interleaving to reduce latency jitter in flow-controller packet-based networks. Fabric packets are divided into a plurality of data units, with data units for different fabric packets buffered in separate buffers. Data units are pulled from the buffers and added to a transmit stream in which groups of data units are interleaved. Upon receipt by a receiver, the groups of data units are separated out and buffered in separate buffers under which data units for the same fabric packets are grouped together. In one aspect, each buffer is associated with a respective virtual lane (VL), and the fabric packets are effectively transferred over fabric links using virtual lanes. VLs may have different levels of priority under which data units for fabric packets in higher-priority VLs may preempt fabric packets in lower-priority VLs. By transferring data units rather than entire packets, transmission of a packet can be temporarily paused in favor of a higher-priority packet. Multiple levels of preemption and interleaving in a nested manner are supported.
Abstract:
System, method, and apparatus for improving the performance of collective operations in High Performance Computing (HPC). Compute nodes in a networked HPC environment form collective groups to perform collective operations. A spanning tree is formed including the compute nodes and switches and links used to interconnect the compute nodes, wherein the spanning tree is configured such that there is only a single route between any pair of nodes in the tree. The compute nodes implement processes for performing the collective operations, which includes exchanging messages between processes executing on other compute nodes, wherein the messages contain indicia identifying collective operations they belong to. Each switch is configured to implement message forwarding operations for its portion of the spanning tree. Each of the nodes in the spanning tree implements a ratcheted cyclical state machine that is used for synchronizing collective operations, along with status messages that are exchanged between nodes. Transaction IDs are also used to detect out-of-order and lost messages.
Abstract:
Method, apparatus, and systems for reliably transferring Ethernet packet data over a link layer and facilitating fabric-to-Ethernet and Ethernet-to-fabric gateway operations at matching wire speed and packet data rate. Ethernet header and payload data is extracted from Ethernet frames received at the gateway and encapsulated in fabric packets to be forwarded to a fabric endpoint hosting an entity to which the Ethernet packet is addressed. The fabric packets are divided into flits, which are bundled in groups to form link packets that are transferred over the fabric at the Link layer using a reliable transmission scheme employing implicit ACKnowledgements. At the endpoint, the fabric packet is regenerated, and the Ethernet packet data is de-encapsulated. The Ethernet frames received from and transmitted to an Ethernet network are encoded using 64b/66b encoding, having an overhead-to-data bit ratio of 1:32. Meanwhile, the link packets have the same ratio, including one overhead bit per flit and a 14-bit CRC plus a 2-bit credit return field or sideband used for credit-based flow control.