Abstract:
An island-based network flow processor (IB-NFP) integrated circuit includes islands organized in rows. A configurable mesh event bus extends through the islands and is configured to form one or more local event rings and a global event chain. The configurable mesh event bus is configured with configuration information received via a configurable mesh control bus. Each local event ring involves event ring circuits and event ring segments. In one example, an event packet being communicated along a local event ring reaches an event ring circuit. The event ring circuit examines the event packet and determines whether it meets a programmable criterion. If the event packet meets the criterion, then the event packet is inserted into the global event chain. The global event chain communicates the event packet to a global event manager that logs events and maintains statistics and other information.
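The filtering step described above can be sketched in a few lines. This is only an illustrative model, not the circuit itself: the `EventPacket` fields, the class names, and the choice of a Python callable as the "programmable criterion" are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EventPacket:
    source: int      # hypothetical field: id of the originating island
    event_type: int  # hypothetical field: kind of event


class GlobalEventManager:
    """Stands in for the end of the global event chain: logs and counts events."""

    def __init__(self) -> None:
        self.log = []
        self.stats = Counter()

    def receive(self, packet: EventPacket) -> None:
        self.log.append(packet)
        self.stats[packet.event_type] += 1


class EventRingCircuit:
    """Examines packets on a local ring and forwards the ones that match."""

    def __init__(self, criterion: Callable[[EventPacket], bool],
                 manager: GlobalEventManager) -> None:
        self.criterion = criterion  # the programmable criterion (assumed form)
        self.manager = manager

    def handle(self, packet: EventPacket) -> bool:
        # Return True if the packet was inserted into the global event chain.
        if self.criterion(packet):
            self.manager.receive(packet)
            return True
        return False
```

With a criterion such as `lambda p: p.event_type == 7`, only packets of that type reach the manager; all others stay local.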
Abstract:
A reconfigurable, scalable and flexible island-based network flow processor integrated circuit architecture includes a plurality of rectangular islands of identical shape and size. The islands are disposed in rows, and a configurable mesh command/push/pull data bus extends through all the islands. The integrated circuit includes first SerDes I/O blocks, an ingress MAC island that converts incoming symbols into packets, an ingress NBI island that analyzes packets and generates ingress packet descriptors, a microengine (ME) island that receives ingress packet descriptors and headers from the ingress NBI and analyzes the headers, a memory unit (MU) island that receives payloads from the ingress NBI and performs lookup operations and stores payloads, an egress NBI island that receives the header portions and the payload portions and egress descriptors and performs egress scheduling, and an egress MAC island that outputs packets to second SerDes I/O blocks.
Abstract:
An island-based network flow processor (IB-NFP) integrated circuit includes islands organized in rows. A configurable mesh event bus extends through the islands and is configured to form a local event ring. The configurable mesh event bus is configured with configuration information received via a configurable mesh control bus. The local event ring involves event ring circuits and event ring segments. In one example, a packet is received onto a first island. If an amount of a processing resource (for example, memory buffer space) available to the first island is below a threshold, then an event packet is communicated from the first island to a second island via the local event ring. In response, the second island causes a third island to communicate via a command/push/pull data bus with the first island, thereby increasing the amount of the processing resource available to the first island for handling incoming packets.
Abstract:
An island-based integrated circuit includes a configurable mesh data bus. The data bus includes four meshes. Each mesh includes, for each island, a crossbar switch and radiating half links. The half links of adjacent islands align to form links between crossbar switches. A link is implemented as two distributed credit FIFOs. In one direction, a link portion involves a FIFO associated with an output port of a first island, a first chain of registers, and a second FIFO associated with an input port of a second island. When a transaction value passes through the FIFO and through the crossbar switch of the second island, an arbiter in the crossbar switch returns a taken signal. The taken signal passes back through a second chain of registers to a credit count circuit in the first island. The credit count circuit maintains a credit count value for the distributed credit FIFO.
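A rough software model of the credit mechanism may help. It assumes the credit count starts at the depth of the receiving FIFO, is decremented on each send, and is incremented when the taken signal returns; the abstract does not state the counting direction, so that is an assumption. The register chains are omitted, so the taken signal arrives with no delay here. `CreditLink` and its methods are hypothetical names.

```python
from collections import deque


class CreditLink:
    """One direction of a link, modelled as a credit-controlled FIFO."""

    def __init__(self, depth: int) -> None:
        self.credits = depth    # credit count kept on the sending island (assumed)
        self.rx_fifo = deque()  # FIFO at the input port of the receiving island

    def send(self, value) -> bool:
        # The sender may only transmit while it holds at least one credit.
        if self.credits == 0:
            return False
        self.credits -= 1
        self.rx_fifo.append(value)
        return True

    def take(self):
        # The receiving crossbar takes a value; the taken signal returns a credit.
        value = self.rx_fifo.popleft()
        self.credits += 1
        return value
```

In this model the sender never overruns the receiving FIFO, because a send is refused whenever all credits are outstanding.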
Abstract:
An island-based network flow processor (IB-NFP) integrated circuit has a high performance processor island. The processor island has a processor and a tightly coupled memory. The integrated circuit also has another memory. The other memory may be internal or external memory. The header of an incoming packet is stored in the tightly coupled memory of the processor island. The payload is stored in the other memory. In one example, if the amount of a processing resource is below a threshold, then the header is moved from the processor island to the other memory before the header and payload are communicated to an egress island for outputting from the integrated circuit. If, however, the amount of the processing resource is not below the threshold, then the header is moved directly from the processor island to the egress island and is combined with the payload there for outputting from the integrated circuit.
Abstract:
A network device includes a Network Interface Device (NID) and multiple servers. Each server is coupled to the NID via a corresponding PCIe bus. The NID has a network port through which it receives packets, each of which is destined for one of the servers. The NID detects a PCIe congestion condition regarding the PCIe bus to the destination server of a packet. Rather than transferring the packet across the bus, the NID buffers the packet and places a pointer to the packet in an overflow queue. If the level of bus congestion is high, the NID sets the packet's ECN-CE bit. When PCIe bus congestion subsides, the packet passes to the server. The server responds by returning an ACK whose ECE bit is set. The originating TCP endpoint in turn reduces the rate at which it sends data to the destination server, thereby reducing congestion at the PCIe bus interface within the network device.
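The buffering and marking behavior can be sketched as follows. This is a simplified model under stated assumptions: congestion is represented as an integer level where 0 means no congestion, packets are plain dictionaries, and `Nid`, `high_level`, `receive`, and `drain` are hypothetical names not taken from the source.

```python
from collections import deque


class Nid:
    """Toy model of the NID's handling of one PCIe bus to one server."""

    def __init__(self, high_level: int) -> None:
        self.high_level = high_level   # assumed threshold for "high" congestion
        self.buffers = {}              # pointer -> buffered packet
        self.overflow_queue = deque()  # pointers to buffered packets
        self.next_ptr = 0

    def receive(self, packet: dict, congestion_level: int):
        # No congestion: the packet crosses the bus immediately.
        if congestion_level == 0:
            return packet
        # High congestion: mark the packet so the receiver echoes ECE in its ACK.
        if congestion_level >= self.high_level:
            packet["ecn_ce"] = True
        # Buffer the packet and queue a pointer to it instead of transferring it.
        ptr = self.next_ptr
        self.next_ptr += 1
        self.buffers[ptr] = packet
        self.overflow_queue.append(ptr)
        return None

    def drain(self) -> list:
        # Congestion has subsided: release buffered packets in arrival order.
        released = []
        while self.overflow_queue:
            released.append(self.buffers.pop(self.overflow_queue.popleft()))
        return released
```

The TCP-side reaction (the ACK with ECE set and the sender slowing down) happens outside the NID and is not modelled here.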
Abstract:
A pipelined run-to-completion processor includes no instruction counter and fetches instructions only in one of two cases: as a result of being prompted from the outside by an input data value and/or an initial fetch information value, or as a result of execution of a fetch instruction. Initially the processor is not clocking. An incoming value kick-starts the processor to start clocking and to fetch a block of instructions from a section of code in a table. The input data value and/or the initial fetch information value determines the section and table from which the block is fetched. A LUT converts a table number in the initial fetch information value into a base address where the table is found. Fetch instructions at the ends of sections of code cause program execution to jump from section to section. A finished instruction causes an output data value to be output and stops clocking of the processor.
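The control flow described above can be illustrated with a small interpreter. It is a sketch only: the instruction encoding (`"add"`, `"fetch"`, `"finished"`), the way a section offset is added to the table base address, and the function name `run` are assumptions for the example, and pipelining and clock gating are not modelled.

```python
def run(memory: dict, lut: dict, table_number: int, section: int, data: int) -> int:
    """Execute blocks of instructions until a finished instruction is reached.

    memory maps an address to a block (list) of (opcode, argument) pairs.
    lut maps a table number to the base address of that table.
    """
    # The initial fetch information selects the table and section to start from.
    address = lut[table_number] + section
    while True:
        for opcode, arg in memory[address]:
            if opcode == "add":         # hypothetical data-processing instruction
                data += arg
            elif opcode == "fetch":     # jump to another section of code
                next_table, next_section = arg
                address = lut[next_table] + next_section
                break
            elif opcode == "finished":  # output the value and stop
                return data
            else:
                raise ValueError(f"unknown opcode: {opcode}")
        else:
            # There is no instruction counter, so a block must end explicitly.
            raise RuntimeError("block ended without fetch or finished")
```

Note that nothing advances automatically from one block to the next: execution continues only when a block ends in a fetch instruction, which mirrors the absence of an instruction counter.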
Abstract:
A VIRTIO Relay Program allows packets to be transferred from a Network Interface Device (NID), across a PCIe bus to a host, and to a virtual machine executing on the host. Rather than an OvS switch subsystem of the host making packet switching decisions, switching rules are transferred to the NID and the NID makes packet switching decisions. Transfer of a packet from the NID to the host occurs across an SR-IOV compliant PCIe virtual function and into host memory. Transfer from that memory and into memory space of the virtual machine is a VIRTIO transfer. This relaying of the packet occurs in no more than two read/write transfers without the host making any packet steering decision based on any packet header. Packet counts/statistics for the switched flow are maintained by the OvS switch subsystem just as if it were the subsystem that had performed the packet switching.
Abstract:
A flow of packets is communicated through a data center. The data center includes multiple racks, where each rack includes multiple network devices. A group of packets of the flow is received onto an integrated circuit located in a first network device. The integrated circuit includes a neural network. The neural network analyzes the group of packets and in response outputs a neural network output value. The neural network output value is used to determine how the packets of the flow are to be output from a second network device. In one example, each packet of the flow output by the first network device is output along with a tag. The tag is indicative of the neural network output value. The second device uses the tag to determine which output port located on the second device is to be used to output each of the packets.
Abstract:
A dispatcher circuit receives sets of instructions from an instructing entity. Instructions of a first type in a set are put into a first queue circuit, instructions of a second type in the set are put into a second queue circuit, and so forth. The first queue circuit dispatches its instructions of the first type to one or more processing engines and records when those instructions are completed. When all the instructions of the first type in the set have been completed, the first queue circuit sends the second queue circuit a go signal, which causes the second queue circuit to dispatch its instructions of the second type and to record when they have been completed. This process proceeds from queue circuit to queue circuit. When all the instructions of the set have been completed, the dispatcher circuit returns an “instructions done” indication to the original instructing entity.
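The queue-to-queue ordering can be sketched sequentially. This is an illustrative model under assumptions: instructions are `(type, payload)` tuples, the order of types is supplied by the caller, and processing engines are represented by a single `execute` callable that completes each instruction before returning. The name `dispatch` is hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Instruction = Tuple[str, str]  # (type, payload); an assumed encoding


def dispatch(instruction_set: Iterable[Instruction],
             type_order: List[str],
             execute: Callable[[Instruction], None]) -> str:
    """Run one set of instructions, one type at a time, in type_order."""
    instructions = list(instruction_set)
    # Sort the set into one queue per instruction type.
    queues = {t: [i for i in instructions if i[0] == t] for t in type_order}
    for t in type_order:
        # A queue only starts after the previous queue has finished its part
        # of the set; moving to the next loop iteration plays the go signal.
        for instruction in queues[t]:
            execute(instruction)
    return "instructions done"
```

In hardware the instructions within one queue could run on several engines concurrently; the sketch only preserves the ordering between types.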