Abstract:
A Software-Defined Networking (SDN) switch includes external network ports for receiving external network traffic onto the SDN switch, external network ports for transmitting external network traffic out of the SDN switch, a first Network Flow Switch (NFX) integrated circuit that has multiple network ports and that maintains a first flow table, a second Network Flow Switch (NFX) integrated circuit that has multiple network ports and that maintains a second flow table, a Network Flow Processor (NFP) circuit that maintains a third flow table, and a controller processor circuit that maintains a fourth flow table. The controller processor circuit is coupled by a serial bus to the NFP circuit but is not directly coupled by any network port to the NFP circuit, the first NFX integrated circuit, or the second NFX integrated circuit.
Abstract:
A method involves a Software-Defined Networking (SDN) switch that includes multiple Network Flow Switch (NFX) integrated circuits, a Network Flow Processor (NFP) circuit, and a controller processor. The controller processor is coupled to the NFP circuit by a serial bus. A flow table is maintained on each of the NFX integrated circuits. An SDN flow table is maintained on the NFP circuit. A copy of each of the flow tables is maintained on the NFP circuit. Another SDN flow table is maintained on the controller processor. An SDN protocol stack is executed on the controller processor. An SDN protocol message is received onto the SDN switch via one of the NFX integrated circuits. The SDN protocol message is communicated across a network link to the NFP circuit, and across the serial bus from the NFP circuit to the controller processor such that the SDN protocol message is received and processed by the SDN protocol stack executing on the controller processor.
Abstract:
An island-based integrated circuit includes a configurable mesh data bus. The data bus includes four meshes. Each mesh includes, for each island, a crossbar switch and radiating half links. The half links of adjacent islands align to form links between crossbar switches. A link is implemented as two distributed credit FIFOs. In one direction, a link portion involves a FIFO associated with an output port of a first island, a first chain of registers, and a second FIFO associated with an input port of a second island. When a transaction value passes through the FIFO and through the crossbar switch of the second island, an arbiter in the crossbar switch returns a taken signal. The taken signal passes back through a second chain of registers to a credit count circuit in the first island. The credit count circuit maintains a credit count value for the distributed credit FIFO.
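
The credit accounting described above can be illustrated with a minimal Python sketch. It models only one direction of one link, and it assumes that the credit count starts at the receiving FIFO's depth and that the register chains add no latency. Neither assumption comes from the abstract, and the class and method names are hypothetical.

```python
from collections import deque


class CreditLink:
    """One direction of a link: a sender-side credit count plus a receiver-side FIFO."""

    def __init__(self, depth):
        self.credits = depth      # assumed initial credit count: one credit per FIFO slot
        self.rx_fifo = deque()    # FIFO at the input port of the second island

    def send(self, value):
        """Send a transaction value if a credit is available; return True on success."""
        if self.credits == 0:
            return False          # no credit, so the sender must hold the value
        self.credits -= 1
        self.rx_fifo.append(value)
        return True

    def take(self):
        """The receiving crossbar takes one value; the 'taken' signal restores a credit."""
        value = self.rx_fifo.popleft()
        self.credits += 1         # models the taken signal reaching the credit count circuit
        return value
```

With a depth of 2, a third `send` is refused until `take` has been called once, which is the back-pressure behavior a credit count provides.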
Abstract:
An exact-match flow table structure of an integrated circuit stores flow entries. Each flow entry includes a Flow Id and an action value. Each Flow Id is a multi-bit digital value that uniquely identifies a flow. A Flow Id does not include any wildcard indicator. The flow table structure cannot and does not store an indicator that any particular part of a packet should be matched against any part of a Flow Id. In one example, a packet is received onto the integrated circuit. A Flow Id is generated from the packet. If the flow table structure determines that the Flow Id is a bit-by-bit exact match of any Flow Id of any stored flow entry, then the packet is handled according to the action value of the flow entry. If, on the other hand, there is no exact match, then a miss indication is output from the integrated circuit.
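
As a rough software analogy, an exact-match table behaves like a dictionary keyed on the full Flow Id, with no masks or wildcards. The sketch below is illustrative only: the Flow Id layout produced by `make_flow_id` (addresses followed by ports) and all names are assumptions, not the format used by the integrated circuit.

```python
class ExactMatchFlowTable:
    """Stores (Flow Id -> action value) pairs and matches only on the whole Flow Id."""

    def __init__(self):
        self._entries = {}

    def add(self, flow_id: bytes, action):
        self._entries[flow_id] = action

    def lookup(self, flow_id: bytes):
        """Return (True, action) on a bit-for-bit match, otherwise (False, None) as a miss."""
        if flow_id in self._entries:
            return True, self._entries[flow_id]
        return False, None


def make_flow_id(src_ip: bytes, dst_ip: bytes, src_port: int, dst_port: int) -> bytes:
    """Hypothetical Flow Id layout: concatenated addresses and 16-bit ports."""
    return src_ip + dst_ip + src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
```

A packet whose generated Flow Id differs in even one bit from every stored Flow Id produces a miss, because there is no way to mark any field as "don't care".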
Abstract:
A multi-processor includes a pool of processors and a common packet buffer memory. Bytes of packet data of a packet are stored in the packet buffer memory. Each of the processors has an intelligent packet data register file. One processor is tasked with processing the packet data, and its packet data register file caches a subset of the bytes of packet data. Some instructions when executed require that the packet data register file supply the execute stage of the processor with certain bytes of the packet data. If during instruction execution the intelligent packet data register file determines that it does not store some of the necessary bytes, then the register file asserts a stall signal thereby stalling the processor, and retrieves the bytes from the packet buffer memory, and then supplies the retrieved bytes to the execute stage, and de-asserts the stall signal to unstall the processor.
Abstract:
An egress packet modifier includes a script parser and a pipeline of processing stages. Rather than performing egress modifications using a processor that fetches and decodes and executes instructions in a classic processor fashion, and rather than storing a packet in memory and reading it out and modifying it and writing it back, the packet modifier pipeline processes the packet by passing parts of the packet through the pipeline. A processor identifies particular egress modifications to be performed by placing a script code at the beginning of the packet. The script parser then uses the code to identify a specific script of opcodes, where each opcode defines a modification. As a part passes through a stage, the stage can carry out the modification of such an opcode. As realized using a current semiconductor fabrication process, the packet modifier can modify 200M packets/second at a sustained rate of up to 100 gigabits/second.
Abstract:
A transactional memory (TM) receives an Atomic Look-up, Add and Lock (ALAL) command across a bus from a client. The command includes a first value. The TM pulls a second value. The TM uses the first value to read a set of memory locations, and determines if any of the locations contains the second value. If no location contains the second value, then the TM locks a vacant location, adds the second value to the vacant location, and sends a result to the client. If a location contains the second value and it is not locked, then the TM locks the location and returns a result to the client. If a location contains the second value and it is locked, then the TM returns a result to the client. Each location has an associated data structure. Setting the lock field of a location locks access to its associated data structure.
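
The three outcomes of the ALAL command can be sketched in Python as follows. This is a simplified model under stated assumptions: the first value is treated as a bucket index, the second value as a key, and a full bucket yields a `NO_VACANCY` result that the abstract does not describe. Atomicity, which the transactional memory provides in hardware, is not modeled, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ADDED_AND_LOCKED = "added_and_locked"
FOUND_AND_LOCKED = "found_and_locked"
FOUND_ALREADY_LOCKED = "found_already_locked"
NO_VACANCY = "no_vacancy"  # assumed extra result; not described in the abstract


@dataclass
class Location:
    key: Optional[int] = None  # None marks a vacant location
    locked: bool = False       # lock field guarding the associated data structure


def alal(memory, first_value, second_value):
    """Look up second_value in the set of locations selected by first_value."""
    bucket = memory[first_value]
    for index, loc in enumerate(bucket):
        if loc.key == second_value:
            if loc.locked:
                return FOUND_ALREADY_LOCKED, index
            loc.locked = True
            return FOUND_AND_LOCKED, index
    for index, loc in enumerate(bucket):
        if loc.key is None:
            loc.key = second_value  # add the value to the vacant location
            loc.locked = True       # and lock it in the same operation
            return ADDED_AND_LOCKED, index
    return NO_VACANCY, None
```

The result tells the client whether it now owns the lock on the location, and therefore whether it may access the associated data structure.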
Abstract:
A method of Software-Defined Networking (SDN) switching. A packet of a flow is received onto an SDN switch via an NFX circuit. The NFX circuit determines that the packet matches a flow entry stored in a flow table in the NFX circuit, counts the number of packets of the flow received, and determines that the number of packets of the flow received is above a threshold value. The NFX circuit then forwards the packet to an NFP circuit in the SDN switch. The NFP circuit determines that the packet matches a flow entry stored in the flow table in the NFX circuit and generates a new flow entry that applies to a relatively narrow subflow of packets. The new flow entry is forwarded to and stored in the flow table in the NFX circuit. A subsequent packet of the flow is switched by the SDN switch without forwarding the packet to the NFP circuit.
Abstract:
In response to receiving a novel “Return Available PPI Credits” command from a credit-aware device, a packet engine sends a “Credit To Be Returned” (CTBR) value it maintains for that device back to the credit-aware device, and zeroes out its stored CTBR value. The credit-aware device adds the credits returned to a “Credits Available” value it maintains. The credit-aware device uses the “Credits Available” value to determine whether it can issue a PPI allocation request. The “Return Available PPI Credits” command does not result in any PPI allocation or de-allocation. In another novel aspect, the credit-aware device is permitted to issue one PPI allocation request to the packet engine when its recorded “Credits Available” value is zero or negative. If the PPI allocation request cannot be granted, then it is buffered in the packet engine, and is resubmitted within the packet engine, until the packet engine makes the PPI allocation.
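
The credit return exchange can be sketched in Python as below. The sketch covers only the "Return Available PPI Credits" command and the "Credits Available" bookkeeping. How the packet engine accumulates its CTBR value is not stated in the abstract, so `accrue_credits` is a hypothetical stand-in, as are all class and method names. The rule permitting one request at zero or negative credits is not modeled.

```python
class PacketEngine:
    """Keeps one 'Credit To Be Returned' (CTBR) value per credit-aware device."""

    def __init__(self):
        self.ctbr = {}

    def accrue_credits(self, device_id, credits):
        # Hypothetical: some internal event makes credits returnable to the device.
        self.ctbr[device_id] = self.ctbr.get(device_id, 0) + credits

    def return_available_ppi_credits(self, device_id):
        """Send back the stored CTBR value and zero it; no PPI is allocated or freed."""
        credits = self.ctbr.get(device_id, 0)
        self.ctbr[device_id] = 0
        return credits


class CreditAwareDevice:
    def __init__(self, device_id, credits_available=0):
        self.device_id = device_id
        self.credits_available = credits_available

    def poll_credits(self, engine):
        """Issue the command and add the returned credits to 'Credits Available'."""
        self.credits_available += engine.return_available_ppi_credits(self.device_id)

    def has_credit(self):
        return self.credits_available > 0
```

Because the packet engine zeroes its CTBR value as part of the same command, polling twice in a row cannot double-count credits.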
Abstract:
A processor includes a hash register and a hash generating circuit. The hash generating circuit includes a novel programmable nonlinearizing function circuit as well as a modulo-2 multiplier, a first modulo-2 summer, a modulo-2 divider, and a second modulo-2 summer. The nonlinearizing function circuit receives a hash value from the hash register and performs a programmable nonlinearizing function, thereby generating a modified version of the hash value. In one example, the nonlinearizing function circuit includes a plurality of separately enableable S-box circuits. The multiplier multiplies the input data by a programmable multiplier value, thereby generating a product value. The first summer sums a first portion of the product value with the modified hash value. The divider divides the resulting sum by a fixed divisor value, thereby generating a remainder value. The second summer sums the remainder value and the second portion of the input data, thereby generating a hash result.
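
For readers unfamiliar with modulo-2 arithmetic, the Python sketch below shows the primitive operations on integers treated as polynomials over GF(2). It is not the circuit's data path: bit widths, the S-box function, and how the portions are split are not given in the abstract and are not modeled. The function names are illustrative. Modulo-2 summation is simply bitwise XOR (`^`).

```python
def clmul(a: int, b: int) -> int:
    """Modulo-2 (carry-less) multiply: partial products are combined with XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(value: int, divisor: int) -> int:
    """Remainder of modulo-2 (polynomial) division of value by a nonzero divisor."""
    if divisor == 0:
        raise ValueError("divisor must be nonzero")
    dlen = divisor.bit_length()
    while value.bit_length() >= dlen:
        # Cancel the current top bit by XORing in the aligned divisor.
        value ^= divisor << (value.bit_length() - dlen)
    return value
```

For example, `clmul(0b11, 0b11)` gives `0b101`, since (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2, and dividing that product by `0b11` leaves a remainder of 0.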