Abstract:
Within a networking device, packet portions from multiple PDRSDs (Packet Data Receiving and Splitting Devices) are loaded into a single memory, so that the packet portions can later be processed by a processing device. Rather than the PDRSDs managing the storing of packet portions into the memory, a packet engine is provided. The PDRSDs use a PPI addressing mode in communicating with the packet engine and in instructing the packet engine to store packet portions. A PDRSD requests a PPI from the packet engine, and is allocated a PPI by the packet engine, and then tags the packet portion to be written with the PPI and sends the packet portion and the PPI to the packet engine. Once the packet portion has been processed, a PPI de-allocation command causes the packet engine to de-allocate the PPI so that the PPI is available for allocating in association with another packet portion.
Abstract:
A processor includes a hash register and a hash generating circuit. The hash generating circuit includes a novel programmable nonlinearizing function circuit as well as a modulo-2 multiplier, a first modulo-2 summer, a modulor-2 divider, and a second modulo-2 summer. The nonlinearizing function circuit receives a hash value from the hash register and performs a programmable nonlinearizing function, thereby generating a modified version of the hash value. In one example, the nonlinearizing function circuit includes a plurality of separately enableable S-box circuits. The multiplier multiplies the input data by a programmable multiplier value, thereby generating a product value. The first summer sums a first portion of the product value with the modified hash value. The divider divides the resulting sum by a fixed divisor value, thereby generating a remainder value. The second summer sums the remainder value and the second portion of the input data, thereby generating a hash result.
Abstract:
A pipelined run-to-completion processor executes a conditional skip instruction. If a predicate condition as specified by a predicate code field of the skip instruction is true, then the skip instruction causes execution of a number of instructions following the skip instruction to be “skipped”. The number of instructions to be skipped is specified by a skip count field of the skip instruction. In some examples, the skip instruction includes a “flag don't touch” bit. If this bit is set, then neither the skip instruction nor any of the skipped instructions can change the values of the flags. Both the skip instruction and following instructions to be skipped are decoded one by one in sequence and pass through the processor pipeline, but the execution stage is prevented from carrying out the instruction operation of a following instruction if the predicate condition of the skip instruction was true.
Abstract:
A pipelined run-to-completion processor includes no instruction counter and only fetches instructions either: as a result of being prompted from the outside by an input data value and/or an initial fetch information value, or as a result of execution of a fetch instruction. Initially the processor is not clocking. An incoming value kick-starts the processor to start clocking and to fetch a block of instructions from a section of code in a table. The input data value and/or the initial fetch information value determines the section and table from which the block is fetched. A LUT converts a table number in the initial fetch information value into a base address where the table is found. Fetch instructions at the ends of sections of code cause program execution to jump from section to section. A finished instruction causes an output data value to be output and stops clocking of the processor.
Abstract:
A pipelined run-to-completion processor includes no instruction counter and only fetches instructions either: as a result of being prompted from the outside by an input data value and/or an initial fetch information value, or as a result of execution of a fetch instruction. Initially the processor is not clocking. An incoming value kick-starts the processor to start clocking and to fetch a block of instructions from a section of code in a table. The input data value and/or the initial fetch information value determines the section and table from which the block is fetched. A LUT converts a table number in the initial fetch information value into a base address where the table is found. Fetch instructions at the ends of sections of code cause program execution to jump from section to section. A finished instruction causes an output data value to be output and stops clocking of the processor.
Abstract:
A Network Interface Device (NID) of a web hosting server implements multiple virtual NIDs. A virtual NID is configured by configuration information in an appropriate one of a set of smaller blocks in a high-speed memory on the NID. There is a smaller block for each virtual NID. A virtual machine on the host can configure its virtual NID by writing configuration information into a larger block in PCIe address space. Circuitry on the NID detects that the PCIe write is into address space occupied by the larger blocks. If the write is into this space, then address translation circuitry converts the PCIe address into a smaller address that maps to the appropriate one of the smaller blocks associated with the virtual NID to be configured. If the PCIe write is detected not to be an access of a larger block, then the NID does not perform the address translation.
Abstract:
A transactional memory (TM) includes a control circuit pipeline and an associated memory unit. The memory unit stores a plurality of rings. The pipeline maintains, for each ring, a head pointer and a tail pointer. A ring operation stage of the pipeline maintains the pointers as values are put onto and are taken off the rings. A put command causes the TM to put a value into a ring, provided the ring is not full. A get command causes the TM to take a value off a ring, provided the ring is not empty. A put with low priority command causes the TM to put a value into a ring, provided the ring has at least a predetermined amount of free buffer space. A get from a set of rings command causes the TM to get a value from the highest priority non-empty ring (of a specified set of rings).
Abstract:
A source code symbol can be declared to have a scope level indicative of a level in a hierarchy of scope levels, where the scope level indicates a circuit level or a sub-circuit level in the hierarchy. A novel instruction to the linker can define the symbol to be of a desired scope level. Location information indicates where different amounts of the object code are to be loaded into a system. A novel linker program uses the location information, along with the scope level information of the symbol, to uniquify instances of the symbol if necessary to resolve name collisions of symbols having the same scope. After the symbol uniquification step, the linker performs resource allocation. A resource instance is allocated to each symbol. The linker then replaces each instance of the symbol in the object code with the address of the allocated resource instance, thereby generating executable code.
Abstract:
A novel linker statically allocates resource instances of a non-memory resource at link time. In one example, a novel declare instruction in source code declares a pool of resource instances, where the resource instances are instances of the non-memory resource. A novel allocate instruction is then used to instruct the linker to allocate a resource instance from the pool to be associated with a symbol. Thereafter the symbol is usable in the source code to refer to an instance of the non-memory resource. At link time the linker allocates an instance of the non-memory resource to the symbol and then replaces each instance of the symbol with an address of the non-memory resource instance, thereby generating executable code. Examples of instances of non-memory resources include ring circuits and event filter circuits.
Abstract:
A novel allocate instruction and a novel API call are received onto a compiler. The allocate instruction includes a symbol that identifies a non-memory resource instance. The API call is a call to perform an operation on a non-memory resource instance, where the particular instance is indicated by the symbol in the API call. The compiler replaces the API call with a set of API instructions. A linker then allocates a value to be associated with the symbol, where the allocated value is one of a plurality of values, and where each value corresponds to a respective one of the non-memory resource instances. After allocation, the linker generates an amount of executable code, where the API instructions in the code: 1) are for using the allocated value to generate an address of a register in the appropriate non-memory resource instance, and 2) are for accessing the register.