Abstract:
A microprocessor architecture including a finite state machine in combination with a microcode instruction cache for executing microinstructions. Microinstructions which would normally result in small sequences of high-repetition looped operations are implemented in a finite state machine (FSM). The use of the FSM is more energy-efficient than looping instructions in a cache or register set. In addition, the flexibility of a cache, or other memory oriented approach, in executing microcode instructions is still available. A microinstruction is identified as an FSM operation (as opposed to a cache operation) by an ID tag. Other fields of the microinstruction can be used to identify the type of FSM circuitry to use, direct configuration of a FSM to implement the microinstruction, indicate that certain fields are to be implemented in one or more FSMs and/or in memory-oriented operations such as in a cache or register.
Abstract:
A hardware task manager for managing operations in an adaptive computing system. The task manager indicates when input and output buffer resources are sufficient to allow a task to execute. The task can require an arbitrary number of input values from one or more other (or the same) tasks. Likewise, a number of output buffers must also be available before the task can start to execute and store results in the output buffers. The hardware task manager maintains a counter in association with each input and output buffer. For input buffers, a negative value for the counter means that there is no data in the buffer and, hence, the respective input buffer is not ready or available. Thus, the associated task can not run. Predetermined numbers of bytes, or nullunits,null are stored into the input buffer and an associated counter is incremented. When the counter value transitions from a negative value to a zero the high-order bit of the counter is cleared, thereby indicating the input buffer has sufficient data and is available to be processed by a task.
Abstract:
A reconfigurable filter node including an input data memory adapted to store a plurality of input data values, a filter coefficient memory adapted to store a plurality of filter coefficient values, and a plurality of computational units adapted to simultaneously compute filter data values. Filter data values are the outputs of a filter in response to input data values or a second plurality of filter coefficients to be used in subsequent filter data value computations. First and second input data registers load successive input data values input data memory or from adjacent computational units. Each computational unit comprises a pre-adder adapted to output either the sum two input data values stored in the computational unit or alternately to output a single input data value, and a multiply-and-accumulate unit adapted to multiply the output of the pre-adder by a filter coefficient and accumulate the result.
Abstract:
A computational unit, or node, in an adaptable computing system is described. A preferred embodiment of the node allows the node to be adapted for use for any of ten types of functionality by using a combination of execution units with a configurable interconnection scheme. Functionality types include the following: Asymmetric FIR Filter, Symmetric FIR Filter, Complex Multiply/FIR Filter, Sum-of-absolute-differences, Bi-linear Interpolation, Biquad IIR Filter, Radix-2 FFT/IFFT, Radix-2 DCT/IDCT, Golay Correlator, Local Oscillator/Mixer.
Abstract:
A computational unit, or node, in an adaptive computing engine uses a uniform interface to a network to communicate with other nodes and resources. The uniform interface is referred to as a nullnode wrapper.null The node wrapper includes a hardware task manager (HTM), a data distributor, optional direct memory access (DMA) engine and a data aggregator. The hardware task manager indicates when input and output buffer resources are sufficient to allow a task to execute. The HTM coordinates a nodes assigned tasks using a task lists. A nullready-to-run queuenull is implemented as a first-in first-out stack. The HTM uses a top-level finite-state machine (FSM) that communicates with a number of subordinate FSMs to control individual HTM components. The Data Distributor interfaces between the node's input pipeline register and various memories and registers within the node. Different types of data distribution are possible based upon the values in service and auxiliary fields of a 50-bit control structure. The Data Aggregator arbitrates among up to four node elements that request access to the node's output pipeline register for the purpose of transferring data to the intended destination via the network. The DMA Engine uses a five-register model. The registers include a Starting Address Register, an Address Stride Register, a Transfer Count Register, a Duty Cycle Register, and a Control Register including a GO bit, Target Node number/port number, and DONE protocol. A control node, or nullK-node,null is used to control various aspects of the HTM, data distributor, data aggregator and DMA operations within the nodes of the system.