摘要:
The present invention provides a multithreaded processor, such as a network processor, that fetches instructions in a pipeline stage based on feedback signals from later stages. The multithreaded processor comprises a pipeline with an instruction unit in the early stage and an instruction queue, a thread interleaver, and an execution pipeline in the later stages. Feedback signals from the later stages cause the instruction unit to block fetching, immediately fetch, raise priority, or lower priority for a particular thread. The instruction queue generates a queue signal, on a per thread basis, responsive to a thread queue condition, etc., the thread interleaver generates an interleaver signal responsive to a thread condition, etc., and the execution pipeline generates an execution signal responsive to an execution stall, etc.
摘要:
A network processor has numerous novel features including a multi-threaded processor array, a multi-pass processing model, and Global Packet Memory (GPM) with hardware managed packet storage. These unique features allow the network processor to perform high-touch packet processing at high data rates. The packet processor can also be coded using a stack-based high-level programming language, such as C or C++. This allows quicker and higher quality porting of software features into the network processor.Processor performance also does not severely drop off when additional processing features are added. For example, packets can be more intelligently processed by assigning processing elements to different bounded duration arrival processing tasks and variable duration main processing tasks. A recirculation path moves packets between the different arrival and main processing tasks. Other novel hardware features include a hardware architecture that efficiently intermixes co-processor operations with multi-threaded processing operations and improved cache affinity.
摘要:
The present invention provides a network multithreaded processor, such as a network processor, including a thread interleaver that implements fine-grained thread decisions to avoid underutilization of instruction execution resources in spite of large communication latencies. In an upper pipeline, an instruction unit determines an-instruction fetch sequence responsive to an instruction queue depth on a per thread basis. In a lower pipeline, a thread interleaver determines a thread interleave sequence responsive to thread conditions including thread latency conditions. The thread interleaver selects threads using a two-level round robin arbitration. Thread latency signals are active responsive to thread latencies such as thread stalls, cache misses, and interlocks. During the subsequent one or more clock cycles, the thread is ineligible for arbitration. In one embodiment, other thread conditions affect selection decisions such as local priority, global stalls, and late stalls.
摘要:
In a method embodiment (10), the method operates a microprocessor (110), and the microprocessor has an instruction set. The method first (11) stores an identifier code uniquely identifying the particular microprocessor in a one-time programmable register. The method second (12) issues to the microprocessor an identifier request instruction from the instruction set. The method then, and in response to the identifier request instruction, provides (18) from the microprocessor an identifier code. Other circuits, systems, and methods are also disclosed and claimed.
摘要:
A microprocessor of the superscalar pipelined type, having speculative execution capability, is disclosed. Speculative execution is under the control of a fetch unit having a branch target buffer and a return address stack, each having multiple entries. Each entry includes an address value corresponding to the destination of a branching instruction, and an associated register value, such as a stack pointer. Upon the execution of a subroutine call, the return address and current stack pointer value are stored in the return address stack, to allow for fetching and speculative execution of the sequential instructions following the call in the calling program. Any branching instruction, such as the call, return, or conditional branch, will have an entry included in the branch target buffer; upon fetch of the branch on later passes, speculative execution from the target address can begin using the stack pointer value stored speculatively in the branch target buffer in association with the target address.
摘要:
Division and square root calculations are performed using an operand routing circuit (16) for receiving an operand N, and operand D and a seed value S and directing the operands and seed value to a multiplier (38). Single multiplier (38) is configured into two arrays for calculating partial products of N and S and D and S. The results of multiplier (38) are transmitted through switching circuitry (20) or registers (48) (50) either to operand routing circuitry (16) or adder (44) depending on a convergence algorithm. The final result is rounded.
摘要:
Multiple pipelined Translation Look-aside Buffer (TLB) units are configured to compare a translation address with associated TLB entries. The TLB units operated in serial order comparing the translation address with associated TLB entries until an identified one of the TLB units produces a hit. The TLB units following the TLB unit producing the hit might be disabled.
摘要:
A subpipelined translation embodiment provides binary compatibility between current an future generations of DSPs. When retrieved from memory an entire fetch packet is assigned an operating mode (base instruction set or migrant instruction set) according to the current execution mode. The fetch packets from the instruction memory are parsed into execute packets and sorted by execution unit (dispatched) in a datapath shared by both execution modes (base and migrant). The two execution modes have separate control logic. Instructions from the dispatch datapath are decoded by either base architecture decode logic or the migrant architecture decode logic, depending on the execution mode bound to the parent fetch packet. Code processed by the migrant and base decode pipelines produces machine words that are selected by a multiplexer. The multiplexer is controlled by the operating mode bound to the fetch packet that produced the machine word. The selected machine word controls a global register file, which supplies operands to all hardware execution units and accepts results of all hardware execution units.
摘要:
A flip flop (30) comprising a master stage (34) comprising a first plurality of transistors (54, 56), wherein each of the first plurality of transistors comprises a selective conductive path between a source and drain. The flip flop also comprises a slave stage (42) comprising a second plurality of transistors (60, 62, 64, 66), wherein each of the second plurality of transistors comprises a selective conductive path between a source and drain. For the flip flop, in a low power mode the flip flop is operable to receive a first voltage (VDD) coupled to the selective conductive path for each of the first plurality of transistors. Also in the low power mode, the flip flop is operable to receive a second voltage (VDDL) coupled to the selective conductive path for each of the second plurality of transistors. Lastly, the second voltage is greater than the first voltage in the low power mode.
摘要:
A diagnostic program can check the security of a program. The program is stored at predetermined non-relocatable physical address in memory. The diagnostic program is loaded and checks the program at the predetermined physical address against a standard. The diagnostic program then indicates that the program is verified as secure if it meets the standard or non-verified as secure if it does not meet the standard. If the program is not verified as secure, then the diagnostic program may take remedial action such as disabling normal operation of the program, be transmitting a predetermined message via the system modem or downloading another copy of the program via the modem. The program is made non-relocatable using a special table look-aside buffer having a fixed virtual address register and a corresponding fixed physical address register.