Abstract:
A CPU having a cluster VLIW architecture is shown which operates in both a high instruction level parallelism (ILP) mode and a low ILP mode. In high ILP mode, the CPU executes wide instruction words using all operational clusters of the CPU, and the entire main instruction cache and main data cache of the CPU are accessible to a high ILP task. The CPU also includes a mini-instruction cache, a mini-instruction register and a mini-data cache, which are inactive during high ILP mode. An instruction level controller in the CPU receives a low ILP signal, such as an interrupt or a function call to a low ILP routine, and switches to low ILP mode. In low ILP mode, the main instruction cache and main data cache are deactivated to preserve their contents. At the same time, a predetermined cluster remains active while the remaining clusters are deactivated. The low ILP task executes instructions from the mini-instruction cache, which are input to the predetermined cluster through the mini-instruction register. The mini-data cache stores operands for the low ILP task. The separate mini-instruction cache and mini-data cache, along with the use of only the predetermined cluster, minimize pollution of the main instruction and data caches, as well as pollution of register files in the deactivated clusters, with regard to a task executing in high ILP mode.
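The mode switch described in this abstract can be illustrated with a small state model. This is only a sketch under stated assumptions: the class name `ClusterCpu`, the cluster count of four, the choice of cluster 0 as the predetermined cluster, and the explicit method for returning to high ILP mode are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterCpu:
    """Toy model of the high/low ILP mode switch (all names are hypothetical)."""
    num_clusters: int = 4       # assumed cluster count
    low_ilp_cluster: int = 0    # assumed index of the predetermined cluster
    mode: str = "high"
    active_clusters: set = field(default_factory=set)
    main_caches_active: bool = True
    mini_caches_active: bool = False

    def __post_init__(self):
        # Start in high ILP mode with every cluster active.
        self.enter_high_ilp()

    def enter_high_ilp(self):
        # All clusters and the main caches serve the high ILP task;
        # the mini caches are inactive.
        self.mode = "high"
        self.active_clusters = set(range(self.num_clusters))
        self.main_caches_active = True
        self.mini_caches_active = False

    def enter_low_ilp(self):
        # On a low ILP signal (e.g. an interrupt), deactivate the main caches
        # so their contents are preserved, and keep only one cluster active.
        self.mode = "low"
        self.active_clusters = {self.low_ilp_cluster}
        self.main_caches_active = False
        self.mini_caches_active = True


cpu = ClusterCpu()
cpu.enter_low_ilp()
print(cpu.mode, sorted(cpu.active_clusters), cpu.main_caches_active)
```

The model tracks only which resources are active in each mode; it does not model cache contents or instruction issue.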
Abstract:
An optimum ratio is shown relating the characteristic dimensions of the MOS pull-up and MOS pull-down devices of a logic circuit having one or more bipolar devices connected to an output of the circuit. This optimum ratio substantially minimizes the propagation delay of the circuit. A first preferred embodiment is shown in a BiNMOS circuit, and a second preferred embodiment is shown in a BiCMOS circuit.
Abstract:
A register file organization for a pipelined microprocessor is shown which includes a pipestage register interposed between a global bit line and a register cell array of the register file in order to separate the delay associated with driving the global bit line, and the devices attached to the global bit line, into a separate pipestage. Another register file organization is shown which includes a pipestage register interposed between a register cell array and a decoder, the decoder selecting a register in the register cell array responsive to an instruction in an instruction register, in order to separate the decoder function and register cell array access times into different pipestages. The two approaches can be combined to separate the delays associated with the decoder, the register cell array and the global bit line into different pipestages in order to reduce the pipestage cycle time toward a fundamental minimum for pipelined computer architecture.
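The effect of separating these delays can be illustrated with simple arithmetic: the cycle time of a pipeline is bounded below by its slowest pipestage. The sketch below assumes hypothetical delays of 2, 3 and 4 (arbitrary units) for the decoder, register cell array and global bit line, and ignores the overhead of the pipestage registers themselves; none of these numbers come from the abstract.

```python
def cycle_time(stages):
    """Return the minimum cycle time: the total delay of the slowest pipestage.

    `stages` is a list of pipestages, each a list of component delays.
    """
    return max(sum(stage) for stage in stages)


# Hypothetical component delays in arbitrary units (assumptions for illustration).
DECODER, CELL_ARRAY, BIT_LINE = 2, 3, 4

single_stage = cycle_time([[DECODER, CELL_ARRAY, BIT_LINE]])      # no separation
bit_line_split = cycle_time([[DECODER, CELL_ARRAY], [BIT_LINE]])  # first approach
decoder_split = cycle_time([[DECODER], [CELL_ARRAY, BIT_LINE]])   # second approach
combined = cycle_time([[DECODER], [CELL_ARRAY], [BIT_LINE]])      # both approaches

print(single_stage, bit_line_split, decoder_split, combined)  # prints: 9 5 7 4
```

With these assumed delays, combining both approaches leaves the slowest single component as the limit on cycle time.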
Abstract:
A high frequency circuit is shown which uses output drivers with tri-state sections. A plurality of output drivers is connected to an output transmission line. Each driver has a pull-up section, a pull-down section and a tri-state section. Each tri-state section has a low impedance state and a high impedance state. In its low impedance state, the tri-state section matches the impedance of the output transmission line. In its high impedance state, it isolates its driver from the output transmission line.
Abstract:
An apparatus and method are shown for decoding variable length instructions in a processor, where a line of variable length instructions from an instruction cache is loaded into an instruction buffer and the start bits indicating the instruction boundaries in the line are loaded into a start bit buffer. A first shift register is loaded with the start bits and shifted in response to a lower program count value, which is also used to shift the instruction buffer. A length of a current instruction is obtained by detecting the position of the next instruction boundary in the start bits in the first shift register. The length of the current instruction is added to the current lower program count value in order to obtain a next sequential value for the lower program count, which is loaded into a lower program count register. An upper program count value is determined by loading a second shift register with the start bits, shifting the start bits in response to the lower program count value and detecting when only one instruction remains in the instruction buffer. When one instruction remains, the upper program count value is incremented and loaded into an upper program count register for output to the instruction cache in order to cause a fetch of another line of instructions, and a `0` value is loaded into the lower program count register. Another embodiment of the present invention includes multiplexors for loading a branch address into the upper and lower program count registers in response to a branch control signal.
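The sequential program count update described above can be sketched in software. This is a minimal illustration rather than the disclosed hardware: it assumes one start bit per byte of the line, set at the first byte of each instruction, and the function names are hypothetical. Branch handling through the multiplexors is not modeled.

```python
def current_instruction_length(start_bits, lower_pc):
    """Length of the instruction at lower_pc, or None if it is the last in the line."""
    # Shift the start bits so that index 0 lines up with the current instruction.
    shifted = start_bits[lower_pc:]
    if not shifted or not shifted[0]:
        raise ValueError("lower_pc does not point at an instruction boundary")
    # The position of the next set start bit is the current instruction's length.
    for offset in range(1, len(shifted)):
        if shifted[offset]:
            return offset
    return None  # no further boundary: only one instruction remains


def next_program_count(start_bits, upper_pc, lower_pc):
    """Return the next sequential (upper_pc, lower_pc) pair."""
    length = current_instruction_length(start_bits, lower_pc)
    if length is None:
        # Last instruction in the line: fetch the next line and reset the offset.
        return upper_pc + 1, 0
    return upper_pc, lower_pc + length


def walk_line(start_bits, upper_pc):
    """Collect the instruction offsets in one line and the upper count that follows."""
    offsets, lower_pc, current_upper = [], 0, upper_pc
    while current_upper == upper_pc:
        offsets.append(lower_pc)
        current_upper, lower_pc = next_program_count(start_bits, current_upper, lower_pc)
    return offsets, current_upper


# An 8-byte line holding instructions of 3, 1, 2 and 2 bytes.
line_start_bits = [1, 0, 0, 1, 1, 0, 1, 0]
print(walk_line(line_start_bits, upper_pc=5))  # prints: ([0, 3, 4, 6], 6)
```

Slicing the list stands in for the shift registers; in both cases the search for the next boundary begins at the current instruction rather than at the start of the line.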