Abstract:
Delays due to waiting for operands that will not be used by a select operand instruction, are alleviated based on an early recognition that such operand data is not required in order to complete the processing of the select operand instruction. At appropriate points prior to execution, determinations are made regarding a selection criterion or criteria specified by the select operand instruction, conditions that affect the selection criteria, and the availability of operands. A hold circuit uses the determinations to control the activation and release of a hold signal that controls processor pipeline stalls. A stall required to wait for operand data is skipped or a stall is terminated early, if the selected operand is available even though the other operand, that will not be used, is not available. A stall due to waiting for operands is maintained until the selection criteria is met and the selected operand is fetched and made available.
Abstract:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associating with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
Abstract:
A processor pipeline is segmented into an upper portion - prior to instructions going out of program order - and one or more lower portions beyond the upper portion. The upper pipeline is flushed upon detecting that a branch instruction was mispredicted, minimizing the delay in fetching of instructions from the correct branch target address. The lower pipelines may continue execution until the mispredicted branch instruction confirms, at which time all uncommitted instructions are flushed from the lower pipelines. Existing exception pipeline flushing mechanisms may be utilized, by adding a mispredicted branch identifier, reducing the complexity and hardware cost of flushing the lower pipelines.
Abstract:
Hazard detection is simplified by converting a conditional instruction, operative to perform an operation if a condition is satisfied, into an emissary instruction operative to evaluate the condition and an unconditional base instruction operative to perform the operation. The emissary instruction is executed, while the base instruction is halted. The emissary instruction evaluates the condition and reports the condition evaluation back to the base instruction. Based on the condition evaluation, the base instruction is either launched into the pipeline for execution, or it is discarded (or a NOP, or null instruction, substituted for it). In either case, the dependencies of following instructions may be resolved.
Abstract:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associating with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
Abstract:
Write-through-read (WTR) comparator circuits and related WTR processes and memory systems are disclosed. The WTR comparator circuits can be configured to perform WTR functions for a multiple port file having one or more read and write ports. One or more WTR comparators in the WTR comparator circuit are configured to compare a read index into a file with a write index corresponding to a write-back stage selected write port among a plurality of write ports that can write data to the entry in the file. The WTR comparators then generate a WTR comparator output indicating whether the write index matches the read index to control a WTR function. In this manner, the WTR comparator circuit can employ less WTR comparators than the number of read and write port combinations. Providing less WTR comparators can reduce power consumption, cost, and area required on a semiconductor die for the WTR comparator circuit.
Abstract:
A system, apparatus and method for efficiently processing interrupts using general purpose registers in a pipelined processor. In accordance with the present disclosure, a register file may be updated to efficiently save an interrupt return address. When an interrupt request is received by the system's processor, or when the request is issued in the execution of a program, a pseudo-instruction is generated. This pseudo-instruction travels down the pipeline in the same way as other instructions and updates the register file by causing the register file to be written with the return address of the last instruction for which processing has not been completed.
Abstract:
In one or more embodiments, a processor includes one or more circuits to flush instructions from an instruction pipeline on a selective basis responsive to detecting a branch misprediction, such that those instructions marked as being dependent on the branch instruction associated with the branch misprediction are flushed. Thus, the one or more circuits may be configured to mark instructions fetched into the processor's instruction pipeline(s) to indicate their branch prediction dependencies, directly or indirectly detect incorrect branch predictions, and directly or indirectly flush instructions in the instruction pipeline(s) that are marked as being dependent on an incorrect branch prediction.
Abstract:
A register file is disclosed. The register file includes a plurality of registers and a decoder. The decoder may be configured to receive an address for any one of the registers, and disable a read operation to the addressed register if data in the addressed register is invalid.
Abstract:
A processor having a unidirectional rotator configured to shift or rotate data in one direction is disclosed. The processor also includes a control unit having logic configured to modify a shift value specified by a registered-based shift, or rotate, instruction in an opposite direction, the modified shift value being usable by the rotator to shift, or rotate, the data in the one direction, and thereby, generate the same result as if the data in the rotator had otherwise been shifted, or rotated, in the opposite direction by the shift value originally specified by the registered-based instruction. The control unit is further configured to bypass the logic and provide to the rotator a shift value specified by a register-based instruction to shift, or rotate, the data in the one direction.