摘要:
Hazard detection is simplified by converting a conditional instruction, operative to perform an operation if a condition is satisfied, into an emissary instruction operative to evaluate the condition and an unconditional base instruction operative to perform the operation. The emissary instruction is executed, while the base instruction is halted. The emissary instruction evaluates the condition and reports the condition evaluation back to the base instruction. Based on the condition evaluation, the base instruction is either launched into the pipeline for execution, or it is discarded (or a NOP, or null instruction, substituted for it). In either case, the dependencies of following instructions may be resolved.
摘要:
A processor pipeline is segmented into an upper portion—prior to instructions going out of program order—and one or more lower portions beyond the upper portion. The upper pipeline is flushed upon detecting that a branch instruction was mispredicted, minimizing the delay in fetching of instructions from the correct branch target address. The lower pipelines may continue execution until the mispredicted branch instruction confirms, at which time all uncommitted instructions are flushed from the lower pipelines. Existing exception pipeline flushing mechanisms may be utilized, by adding a mispredicted branch identifier, reducing the complexity and hardware cost of flushing the lower pipelines.
摘要:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associating with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
摘要:
In one or more embodiments, a processor includes one or more circuits to flush instructions from an instruction pipeline on a selective basis responsive to detecting a branch misprediction, such that those instructions marked as being dependent on the branch instruction associated with the branch misprediction are flushed. Thus, the one or more circuits may be configured to mark instructions fetched into the processor's instruction pipeline(s) to indicate their branch prediction dependencies, directly or indirectly detect incorrect branch predictions, and directly or indirectly flush instructions in the instruction pipeline(s) that are marked as being dependent on an incorrect branch prediction.
摘要:
A processor pipeline is segmented into an upper portion—prior to instructions going out of program order—and one or more lower portions beyond the upper portion. The upper pipeline is flushed upon detecting that a branch instruction was mispredicted, minimizing the delay in fetching of instructions from the correct branch target address. The lower pipelines may continue execution until the mispredicted branch instruction confirms, at which time all uncommitted instructions are flushed from the lower pipelines. Existing exception pipeline flushing mechanisms may be utilized, by adding a mispredicted branch identifier, reducing the complexity and hardware cost of flushing the lower pipelines.
摘要:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associating with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
摘要:
When a branch misprediction in a pipelined processor is discovered, if the mispredicted branch instruction is not the last uncommitted instruction in the pipelines, older uncommitted instructions are checked for dependency on a long latency operation. If one is discovered, all uncommitted instructions are flushed from the pipelines without waiting for the dependency to be resolved. The branch prediction is corrected, and the branch instruction and all flushed instructions older than the branch instruction are re-fetched and executed.
摘要:
A register file is disclosed. The register file includes a plurality of registers and a decoder. The decoder may be configured to receive an address for any one of the registers, and disable a read operation to the addressed register if data in the addressed register is invalid.
摘要:
Delays due to waiting for operands that will not be used by a select operand instruction, are alleviated based on an early recognition that such operand data is not required in order to complete the processing of the select operand instruction. At appropriate points prior to execution, determinations are made regarding a selection criterion or criteria specified by the select operand instruction, conditions that affect the selection criteria, and the availability of operands. A hold circuit uses the determinations to control the activation and release of a hold signal that controls processor pipeline stalls. A stall required to wait for operand data is skipped or a stall is terminated early, if the selected operand is available even though the other operand, that will not be used, is not available. A stall due to waiting for operands is maintained until the selection criteria is met and the selected operand is fetched and made available.
摘要:
A processor architecture to qualify software target-branch hints with hardware-based predictions, the processor including a branch target address cache having entries, where an entry includes a tag field to store an instruction address, a target field to store a target address, and a state field to store a state value. Upon decoding an indirect branch instruction, the processor determines whether an entry in the branch target address cache has an instruction address that matches the address of the decoded indirect branch instruction; and if there is a match, depending upon the state value stored in the entry, the processor will use the stored target address as the predicted target address for the decoded indirect branch instruction, or will use a software provided target address hint if available.