摘要:
A technique recovers return address stack (RAS) content and restores alignment of a RAS top-of-stack (TOS) pointer for occurrences of mispredictions due to speculative operation, out-of-order instruction processing, and exception handling. In at least one embodiment of the invention, an apparatus includes a speculative execution processor pipeline, a first structure for maintaining return addresses relative to instruction flow at a first stage of the pipeline, at least a second structure for maintaining return addresses relative to instruction flow at a second stage of the pipeline. The second stage of the pipeline is deeper in the pipeline than the first stage. The apparatus includes circuitry operable to reproduce at least return addresses from the second structure to the first structure.
摘要:
Embodiments of the present invention provide a system that facilitates executing a memory barrier (membar) instruction in an execute-ahead processor, wherein the membar instruction forces buffered loads and stores to complete before allowing a following instruction to be issued.
摘要:
A technique for coordinating execution of instructions in a processor that allows instructions to execute out-of-order includes decoding a particular instruction that is defined in accordance with an instruction set of the processor. A helper sequence of instructions that corresponds to the particular instruction is then introduced into a stream of executable operations. The corresponding helper sequence includes a first artificial dependency instruction that codes a dependency on a register that is not actually employed as a register source or target for an operation performed by the particular instruction.
摘要:
Inspecting a currently fetched instruction group and determining branching behavior of the currently fetched instruction group, allows for intelligent instruction prefetching. A currently fetched instruction group is predecoded and, assuming the currently fetch instruction group includes a branch type instruction, a branch target is characterized in relation to a fetch boundary, which delimits a memory region contiguous with the memory region that hosts the currently fetched instruction group. Instruction prefetching is included based, at least in part, on the predecoded characterization of the branch target.
摘要:
Concurrently branch predicting for multiple branch-type instructions demands of high performance environments. Concurrently branch predicting for multiple branch-type instructions provides the instruction flow for a high bandwidth pipeline utilized in advanced performance environments. Branch predictions are concurrently generated for multiple branch-type instructions. The concurrently generated branch predictions are then supplied for further processing of the corresponding branch-type instructions.
摘要:
One embodiment of the present invention provides a system that predicts a jump target for a jump instruction. During operation, the system starts fetching the jump instruction while executing a process. Next, the system uses a program counter for the process and uses state information that is specific to the process to look up the jump target for the jump instruction. Finally, the system uses the jump target returned by the lookup as a predicted jump target for the jump instruction.
摘要:
One embodiment of the present invention provides a system that counts speculatively-executed instructions for performance analysis purposes. During operation, the system counts instructions which are normally executed during a normal-execution mode. Next, the system enters a speculative-execution mode wherein instructions are speculatively executed without being committed to the architectural state of the processor. During the speculative-execution mode, the system counts the speculatively-executed instructions in a manner that enables the count of speculatively-executed instructions to be reset if the speculative execution fails.
摘要:
One embodiment of the present invention provides a system that facilitates deferring execution of instructions with unresolved data dependencies as they are issued for execution in program order. During a normal execution mode, the system issues instructions for execution in program order. Upon encountering an unresolved data dependency during execution of an instruction, the system generates a checkpoint that can subsequently be used to return execution of the program to the point of the instruction. Next, the system executes the instruction and subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of an unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order. Upon encountering a store during the execute-ahead mode, the system determines if the store buffer is full. If so, the system prefetches a cache line for the store, and defers execution of the store.
摘要:
One embodiment of the present invention provides a system which facilitates eliminating a restart penalty when reissuing deferred instructions in a processor that supports speculative-execution. During a normal execution mode, the system issues instructions for execution in program order, wherein issuing the instructions involves decoding the instructions. Upon encountering an unresolved data dependency during execution of an instruction, the processor performs a checkpointing operation and executes subsequent instructions in an execute-ahead mode, wherein instructions that cannot be executed because of the unresolved data dependency are deferred, and wherein other non-deferred instructions are executed in program order. When an unresolved data dependency is resolved during execute-ahead mode, the processor begins to execute the deferred instructions in a deferred mode. In doing so, the processor initially issues deferred instructions, which have already been decoded, from a deferred queue. Simultaneously, the processor feeds instructions from a deferred SRAM into the decode unit, and these instructions eventually pass into the deferred queue. In this way, at the start of deferred mode, deferred instructions can issue from the deferred queue without having to pass through the decode unit, thereby providing time for deferred instructions from the deferred SRAM to progress through a decode unit in order to read input values for the decoded instruction, but not to be re-decoded.
摘要:
One embodiment of the present invention provides a system that avoids read-after-write (RAW) hazards while speculatively executing instructions on a processor. The system starts in a normal execution mode, wherein the system issues instructions for execution in program order. Upon encountering a stall condition during execution of an instruction, the system generates a checkpoint, and executes the instruction and subsequent instructions in a speculative-execution mode. The system also maintains dependency information for each register indicating whether or not a value in the register depends on an unresolved data-dependency. The system uses this dependency information to avoid RAW hazards during the speculative-execution mode.