Abstract:
A processor capable of fetching and executing variable length instructions of at least two lengths is described. The processor operates in multiple modes. One of the modes restricts instructions that can be fetched and executed to the longer length instructions. An instruction cache is used for storing variable length instructions and their associated predecode bit fields in an instruction cache line and storing the instruction address and processor operating mode state information at the time of the fetch in a tag line. The processor operating mode state information indicates the program specified mode of operation of the processor. The processor fetches instructions from the instruction cache for execution. As a result of an instruction fetch operation, the instruction cache may selectively enable the writing of predecode bit fields in the instruction cache and may selectively enable the reading of predecode bit fields stored in the instruction cache based on the processor state at the time of the fetch.
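A minimal behavioral sketch in Python may help make the mode-dependent predecode handling concrete. It is not the patented design; the line layout, the mode encoding, and the predecode_fn helper are assumptions made for illustration.

    # Sketch only: an I-cache tag records the operating mode at fill time, and
    # the stored predecode bits are trusted only when the current mode matches.
    class ICacheLine:
        def __init__(self, tag, mode, instrs, predecode):
            self.tag = tag              # truncated fetch address
            self.mode = mode            # operating-mode state captured at fill time
            self.instrs = instrs        # raw variable length instructions
            self.predecode = predecode  # per-instruction predecode bit fields

    class ICache:
        def __init__(self):
            self.lines = {}

        def fill(self, tag, mode, instrs, predecode_fn):
            # Writing of the predecode fields is enabled according to the mode
            # in effect at the time of the fetch that caused the fill.
            self.lines[tag] = ICacheLine(tag, mode, instrs, predecode_fn(instrs, mode))

        def fetch(self, tag, current_mode):
            line = self.lines.get(tag)
            if line is None:
                return None, None                    # miss
            if line.mode == current_mode:
                return line.instrs, line.predecode   # reading predecode is enabled
            return line.instrs, None                 # mode mismatch: ignore predecode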
Abstract:
In one or more embodiments, a processor includes a link return stack circuit used for storing branch return addresses, wherein a link return stack controller is configured to determine that one or more entries in the link return stack are invalid as being dependent on a mispredicted branch, and to reset the link return stack to a valid remaining entry, if any. In this manner, branch mispredictions cause dependent entries in the link return stack to be flushed from the link return stack, or otherwise invalidated, while preserving the remaining valid entries, if any, in the link return stack. In at least one embodiment, a branch information queue used for tracking predicted branches is configured to store a marker indicating whether a predicted branch has an associated entry in the link return stack, and it may store an index value identifying the specific, corresponding entry in the link return stack.
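The recovery behavior can be illustrated with a small Python sketch; the dictionary layout of the branch information queue entry and the index-based trim are assumptions chosen for clarity, not the circuit described in the embodiments.

    # Sketch only: a BIQ entry marks whether the predicted branch pushed a
    # link-stack entry and records its index; a misprediction trims the stack
    # back to that point, preserving the older (still valid) entries.
    class LinkReturnStack:
        def __init__(self):
            self.entries = []                  # return addresses, youngest last

        def push(self, return_addr):
            self.entries.append(return_addr)
            return len(self.entries) - 1       # index value stored in the BIQ entry

        def recover(self, biq_entry):
            if biq_entry["has_lrs_entry"]:
                # Entries at or above the recorded index depend on the
                # mispredicted branch and are invalidated.
                self.entries = self.entries[: biq_entry["lrs_index"]]

    lrs = LinkReturnStack()
    lrs.push(0x2000)                                      # valid entry from an earlier call
    entry = {"has_lrs_entry": True, "lrs_index": lrs.push(0x3000)}
    lrs.recover(entry)                                    # misprediction: 0x3000 flushed, 0x2000 survives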
Abstract:
A microprocessor includes two branch history tables, and is configured to use a first one of the branch history tables for predicting branch instructions that are hits in a branch target cache, and to use a second one of the branch history tables for predicting branch instructions that are misses in the branch target cache. As such, the first branch history table is configured to have an access speed matched to that of the branch target cache, so that its prediction information is timely available relative to branch target cache hit detection, which may happen early in the microprocessor's instruction pipeline. The second branch history table thus need only be as fast as is required for providing timely prediction information in association with recognizing branch target cache misses as branch instructions, such as at the instruction decode stage(s) of the instruction pipeline.
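As a rough Python illustration (the table sizes, indexing hash, and counter encoding are assumed, not specified by the abstract), the prediction source simply follows the BTAC lookup outcome:

    # Sketch only: the fast table serves branches that hit in the BTAC early in
    # the pipeline; the slower table serves branches recognized at decode.
    def predict(pc, history, btac, fast_bht, slow_bht):
        if pc in btac:                                   # BTAC hit detected early
            counter = fast_bht[(pc ^ history) % len(fast_bht)]
        else:                                            # BTAC miss: branch seen at decode
            counter = slow_bht[(pc ^ history) % len(slow_bht)]
        return counter >= 2                              # assumed 2-bit counters: taken if >= 2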
Abstract:
A conditional instruction architected to receive one or more operands as inputs, to output to a target the result of an operation performed on the operands if a condition is satisfied, and to not provide an output if the condition is not satisfied, is executed so that it unconditionally provides an output to the target. The conditional instruction obtains the prior value of the target (that is, the value produced by the most recent instruction preceding the conditional instruction that updated that target). The condition is evaluated. If the condition is satisfied, an operation is performed and the result of the operation is output to the target. If the condition is not satisfied, the prior value is output to the target. Subsequent instructions may rely on the target as an operand source (whether the value is written to a register or forwarded to the instruction) even before the condition is evaluated.
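A short Python model may clarify the unconditional-write behavior; the register file representation and operand names are illustrative only.

    # Sketch only: the instruction always writes its target, selecting between
    # the newly computed result and the target's prior value.
    def execute_conditional(regs, target, op, srcs, condition_met):
        prior = regs[target]                              # value from the most recent writer of the target
        result = op(*[regs[s] for s in srcs]) if condition_met else prior
        regs[target] = result                             # target is updated unconditionally
        return result

    regs = {"r0": 5, "r1": 7, "r2": 0}
    execute_conditional(regs, "r2", lambda a, b: a + b, ["r0", "r1"], condition_met=False)
    assert regs["r2"] == 0    # prior value is output when the condition is not satisfied

Because the target always receives a value, a dependent instruction can source it (from the register or a forwarding path) without waiting for the condition to resolve.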
Abstract:
A Branch Target Address Cache (BTAC) stores at least two branch target addresses in each cache line. The BTAC is indexed by a truncated branch instruction address. An offset obtained from a branch prediction offset table determines which of the branch target addresses is taken as the predicted branch target address. The offset table may be indexed in several ways, including by a branch history, by a hash of a branch history and part of the branch instruction address, by a gshare value, randomly, in a round-robin order, or by other methods.
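One of the listed indexing options, a gshare-style hash, is shown in the Python sketch below; the address truncation and the table size are assumptions.

    # Sketch only: each BTAC line holds two candidate targets, and an offset
    # read from the offset table selects which one is predicted.
    def predict_target(pc, history, btac, offset_table):
        line = btac.get(pc >> 3)                                     # index by truncated branch address
        if line is None:
            return None                                              # no BTAC entry for this branch
        offset = offset_table[(pc ^ history) % len(offset_table)]    # offset of 0 or 1
        return line[offset]                                          # pick one of the stored targets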
Abstract:
A processor provides two-level interrupt servicing. In one embodiment, the processor comprises a storage device and an interrupt handler. The storage device is configured to store an interrupt identifier corresponding to an interrupt request. The interrupt handler is configured to recognize the interrupt request, initiate a common interrupt service routine responsive to recognizing the interrupt request and subsequently initiate an interrupt service routine corresponding to the stored interrupt identifier.
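A Python sketch of the two-level flow follows; the controller class and handler names are hypothetical, standing in for the storage device and interrupt handler of the embodiment.

    # Sketch only: the identifier is stored with the request, a common service
    # routine runs first, and the identifier then selects the specific routine.
    class InterruptController:
        def __init__(self, common_isr, specific_isrs):
            self.common_isr = common_isr
            self.specific_isrs = specific_isrs    # maps interrupt identifier -> service routine
            self.pending_id = None                # storage for the interrupt identifier

        def raise_interrupt(self, interrupt_id):
            self.pending_id = interrupt_id        # store the identifier for the request

        def service(self):
            if self.pending_id is not None:
                self.common_isr()                         # first level: common routine
                self.specific_isrs[self.pending_id]()     # second level: identifier-specific routine
                self.pending_id = None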
Abstract:
A method of managing cache partitions provides a first pointer for higher priority writes and a second pointer for lower priority writes, and uses the first pointer to delimit the lower priority writes. For example, locked writes have greater priority than unlocked writes, and a first pointer may be used for locked writes, and a second pointer may be used for unlocked writes. The first pointer is advanced responsive to making locked writes, and its advancement thus defines a locked region and an unlocked region. The second pointer is advanced responsive to making unlocked writes. The second pointer also is advanced (or retreated) as needed to prevent it from pointing to locations already traversed by the first pointer. Thus, the first pointer delimits the unlocked region and allows the locked region to grow at the expense of the unlocked region.
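The pointer discipline can be modeled in a few lines of Python; the circular arrangement of ways and the wrap policy are assumptions (and the case where every way becomes locked is ignored for brevity).

    # Sketch only: the first (locked) pointer advances on locked writes and
    # pushes the second (unlocked) pointer ahead of it, so the locked region
    # grows at the expense of the unlocked region.
    class PartitionedWays:
        def __init__(self, n_ways):
            self.n = n_ways
            self.locked_ptr = 0        # next way for a higher priority (locked) write
            self.unlocked_ptr = 0      # next way for a lower priority (unlocked) write

        def write_locked(self):
            way = self.locked_ptr
            self.locked_ptr += 1                        # locked region grows by one way
            if self.unlocked_ptr < self.locked_ptr:     # keep unlocked writes out of the locked region
                self.unlocked_ptr = self.locked_ptr
            return way

        def write_unlocked(self):
            way = self.unlocked_ptr
            self.unlocked_ptr += 1
            if self.unlocked_ptr >= self.n:             # wrap within the unlocked region only
                self.unlocked_ptr = self.locked_ptr
            return way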
Abstract:
A fixed number of variable-length instructions are stored in each line of an instruction cache. The variable-length instructions are aligned along predetermined boundaries. Since the length of each instruction in the line, and hence the span of memory the instructions occupy, is not known in advance, the address of the instruction that follows the last instruction in the line is calculated and stored with the cache line. Ascertaining the instruction boundaries, aligning the instructions, and calculating the next fetch address are performed in a predecoder prior to placing the instructions in the cache.
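A toy Python predecoder sketch follows; the per-line instruction count and the length_of helper are assumptions used only to show why the next fetch address must be computed and stored with the line.

    # Sketch only: walk a fixed number of variable-length instructions and
    # record where the next fetch should begin.
    INSTRS_PER_LINE = 4

    def predecode_line(fetch_addr, length_of):
        # length_of(addr) returns the byte length of the instruction at addr.
        instr_addrs, addr = [], fetch_addr
        for _ in range(INSTRS_PER_LINE):
            instr_addrs.append(addr)
            addr += length_of(addr)                 # step over a variable-length instruction
        # The span occupied by the instructions is only known after this walk,
        # so the next fetch address is stored alongside the cache line.
        return {"instr_addrs": instr_addrs, "next_fetch_addr": addr}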
Abstract:
A pipelined processor comprises an instruction cache (iCache), a branch target address cache (BTAC), and processing stages, including a stage to fetch from the iCache and the BTAC. To compensate for the number of cycles needed to fetch a branch target address from the BTAC, the fetch from the BTAC leads the fetch of a branch instruction from the iCache by an amount related to the cycles needed to fetch from the BTAC. Disclosed examples either decrement a write address of the BTAC or increment a fetch address of the BTAC, by an amount essentially corresponding to one less than the cycles needed for a BTAC fetch.
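The address skew can be expressed as simple arithmetic; the Python sketch below assumes one cache line is fetched per cycle, so the skew of (cycles - 1) is applied in units of the line size. These assumptions are illustrative, not taken from the embodiments.

    # Sketch only: skew the BTAC access relative to the iCache access so that
    # the branch target address arrives in time.
    def btac_write_addr(branch_addr, btac_cycles, line_size):
        return branch_addr - (btac_cycles - 1) * line_size        # option 1: decrement the write address

    def btac_fetch_addr(icache_fetch_addr, btac_cycles, line_size):
        return icache_fetch_addr + (btac_cycles - 1) * line_size  # option 2: increment the fetch address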
Abstract:
A processor includes a conditional branch instruction prediction mechanism that generates weighted branch prediction values. For weakly weighted predictions, which tend to be less accurate than strongly weighted predictions, the power associated with speculatively filling and subsequently flushing the cache is saved by halting instruction prefetching. Instruction fetching continues when the branch condition is evaluated in the pipeline and the actual next address is known. Alternatively, prefetching may continue out of a cache. To avoid displacing good cache data with instructions prefetched based on a mispredicted branch, prefetching may be halted in response to a weakly weighted prediction in the event of a cache miss.
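The gating decision reduces to a small policy function, sketched in Python below; the strength encoding and the hit/miss input are illustrative names rather than the disclosed signal set.

    # Sketch only: strongly weighted predictions always prefetch; weakly
    # weighted predictions prefetch only when the cache can satisfy them.
    def should_prefetch(prediction_strength, icache_hit):
        if prediction_strength == "strong":
            return True            # likely correct, so the prefetch power is well spent
        return icache_hit          # weak prediction: halt prefetching on a cache miss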