摘要:
A method and apparatus for dynamically predicting the outcome and the tar address of a multiple-target branch instruction, where the multiple-target branch instruction contains at least two potential target addresses, not including the fall through address. In addition, this method and apparatus can also be used to predict multiple single-target branches simultaneously. The apparatus stores information indicating the outcome of previous executions and predictions of the multiple-target branch instruction in a branch prediction table. In addition, multiple target addresses (at least two) are associated with the multiple-target branch instruction. Using the information indicating the outcome of the previous execution of the multiple-target branch instruction, the apparatus predicts the outcome of a next execution of the multiple-target branch instruction, and predicts which, if any, of the target addresses associated with the multiple-target branch instruction, will be taken.
摘要:
A branch prediction system is described for use within a microprocessor having an instruction cache capable of storing two or more instructions per cache line. Each entry of a branch prediction table (BPT) includes a value identifying whether at least one other instruction within a common cache line contains a branch. The value is referred to herein as a multiple-B bit value. The multiple-B bit value is examined by branch prediction logic while one branch prediction is being performed to determine whether a second branch prediction can be initiated for another branch within the same cache line. In one implementation, the multiple-B bit of one BPT entry is examined following a hit. A branch prediction for the entry generating a hit is initiated. Simultaneously, the BPT is reaccessed to search for an entry corresponding to another instruction within the same cache line if the multiple-B bit for the first entry was set. If the second entry is found, a secondary branch prediction is initiated. Eventually, the first branch prediction is output. If the first branch prediction is Not Taken, then the second branch prediction is output during the next clock cycle. If the first branch prediction is Taken, then the second branch prediction may be aborted as it is not needed. Method and apparatus embodiments of the invention are described.
摘要:
A processor and method that reduces instruction fetch penalty in the execution of a program sequence of instructions comprises a branch predict instruction that is inserted into the program at a location which precedes the branch. The branch predict instruction has an opcode that specifies a branch as likely to be taken or not taken, and which also specifies a target address of the branch. A block of target instructions, starting at the target address, is prefetched into the instruction cache of the processor so that the instructions are available for execution prior to the point in the program where the branch is encountered. Also specified by the opcode is an indication of the size of the block of target instructions, and a trace vector of a path in the program sequence that leads to the target from the branch predict instruction for better utilization of limited memory bandwidth.
摘要:
Allocation circuitry for allocating entries within a set-associative cache memory is disclosed. The set-associative cache memory comprises N ways, each way having M entries and corresponding entries in each of the N ways constituting a set of entries. The allocation circuitry has a first circuit which identifies related data units by identifying a probability that the related data units may be successively read from the cache memory. A second circuit within the allocation circuitry allocates the corresponding entries in each of the ways to the related data units, so that related data units are stored in a common set of entries. Accordingly, the related data units will be simultaneously outputted from the set-associative cache memory, and are thus concurrently available for processing. The invention may find application in allocating entries of a common set in a branch prediction table (BPT) to branch prediction information for related branch instructions.
摘要:
A loop branch prediction system is provided to predict a final iteration of a loop and resteer an associated fetch module to an appropriate target address. The loop prediction system includes a counter and an end of loop (EOL) module. In one mode, the counter tracks loop branches in process. When a termination condition is detected, the counter switches to a second mode to track the number of loop branches still to be issued. The EOL module compares the number of loop branches still to be issued with one or more threshold values and generates a resteer signal when a match is detected.
摘要:
A branch operation is processed using a branch predict instruction and an associated branch instruction. The branch predict instruction indicates a predicted direction, a target address, and an instruction address for the associated branch instruction. When the branch predict instruction is detected, the target address is stored at an entry indicated by the associated branch instruction address and a prefetch request is triggered to the target address. The branch predict instruction may also include hint information for managing the storage and use of the branch prediction information.
摘要:
A processor architecture with an instruction set having a predict instruction, the predict instruction providing static prediction information and a statically predicted target address to the processor for a branch instruction. The processor decodes a predict instruction to obtain an associated pair of addresses comprising a predicted target address and a referenced instruction address, and fetches a predicted target instruction having an instruction address matching the predicted target address when a fetched and decoded branch instruction has an instruction address matching the referenced instruction address.
摘要:
A branch operation is processed using a branch predict instruction and an associated branch instruction. The branch predict instruction indicates a predicted direction, a target address, and an instruction address for the associated branch instruction. When the branch predict instruction is detected, the target address is stored at an entry indicated by the associated branch instruction address and a prefetch request is triggered to the target address. The branch predict instruction may also include hint information for managing the storage and use of the branch prediction information.
摘要:
A processor that includes an execution pipeline that executes a programmed flow of instructions is provided. The processor also includes an instruction pointer generator configured to generate an instruction pointer. Furthermore, the processor includes a branch prediction circuit configured to receive the instruction pointer. In response to the instruction pointer, the branch prediction circuit is configured to determine if an instruction corresponding to the instruction pointer includes a branch that is predicted taken and if so to provide to said execution pipeline a target instruction corresponding to said instruction. The branch prediction circuit provides to the execution pipeline at least one target instruction corresponding to the instruction corresponding to the instruction pointer.
摘要:
A method and apparatus for generating respective branch predictions for first and second branch instructions, both indexed by a first instruction pointer, is disclosed. The apparatus includes dynamic branch prediction circuitry for generating a branch prediction based on the outcome of previous branch resolution activity, as well as static branch prediction circuitry configured to generate a branch prediction based on static branch prediction information. Prediction output circuitry, coupled to the both the dynamic and static branch prediction circuitry, outputs the respective branch predictions for the first and second branch instructions in first and second clock cycles to an instruction buffer (or "rotator"). Specifically, the prediction output control circuitry outputs the branch prediction for the second branch instruction in the second clock cycle and in response to the initiation of a recycle stall during the first clock cycle. The receipt of the predictions and their corresponding branch instructions at the instruction buffer is thus synchronized, and the opportunity of generating a prediction for the second branch instruction is not lost.