摘要:
Mechanisms, in a data processing system, are provided for tracking effective addresses through a processor pipeline of the data processing system. The mechanisms comprise logic for fetching an instruction from an instruction cache and associating, by an effective address table logic in the data processing system, an entry in an effective address table (EAT) data structure with the fetched instruction. The mechanisms further comprise logic for associating an effective address tag (eatag) with the fetched instruction, the eatag comprising a base eatag that points to the entry in the EAT and an eatag offset. Moreover, the mechanisms comprise logic for processing the instruction through the processor pipeline by processing the eatag.
摘要:
Mechanisms, in a data processing system, are provided for tracking effective addresses through a processor pipeline of the data processing system. The mechanisms comprise logic for fetching an instruction from an instruction cache and associating, by an effective address table logic in the data processing system, an entry in an effective address table (EAT) data structure with the fetched instruction. The mechanisms further comprise logic for associating an effective address tag (eatag) with the fetched instruction, the eatag comprising a base eatag that points to the entry in the EAT and an eatag offset. Moreover, the mechanisms comprise logic for processing the instruction through the processor pipeline by processing the eatag.
摘要:
Mechanisms, in a processor, are provided for detecting and handling short forward branch conversion candidates. The mechanisms identify a conditional branch in the computer code and determine if the short forward conditional branch is to be converted to a non-branching conditional sequence of instructions. Moreover, the mechanisms convert the conditional branch to a non-branching conditional sequence of instructions comprising a resolve instruction and one or more conditional instructions dependent on the resolve instruction. In addition, the mechanisms execute the non-branching conditional sequence of instructions in place of the conditional branch in the computer code and generate an output of the computer code based on the execution of the non-branching conditional sequence of instructions.
摘要:
A method and apparatus for dynamically managing instruction buffer depths for non-predicted branches reduces wasted energy and resources associated with low confidence branch prediction conditions. A portion of the instruction buffer for a instruction thread is allocated for storing predicted branch instruction streams and another portion, which may be zero-sized during high prediction confidence conditions, is allocated to the non-predicted branch instruction stream. The size of the buffers is adjusted dynamically in conformity with an on-going prediction confidence that provides a measure of how well branch prediction mechanisms are working for a given instruction thread. An alternate instruction fetch address table can be maintained and multiplexed with the main fetch address register for addressing the instruction cache, so that the instruction stream can be quickly shifted to the non-predicted path when a branch instruction is resolved to the non-predicted path.
摘要:
A method and apparatus for dynamically managing instruction buffer depths for non-predicted branches reduces wasted energy and resources associated with low confidence branch prediction conditions. A portion of the instruction buffer for a instruction thread is allocated for storing predicted branch instruction streams and another portion, which may be zero-sized during high prediction confidence conditions, is allocated to the non-predicted branch instruction stream. The size of the buffers is adjusted dynamically in conformity with an on-going prediction confidence that provides a measure of how well branch prediction mechanisms are working for a given instruction thread. An alternate instruction fetch address table can be maintained and multiplexed with the main fetch address register for addressing the instruction cache, so that the instruction stream can be quickly shifted to the non-predicted path when a branch instruction is resolved to the non-predicted path.
摘要:
Mechanisms are provided for partial flush handling with multiple branches per instruction group. The instruction fetch unit sorts instructions into groups. A group may include a floating branch instruction and a boundary branch instruction. For each group of instructions, the instruction sequencing unit creates an entry in a global completion table (GCT), which may also be referred to herein as a group completion table. The instruction sequencing unit uses the GCT to manage completion of instructions within each outstanding group. Because each group may include up to two branches, the instruction sequencing unit may dispatch instructions beyond the first branch, i.e. the floating branch. Therefore, if the floating branch results in a misprediction, the processor performs a partial flush of that group, as well as a flush of every group younger than that group.
摘要:
A mechanism is provided for thread completion arbitration. The mechanism comprises executing more than two threads of instructions simultaneously in the processor, selecting a first thread from a first subset of threads, in the more than two threads, for completion of execution within the processor, and selecting a second thread from a second subset of threads, in the more than two threads, for completion of execution within the processor. The mechanism further comprises completing execution of the first and second threads by committing results of the execution of the first and second threads to a storage device associated with the processor. At least one of the first subset of threads or the second subset of threads comprise two or more threads from the more than two threads. The first subset of threads and second subset of threads have different threads from one another.
摘要:
A mechanism is provided for thread completion arbitration. The mechanism comprises executing more than two threads of instructions simultaneously in the processor, selecting a first thread from a first subset of threads, in the more than two threads, for completion of execution within the processor, and selecting a second thread from a second subset of threads, in the more than two threads, for completion of execution within the processor. The mechanism further comprises completing execution of the first and second threads by committing results of the execution of the first and second threads to a storage device associated with the processor. At least one of the first subset of threads or the second subset of threads comprise two or more threads from the more than two threads. The first subset of threads and second subset of threads have different threads from one another.
摘要:
Mechanisms are provided for partial flush handling with multiple branches per instruction group. The instruction fetch unit sorts instructions into groups. A group may include a floating branch instruction and a boundary branch instruction. For each group of instructions, the instruction sequencing unit creates an entry in a global completion table (GCT), which may also be referred to herein as a group completion table. The instruction sequencing unit uses the GCT to manage completion of instructions within each outstanding group. Because each group may include up to two branches, the instruction sequencing unit may dispatch instructions beyond the first branch, i.e. the floating branch. Therefore, if the floating branch results in a misprediction, the processor performs a partial flush of that group, as well as a flush of every group younger than that group.
摘要:
A processor includes an execution unit and instruction sequencing logic that fetches instructions for execution. The instruction sequencing logic includes a branch target address cache having a branch target buffer containing a plurality of entries each associating at least a portion of a branch instruction address with a predicted branch target address. The branch target address cache accesses the branch target buffer using a branch instruction address to obtain a predicted branch target address for use as an instruction fetch address. The branch target address cache also includes a filter buffer that buffers one or more candidate branch target address predictions. The filter buffer associates a respective confidence indication indicative of predictive accuracy with each candidate branch target address prediction. The branch target address cache promotes candidate branch target address predictions from the filter buffer to the branch target buffer based upon their respective confidence indications.