摘要:
A pointer is for pointing to a next-to-read location within a stack of information. For pushing information onto the stack: a value is saved of the pointer, which points to a first location within the stack as being the next-to-read location; the pointer is updated so that it points to a second location within the stack as being the next-to-read location; and the information is written for storage at the second location. For popping the information from the stack: in response to the pointer, the information is read from the second location as the next-to-read location; and the pointer is restored to equal the saved value so that it points to the first location as being the next-to-read location.
摘要:
A pointer is for pointing to a next-to-read location within a stack of information. For pushing information onto the stack: a value is saved of the pointer, which points to a first location within the stack as being the next-to-read location; the pointer is updated so that it points to a second location within the stack as being the next-to-read location; and the information is written for storage at the second location. For popping the information from the stack: in response to the pointer, the information is read from the second location as the next-to-read location; and the pointer is restored to equal the saved value so that it points to the first location as being the next-to-read location.
摘要:
A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.
摘要:
A technique for reducing store-hit-loads in an out-of-order processor includes storing a store address of a store instruction associated with a store-hit-load (SHL) pipeline flush in an SHL entry. In response to detecting another SHL pipeline flush for the store address, a current count associated with the SHL entry is updated. In response to the current count associated with the SHL entry reaching a first terminal count, a dependency for the store instruction is created such that execution of a younger load instruction with a load address that overlaps the store address stalls until the store instruction executes.
摘要:
A method, data processing system, and computer program product for obtaining information about instructions. Instructions are processed. In response to processing a branch instruction in the instructions, a determination is made as to whether a result from processing the branch instruction follows a prediction of whether a branch is predicted to occur for the branch instruction. In response to the result following the prediction, the branch instruction is added to a current segment in a trace. In response to an absence of the result following the prediction, the branch instruction is added to the current segment in the trace and a first new segment and a second new segment are created. The first new segment includes a first branch instruction reached in the instructions from following the prediction. The second new segment includes a second branch instruction in the instructions reached from not following the prediction.
摘要:
The invention relates to a method and apparatus for controlling the instruction flow in a computer system and more particularly to the predicting of outcome of branch instructions using branch prediction arrays, such as BHTs. In an embodiment, the invention allows concurrent BHT read and write accesses without the need for a multi-ported BHT design, while still providing comparable performance to that of a multi-ported BHT design.
摘要:
Hardware-assisted program tracing is facilitated by a processor that includes a root instruction address register, a program trace signature computation unit and a call signature register. When a program instruction having an address matching the root instruction address register is executed, a program trace signature is captured in the call signature register and capture of branch history is commenced. By accumulating different values of the call signature register, for example in response to an interrupt generated when the root instruction is executed, software that performs program tracing can obtain signatures of all of the multiple execution paths that lead to the root instruction, which is also specified by software in order to set different root instructions for program tracing. In an alternative implementation, a storage for multiple call signatures is provided in the processor and read at once by the software.
摘要:
In at least one embodiment, a processor includes at least one execution unit and instruction sequencing logic that fetches instructions for execution by the execution unit. The instruction sequencing logic includes branch logic that outputs predicted branch target addresses for use as instruction fetch addresses. The branch logic includes a branch target address cache (BTAC) having at least one direct entry providing storage for a direct branch target address prediction associating a first instruction fetch address with a branch target address to be used as a second instruction fetch address immediately after the first instruction fetch address and at least one indirect entry providing storage for an indirect branch target address prediction associating a third instruction fetch address with a branch target address to be used as a fourth instruction fetch address subsequent to both the third instruction fetch address and an intervening fifth instruction fetch address.
摘要:
A system and method for optimizing the branch logic of a processor to improve handling of hard to predict indirect branches are provided. The system and method leverage the observation that there will generally be only one move to the count register (mtctr) instruction that will be executed while a branch on count register (bcctr) instruction has been fetched and not executed. With the mechanisms of the illustrative embodiments, fetch logic detects that it has encountered a bcctr instruction that is hard to predict and, in response to this detection, blocks the target fetch from entering the instruction buffer of the processor. At this point, the fetch logic has fetched all the instructions up to and including the bcctr instruction but no target instructions. When the next mtctr instruction is executed, the branch logic of the processor grabs the data and starts fetching using that target address. Since there are no other target instructions that were fetched, no flush is needed if that target address is the correct address, i.e. the branch prediction is correct.
摘要:
A method and system for permitting single cycle instruction dispatch in a superscalar processor system which dispatches multiple instructions simultaneously to a group of execution units for execution and placement of results thereof within specified general purpose registers. Each instruction generally includes at least one source operand and one destination operand. A plurality of intermediate storage buffers are provided and each time an instruction is dispatched to an available execution unit, a particular one of the intermediate storage buffers is assigned to any destination operand within the dispatched instruction, permitting the instruction to be dispatched within a single cycle by eliminating any requirement for determining and selecting the specified general purpose register or a designated alternate general purpose register.