Abstract:
An apparatus and method are provided for improving microprocessor performance by improving the prediction accuracy of conditional branch instructions. A dynamic branch predictor speculatively updates global branch history information based on the prediction of a first branch instruction so that the predictor can predict the outcome of a second branch instruction following closely in the pipeline with the benefit of the first prediction. This improves the prediction accuracy where the first branch has not been resolved prior to the time when the second prediction is ready to be made. If the first prediction turns out to be incorrect, the global branch history is restored from a previously saved copy and updated with the first branch instruction's actual outcome.
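The speculative update and restore described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `GlobalHistory`, the branch identifiers, the history width, and the assumption that branches resolve in program order (so every checkpoint remaining at a misprediction belongs to a younger, flushed branch) are hypothetical and not taken from the abstract.

```python
class GlobalHistory:
    """Toy global branch history register with speculative update (hypothetical model)."""

    def __init__(self, bits=8):
        self.mask = (1 << bits) - 1
        self.value = 0
        # branch id -> history as it was before that branch's prediction was shifted in
        self.checkpoints = {}

    def _shift_in(self, history, taken):
        # Shift the newest outcome into the low bit and drop the oldest bit.
        return ((history << 1) | int(taken)) & self.mask

    def speculate(self, branch_id, predicted_taken):
        # Save a copy, then update the history with the prediction itself,
        # so a closely following branch sees this prediction.
        self.checkpoints[branch_id] = self.value
        self.value = self._shift_in(self.value, predicted_taken)

    def resolve(self, branch_id, predicted_taken, actual_taken):
        saved = self.checkpoints.pop(branch_id)
        if predicted_taken != actual_taken:
            # Misprediction: restore the saved copy and apply the actual outcome.
            self.value = self._shift_in(saved, actual_taken)
            # Assumption: in-order resolution, so remaining checkpoints are stale.
            self.checkpoints.clear()
```

In this model a second branch predicted after `speculate("b1", ...)` indexes the predictor with a history that already includes the first prediction, which is the benefit the abstract describes.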
Abstract:
An apparatus and method are provided for executing a compare-and-jump operation in a pipeline microprocessor. Typically, the compare-and-jump operation is specified by two micro instructions. The first micro instruction, an ALU micro instruction, directs the microprocessor to perform an ALU operation, resulting in an update of a flags register. The second micro instruction, a conditional jump micro instruction, directs the microprocessor to examine the flags register and to branch program control to a target address if a prescribed condition is met. The apparatus has a jump combiner that detects the ALU micro instruction and the conditional jump micro instruction in a micro instruction queue. The jump combiner indicates the prescribed condition for the conditional branch in a field of the ALU micro instruction, and then deletes the conditional jump micro instruction from the queue. The apparatus also has execution logic that performs the ALU operation, generates the result, and updates the flags register. The apparatus also has store logic that receives the generated result and examines the flags register as prescribed by the field of the single ALU micro instruction.
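The jump combiner's queue rewrite might be sketched as follows. The `MicroOp` record, its field names, and the requirement that the two micro instructions be adjacent in the queue are assumptions made for illustration; the abstract only says that both are detected in the queue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MicroOp:
    kind: str                     # "alu" or "jcc" (hypothetical encoding)
    op: str = ""                  # ALU operation name, e.g. "cmp"
    cond: Optional[str] = None    # condition field; filled in when a jump is combined
    target: Optional[int] = None  # jump target address


def combine_jumps(queue):
    """Fold each ALU micro-op that is directly followed by a conditional jump."""
    out = []
    i = 0
    while i < len(queue):
        cur = queue[i]
        nxt = queue[i + 1] if i + 1 < len(queue) else None
        if cur.kind == "alu" and nxt is not None and nxt.kind == "jcc":
            # Record the jump condition in a field of the ALU micro-op
            # and drop the conditional jump from the queue.
            out.append(MicroOp("alu", cur.op, cond=nxt.cond, target=nxt.target))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

For example, a queue holding a compare, a jump-if-zero, and an add would shrink to two entries, with the compare carrying the condition and the target.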
Abstract:
A microprocessor instruction translator translates a conditional load instruction into at least two microinstructions. An out-of-order execution pipeline executes the microinstructions. To execute a first microinstruction, an execution unit receives source operands from the source registers of a register file and responsively generates a first result using the source operands. To execute a second microinstruction, an execution unit receives a previous value of the destination register and the first result and responsively reads data from a memory location specified by the first result and provides a second result that is the data if a condition is satisfied and that is the previous destination register value if not. The previous value of the destination register comprises a result produced by execution of a microinstruction that is the most recent in-order previous writer of the destination register with respect to the second microinstruction.
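A minimal behavioral sketch of the two-microinstruction split is shown below. The function names, the base-plus-offset address form, and the dictionary models of the register file and memory are hypothetical; the abstract does not specify how the first result is computed from the source operands.

```python
def uop_address(base, offset):
    # First microinstruction: combine the source operands into the first result.
    # Assumption: the first result is a base-plus-offset address.
    return base + offset


def uop_conditional_load(memory, address, previous_dest, condition_met):
    # Second microinstruction: read memory at the first result, then select
    # either the loaded data or the previous destination register value.
    data = memory.get(address, 0)
    return data if condition_met else previous_dest


def execute_conditional_load(regs, memory, dest, base, offset, condition_met):
    address = uop_address(regs[base], regs[offset])
    regs[dest] = uop_conditional_load(memory, address, regs[dest], condition_met)
```

Because the second microinstruction always produces a value for the destination register, it writes the register in both cases, which is why it needs the previous value as a source operand.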
Abstract:
An out-of-order execution microprocessor executes an architectural segment register-loading instruction that instructs the microprocessor to load a new value into an architectural segment register of the microprocessor. A comparator compares the new value specified by the architectural segment register-loading instruction with the current contents of the architectural segment register. Whenever the comparator indicates that the new value does not equal the current contents, a control unit causes re-execution, using the new value, of all instructions in the microprocessor that used the current architectural segment register contents as a source operand and that are newer in program order than the architectural segment register-loading instruction. An instruction scheduler retrieves the current contents and issues for execution instructions that use the retrieved current contents, even though the instructions are newer in program order than the register-loading instruction and the register-loading instruction has not yet written the new value to the architectural segment register.
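The compare-and-replay decision can be reduced to a few lines. This is a sketch only: `resolve_segment_load` and the list of instruction tags are hypothetical stand-ins for the comparator and control unit.

```python
def resolve_segment_load(current_value, new_value, speculative_users):
    """Return (committed_value, instructions_to_replay).

    speculative_users: newer instructions that were issued early using the
    current (old) segment register contents as a source operand.
    """
    if new_value != current_value:
        # The speculation was wrong: every early user must re-execute.
        return new_value, list(speculative_users)
    # The value did not change, so the early results are already correct.
    return new_value, []
```

The speculation pays off in the case where a segment register is reloaded with the value it already holds, since no instruction has to wait or replay.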
Abstract:
An architectural instruction instructs a microprocessor to perform an operation on first and second source operands to generate a result and to write the result to a destination register only if architectural condition flags satisfy a condition specified in the architectural instruction. A hardware instruction translator translates the architectural instruction into first and second microinstructions. To execute the first microinstruction, an execution pipeline performs the operation on the source operands to generate the result, determines whether the architectural condition flags satisfy the condition, and updates a non-architectural indicator to indicate whether the architectural condition flags satisfy the condition. To execute the second microinstruction, if the non-architectural indicator updated by the first microinstruction indicates the architectural condition flags satisfy the condition, the execution pipeline updates the destination register with the result; otherwise, it updates the destination register with the current value of the destination register.
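The division of work between the two microinstructions might look like this in a behavioral model. The function names, the flags dictionary, and the idea that the result and indicator are simply passed from one function to the next (rather than held in a temporary register) are illustrative assumptions.

```python
import operator


def first_uop(op, a, b, flags, condition):
    # Perform the operation unconditionally and record, in a
    # non-architectural indicator, whether the flags satisfy the condition.
    result = op(a, b)
    indicator = bool(condition(flags))
    return result, indicator


def second_uop(result, indicator, current_dest):
    # Always write the destination: the new result if the indicator is set,
    # otherwise the value the destination already holds.
    return result if indicator else current_dest


def conditional_alu(regs, dest, src1, src2, op, flags, condition):
    result, indicator = first_uop(op, regs[src1], regs[src2], flags, condition)
    regs[dest] = second_uop(result, indicator, regs[dest])
```

A conditional add that executes only when the zero flag is set would pass `operator.add` as `op` and a predicate such as `lambda f: f["Z"]` as `condition`.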
Abstract:
A microprocessor processes a macroinstruction that instructs the microprocessor to write an 8-bit result into only a lower 8 bits of an N-bit architected general purpose register. An instruction translator translates the macroinstruction into a merge microinstruction that specifies an N-bit first source register, an 8-bit second source register, and an N-bit destination register to receive an N-bit result. The N-bit first source register and the N-bit destination register are the N-bit architected general purpose register. An execution unit receives the merge microinstruction and responsively generates the N-bit result to be subsequently written to the N-bit architected general purpose register even though the macroinstruction only instructs the microprocessor to write the 8-bit result into the lower 8 bits of the N-bit architected general purpose register. Specifically, the execution unit directs the 8-bit result into the lower 8 bits of the N-bit result and directs the upper N-8 bits of the N-bit first source register into corresponding upper N-8 bits of the N-bit result.
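The merge itself is a mask-and-combine operation, which can be sketched as follows. The function name `merge_low8` and the default register width of 32 bits are assumptions chosen for the example.

```python
def merge_low8(first_source, result8, n_bits=32):
    """Build the N-bit result of a merge microinstruction (illustrative model)."""
    full_mask = (1 << n_bits) - 1
    # Keep the upper N-8 bits of the N-bit first source register.
    upper = first_source & full_mask & ~0xFF
    # Place the 8-bit result in the lower 8 bits.
    return upper | (result8 & 0xFF)
```

With a 32-bit register holding `0x12345678` and an 8-bit result of `0xAB`, the merged value written back is `0x123456AB`, so the full register is written in one operation.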
Abstract:
A microprocessor includes a plurality of execution units configured to receive instructions and operands thereof and to execute the instructions. An instruction scheduler issues the instructions to the execution units and selects sources of the instruction operands. At least one of the execution units detects that one of the operands of one of the instructions is a denormal operand, generates an indication that the instruction needs to be replayed in response to detecting the denormal operand, and provides the denormal operand to the instruction scheduler in response to detecting the denormal operand, rather than normalizing the denormal operand. The instruction scheduler normalizes the denormal operand, in response to the indication, and causes the normalized operand, rather than the denormal operand, to be provided to the execution unit when the instruction is replayed.
Abstract:
A microprocessor improves out-of-order superscalar execution unit utilization with a relatively small in-order instruction retirement buffer. A plurality of execution units each calculate an instruction result. The instruction is either an excepting type instruction or a non-excepting type instruction. The excepting type instruction is capable of causing the microprocessor to take an exception after being issued to the execution unit, whereas the non-excepting type instruction is incapable of causing the microprocessor to take an exception after being issued. A retire unit makes a determination that an instruction is the oldest instruction in the microprocessor and that the instruction is ready to update the architectural state of the microprocessor with its result. The retire unit makes the determination before the execution unit outputs the result of the non-excepting type instruction, whereas the retire unit makes the determination after the execution unit outputs the result of the excepting type instruction.
Abstract:
An apparatus and method for providing early instruction results are disclosed. Early execution logic, comprising an enhanced address generator located in an address generation stage of the microprocessor pipeline, receives input operands and generates early results of instructions reaching the address stage prior to final execution units (in lower pipeline stages) generating final results of the instruction for updating an architected register file. The early execution logic is configured to execute only a subset of the instructions in the microprocessor instruction set. The early results are invalid if the instruction is not in the subset. An early register file corresponding to the architected register file stores the early results and also provides the early results to the early execution logic as input operands. The generated early results are invalid if any input operands are invalid. Early status flags accumulated from the early results enable selective early execution of conditional instructions.
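The validity rules for early results can be modeled with a register file that stores a value together with a valid bit. This is a sketch under assumptions: the subset `{"add", "sub"}`, the two-operand instruction form, and the tuple encoding are hypothetical, and the early status flags are left out.

```python
EARLY_SUBSET = {"add", "sub"}  # hypothetical subset handled by the early logic


def early_execute(op, dest, srcs, early_regs):
    """Compute an early result; early_regs maps register name -> (value, valid)."""
    inputs_valid = all(early_regs[s][1] for s in srcs)
    if op in EARLY_SUBSET and inputs_valid:
        a, b = (early_regs[s][0] for s in srcs)
        result = a + b if op == "add" else a - b
        early_regs[dest] = (result, True)
    else:
        # Instruction outside the subset, or at least one invalid input:
        # the early result is marked invalid.
        early_regs[dest] = (None, False)
    return early_regs[dest]
```

Invalidity propagates in this model: once a register receives an invalid early result, any later instruction that reads it also produces an invalid early result until the register is written with a valid one.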