Abstract:
A superscalar microprocessor is provided with a reorder buffer for storing the speculative state of the microprocessor and a register file for storing the real state of the microprocessor. A flags register stores the real state of flags that are updated by flag modifying instructions which are executed by the functional units of the microprocessor. To enhance the performance of the microprocessor with respect to conditional branching instructions, the reorder buffer includes a flag storage area for storing flags that are updated by flag modifying instructions. The flags are renamed to make possible the earlier execution of branch instructions which depend on flag modifying instructions. If a flag is not yet determined, then a flag tag is associated with the flag storage area in place of that flag until the actual flag value is determined. A flag operand bus and a flag tag bus are provided between the flag storage area and the branching functional unit so that the requested flag or flag tags are provided to instructions which are executed in the branching functional unit.
Abstract:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
Abstract:
A superscalar microprocessor is provided with a reorder buffer for storing the speculative state of the microprocessor and a register file for storing the real state of the microprocessor. A flags register stores the real state of flags that are updated by flag modifying instructions which are executed by the functional units of the microprocessor. To enhance the performance of the microprocessor with respect to conditional branching instructions, the reorder buffer includes a flag storage area for storing flags that are updated by flag modifying instructions. The flags are renamed to make possible the earlier execution of branch instructions which depend on flag modifying instructions. If a flag is not yet determined, then a flag tag is associated with the flag storage area in place of that flag until the actual flag value is determined. A flag operand bus and a flag tag bus are provided between the flag storage area and the branching functional unit so that the requested flag or flag tags are provided to instructions which are executed in the branching functional unit.
Abstract:
In a microprocessor system, a program counter circuit generates a program counter value that represents a retrieved instruction and that includes a more significant portion, a less significant portion, and a carry signal for use in determining a next program counter value. An execute program counter circuit generates an execute program counter value from the less significant program counter value and from the carry signal. The execute program counter value represents a program counter value of an executed instruction.
Abstract:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
Abstract:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
Abstract:
In a processor (110) that performs multiple instructions in a single cycle, predicts outcomes of branch conditions and speculatively executes instructions based on the branch predictions, a method and apparatus for operating a data stack utilize a remap array (674) to support a stack exchange capability. The remap array is used to correlate a stack pointer (672) to data elements (700) within the stack. A lookahead stack pointer (502) and remap array (504) are updated to preserve the processor's state of operation while speculative instructions are executed.
Abstract:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
Abstract:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
Abstract:
A processor core for supporting the concurrent execution of mixed integer and floating point operations includes integer functional units (110) utilizing 32-bit operand data and a floating point functional unit (22) utilizing up to 82-bit operand data. Eight operand busses (30, 31) connect to the functional units to furnish operand data, and five result busses (32) are connected to the functional units to return results. The width of the operand busses is 41 bits, which is sufficient to communicate either integer or floating point data. This is done using an instruction decoder (18) to apportion a floating point operation which operates on 82-bit floating point operand data into multiple suboperations each associated with a 41-bit suboperand. The operand busses and result busses have an expanded data-handling dimension from the standard integer data width of 32 bits to 41 bits for handling the floating point operands. The floating point functional unit recombines the suboperand data into 82-bits for execution of the floating point operation, and partitions the 82-bit result for output to the result busses. In addition, the excess capacity of the result busses during integer transfers is used to communicate integer flags.