摘要:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
摘要:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
摘要:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
摘要:
A superscalar microprocessor is provided with a reorder buffer for storing the speculative state of the microprocessor and a register file for storing the real state of the microprocessor. A flags register stores the real state of flags that are updated by flag modifying instructions which are executed by the functional units of the microprocessor. To enhance the performance of the microprocessor with respect to conditional branching instructions, the reorder buffer includes a flag storage area for storing flags that are updated by flag modifying instructions. The flags are renamed to make possible the earlier execution of branch instructions which depend on flag modifying instructions. If a flag is not yet determined, then a flag tag is associated with the flag storage area in place of that flag until the actual flag value is determined. A flag operand bus and a flag tag bus are provided between the flag storage area and the branching functional unit so that the requested flag or flag tags are provided to instructions which are executed in the branching functional unit.
摘要:
A superscalar microprocessor is provided with a reorder buffer for storing the speculative state of the microprocessor and a register file for storing the real state of the microprocessor. A flags register stores the real state of flags that are updated by flag modifying instructions which are executed by the functional units of the microprocessor. To enhance the performance of the microprocessor with respect to conditional branching instructions, the reorder buffer includes a flag storage area for storing flags that are updated by flag modifying instructions. The flags are renamed to make possible the earlier execution of branch instructions which depend on flag modifying instructions. If a flag is not yet determined, then a flag tag is associated with the flag storage area in place of that flag until the actual flag value is determined. A flag operand bus and a flag tag bus are provided between the flag storage area and the branching functional unit so that the requested flag or flag tags are provided to instructions which are executed in the branching functional unit.
摘要:
In a microprocessor system, a program counter circuit generates a program counter value that represents a retrieved instruction and that includes a more significant portion, a less significant portion, and a carry signal for use in determining a next program counter value. An execute program counter circuit generates an execute program counter value from the less significant program counter value and from the carry signal. The execute program counter value represents a program counter value of an executed instruction.
摘要:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
摘要:
A processor which includes a fetch program counter circuit and an execute program counter circuit is disclosed. The fetch program counter circuit provides less significant program counter value bits in addition to a fetch program counter value. The execute program counter circuit generates an execute program counter value using the less significant program counter value bits. The execute program counter circuit receives a plurality of less significant program counter bit values and selects a single less significant program counter bit value thus generating execute program counter values in a multiple pipeline processor.
摘要:
A method and apparatus are disclosed for implementing early release of speculatively read data in a hardware transactional memory system. A processing core comprises a hardware transactional memory system configured to receive an early release indication for a specified word of a group of words in a read set of an active transaction. The early release indication comprises a request to remove the specified word from the read set. In response to the early release request, the processing core removes the group of words from the read set only after determining that no word in the group other than the specified word has been speculatively read during the active transaction.
摘要:
A method and apparatus for compiling software written to be executed on a microprocessor that supports at least one hardware transactional memory function is provided. A compiler that supports at least one software transactional memory function is adapted to include a runtime system that maps between the at least one software transactional memory function and the at least one hardware transactional memory instruction.