摘要:
A write-back cache memory and method for maintaining coherency within a write-back cache memory are disclosed. The write-back cache memory includes a number of cache lines for storing data associated with addresses within an associated memory. Each of the cache lines comprises multiple byte sets. The write-back cache memory also includes coherency indicia for identifying each byte set among the multiple byte sets within a cache line which contains data that differs from data stored in corresponding addresses within the associated memory. The write-back cache memory further includes cache control logic, which, upon replacement of a particular cache line within the write-back cache memory, writes only identified byte sets to the associated memory, such that memory accesses and bus utilization are minimized.
摘要:
A method for speculatively performing store instructions in a parallel processing computer system, the computer system including a completion buffer unit, includes comparing statuses between a first store instruction and at least one second instruction in the completion buffer unit, the at least one second instruction scheduled for completion before the first store instruction, and speculatively completing the first store instruction before the at least one second instruction when the statuses of the first store instruction do not conflict with the at least one second instruction. In another method aspect, speculatively performing store instructions includes forming a general purpose register (GPR) allocation deallocation table, the table including status fields for a plurality of instructions in a completion buffer unit, comparing the status fields of each of the plurality of instructions to a store instruction of the plurality of instructions, and speculatively completing the store instruction when the status fields for the store instruction do not conflict with the status fields for the plurality of instructions.
摘要:
A method and system are disclosed for managing the deallocation of a rename buffer allocated to an update instruction within a processor. The processor has a number of rename buffers for temporarily storing information associated with instructions executed by the processor, a number of registers, and a memory. According to the present invention, an update instruction is dispatched to the processor for execution. A particular rename buffer is then allocated to the update instruction. An effective address is generated for the update instruction, wherein the effective address specifies an address within the memory to be accessed by the update instruction. Next, the effective address is stored within the particular rename buffer. Prior to completion of the access to the effective address within memory, the effective address is transferred from the particular rename buffer to a particular one of the number of registers. Thereafter, the particular rename buffer is deallocated, wherein processor performance is enhanced by improved rename buffer availability.
摘要:
A method for implementing a four-way least recently used cache line replacement scheme in a four-way cache memory is disclosed. The cache memory includes multiple cache lines, and each cache line includes four congruence sets. In accordance with the present disclosure, a 5-bit Least Recently Used (LRU) field is associated with each of the cache lines within the cache memory. For a particular cache line, a set number of a least recently used set among the four congruence sets is stored in any two bits of the LRU field associated with that cache line. Next, a set number of the second least recently used set among the four congruence sets is stored in another two bits of the same LRU field associated with the same cache line. Finally, a last bit of the 5-bit LRU field is set to a specific state in response to a determination of which one of the remaining two sets is the second most recently used set.
摘要:
A method and system for enhanced system management operations in a superscalar data processing system. Those supervisory level instructions which execute selected privileged operations within protected memory space are first identified as not requiring a full context synchronization. Each time execution of such an instruction is initiated an enable special access (ESA) instruction is executed as an entry point to that instruction or group of instructions. A portion of the machine state register for the data processing system is stored and the machine state register is then modified as follows: a problem bit is set, changing the execution privilege state to "supervisor;" external interrupts are disabled; and access privilege state bit is set; and, a special access mode bit is set, allowing execution of special instructions. The instructions which execute the selected privileged operations within the protected memory space are then executed. A disable special access (DSA) instruction is then executed which restores the bits within the machine state register which were modified during the ESA instruction. The ESA and DSA instructions are implemented without modifying the instruction stream by utilizing user level procedure calls, thereby reducing the overhead of the branch table necessary to determine the desired execution path.
摘要:
A method for reducing dispatch stalls includes tracking allocation and deallocation of real rename buffers for instructions dispatched by a dispatch unit, and providing at least one virtual rename buffer for allocation of an instruction when the real rename buffers have been allocated. The method further includes tagging the instruction allocated to the at least one virtual rename buffer with a rename buffer busy signal, wherein the rename buffer busy signal indicates to an execution unit that the instruction cannot be completed. An efficient system for utilization of rename buffers in a superscalar processor includes a plurality of rename buffers, a dispatch unit coupled to the plurality of rename buffers, and an allocation/deallocation table coupled to the dispatch unit and the plurality of rename buffers. Further, the table includes a plurality of real rename buffer slots and at least one virtual rename buffer slot. Additionally, a rename busy signal is provided for an instruction allocated to the at least one virtual rename buffer slot.
摘要:
A method and device of executing a load multiple instruction in a superscaler microprocessor is provided. The method comprises the steps of dispatching a load multiple instruction to a load/store unit, wherein the load/store unit begins execution of a dispatched load multiple instruction, and wherein the load multiple instruction loads data from memory into a plurality of registers. The method further includes the step of maintaining a table that lists each register of the plurality of registers and that indicates when data has been loaded into each register by the executing load multiple instruction. The method concludes by executing an instruction that is dependent upon source operand data loaded by the load multiple instruction into a register of the plurality of registers indicated by the instruction as a source register, prior to the load multiple instruction completing its execution, when the table indicates the source operand data has been loaded into the source register. Also, according to the present invention, a method of executing a store multiple instruction in a superscaler microprocessor is provided. This method comprises the steps of dispatching a store multiple instruction to a load/store unit, whereupon the load/store unit begins executing the store multiple instruction, wherein the load store instruction stores data from a plurality of registers to memory; and executing a fixed point instruction that is dependent upon data being stored by the store multiple instruction from a register of the plurality of registers indicated by the fixed point instruction as a source register, prior to the store multiple instruction completing its execution, but prohibiting the executing fixed point instruction from writing to a register of the plurality of registers prior to the store multiple instruction completing.
摘要:
A method and apparatus for executing instructions within a processor which completes instructions according to a program order are disclosed. The processor has multiple rename buffers for temporarily storing results of instructions, a number of registers, and an execution unit. The execution unit has a reservation data structure comprising a plurality of entries for storing instructions to be executed by the execution unit and a single operand buffer for storing one or more operands of a single instruction. According to the present invention, an instruction is received at the execution unit. The instruction is then stored within the reservation data structure within the execution unit in association with information specifying a source of an operand of the instruction. Sources of operands of instructions include the rename buffers and the registers. A determination is then made if the instruction is a next instruction to be executed by the execution unit. In response to a determination that the instruction is the next instruction to be executed by the execution unit, the operand of the instruction is loaded from the specified source into the single operand entry. The instruction is then executed within the execution unit utilizing the operand within the single operand entry. By utilizing only a single operand buffer, processor chip area allocated to operand storage within the execution unit is reduced.