摘要:
A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by the set of instructions, to reorder the issuance of the set of instructions from the instruction processor. The mapped register operand fields are associated with the corresponding instructions of the reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
摘要:
A pipelined processor includes an instruction box including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by said set of instructions, to reorder the issuance of said set of instructions from said instruction processor. The mapped register operand fields are associated with the corresponding instructions of said reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
摘要:
A pipelined processor includes an instruction unit including a register mapper, to map register operand fields of a set of instructions and an instruction scheduler, fed by the set of instructions, to reorder the issuance of the set of instructions from the processor. The mapped register operand fields are associated with the corresponding instructions of the reordered set of instructions prior to issuance of the instructions. The processor further includes a branch prediction table which maps a stored pattern of past histories associated with a branch instruction to a more likely prediction direction of the branch instruction. The processor further includes a memory reference tagging store associated with the instruction scheduler so that the scheduler can reorder memory reference instructions without knowing the actual memory location addressed by the memory reference instruction.
摘要:
A compiler groups instructions into sets. The sets of instructions are related by data and control dependencies which are unresolvable by the compiler. Sets of instructions having unresolved dependencies are executed in a speculative state of the computer system under the assumption that an exception condition will not occur. However, if an exception condition does occur while executing a set of instructions in the speculative state, that exception condition is detected and the set of instructions is re-executed in a real state of the computer system to resolve the exception condition.
摘要:
The invention expands the period of data stabilization between state devices to be 1.5 times the cycle time minus the clock skew. The invention requires that a clock signal (hereinafter "forwarded clock") be sent with data to the receiving subsystem. Such data is received by a capture latch, which is operated by special logic that receives the forwarded clock, and then proceeds to an ordinary state device in the receiving subsystem that is running synchronously with the receiving subsystem. This state device nominally captures the data 1.5 cycles after it was sent from a state device in the sending subsystem.
摘要:
The invention is directed to a cache memory system in a data processor including a virtual cache memory, a physical cache memory, a virtual to physical translation buffer, a physical to virtual backmap, an Old-PA pointer and a lockout register. The backmap implements invalidates by clearing the valid flags in virtual cache memory. The Old-PA pointer indicates the backmap entry to be invalidated after a reference misses in the virtual cache. The physical address for data written to virtual cache memory is entered to Old-PA pointer by the translation buffer. The lockout register arrests all references to data which may have synonyms in virtual cache memory. The backmap is also used to invalidate any synonyms.
摘要:
Methods and apparatus relating to a hardware move elimination and/or next page prefetching are described. In some embodiments, a logic may provide hardware move eliminations based on stored data. In an embodiment, a next page prefetcher is disclosed. Other embodiments are also described and claimed.
摘要:
A processor includes a cache memory with a data storage unit operating at a first clock frequency, and a tag unit and hit/miss logic operating at a second clock frequency different than the first clock frequency. The data storage unit may advantageously be clocked faster than the tag unit and hit/miss logic, such as two times (2×) faster. The processor may also include a replay mechanism for recovering from data speculation when the hit/miss logic or the tag unit signals that speculated data from the higher clocked data storage unit is, in fact, invalid.
摘要:
The present invention provides a method and apparatus for controlling a processing priority assigned alternately to a first thread and a second thread in a multithreaded processor to prevent deadlock and livelock problems between the first thread and the second thread. In one embodiment, the processing priority is initially assigned to the first thread for a first duration. It is then determined whether the first duration has expired in a given processing cycle. If the first duration has expired, the processing priority is assigned to the second thread for a second duration.
摘要:
A technique is provided for breaking a stalled condition or livelock in a processor having a replay queue. A livelock or stalled condition is detected. One or more instructions are temporarily stored in a replay queue. A release or break in the livelock or stalled condition is detected, and the instructions are then unloaded from the replay queue for replay or re-execution. For a multi-threaded processor, a stall is detected in one of the threads. Instructions of the stalled thread are temporarily stored in a replay queue, except the oldest instruction of the stalled thread which is allowed to replay or re-execute. This allows other threads to have access to execution and replay resources. Eventually, the oldest instruction will execute and retire, which breaks or releases the stalled thread. The instructions stored in the replay queue are then unloaded from the replay queue.