摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
摘要:
A computer processor includes a multiplexer having a first input, a second input, and an output, and a scheduler coupled to the multiplexer first input. The processor further includes an execution unit coupled to the multiplexer output. The execution unit is adapted to receive a plurality of instructions from the multiplexer. The processor further includes a replay system coupled to the second multiplexer input and the scheduler. The replay system replays an instruction that has not correctly executed by sending a stop scheduler signal to the scheduler and sending the instruction to the multiplexer.
摘要:
A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit including an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
摘要:
A processor includes a memory execution unit for executing load and store instructions and a replay system for replaying instructions which have not executed properly. The memory execution unit including an invalid store flag that is set for a store instruction if the replay system detects that the store instruction has not executed properly and is cleared if the store instruction has executed properly. If an invalid store flag is set for a store instruction, the replay system replays load instructions which are programmatically younger than the invalid store instruction until the store instruction executes properly.
摘要:
A computer processor includes a multiplexer having a first input, a second input, a third input, and an output. The processor further includes a scheduler coupled to the multiplexer first input, an execution unit coupled to the multiplexer output, and a replay system that has an input coupled to the multiplexer output. The replay system includes a first checker coupled to the replay system input and the second multiplexer input, and a second checker coupled to the first checker and the third multiplexer input.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a replay queue coupled to the checker for temporarily storing one or more instructions for replay. The replay queue may be used to store a long latency instruction, such as a load in which data must be retrieved from an external memory device. The long latency instruction and possibly one or more dependent instruction are stored in the replay queue until the long latency instruction is ready to be executed (e.g., data for the load instruction has been retrieved from external memory). Once the long latency instruction is ready to be executed, (e.g., the data is available), the long latency instruction may then be unloaded from the replay queue for re-execution.
摘要:
Breaking replay dependency loops in a processor using a rescheduled replay queue. The processor comprises a replay queue to receive a plurality of instructions, and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and increments a counter for each of the plurality of instructions to reflect the number of times each of the plurality of instructions has been executed. The scheduler also dispatches each instruction to the execution unit either when the counter does not exceed a maximum number of replays or, if the counter exceeds the maximum number of replays, when the instruction is safe to execute. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
Rescheduling multiple micro-operations in a processor using a replay queue. The processor comprises a replay queue to receive a plurality of instructions and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and dispatches each instruction to the execution unit. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.