摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
摘要:
In a multi-threaded processor, thread priority variables are set up in memory. The actual assignment of thread priority is based on the expiration of a thread precedence counter. To further augment, the effectiveness of the thread precedence counters, starting counters are associated with each thread that serve as a multiplier for the value to be used in the thread precedence counter. The value in the starting counters are manipulated so as to prevent one thread from getting undue priority to the resources of the multi-threaded processor.
摘要:
A computer processor includes a multiplexer having a first input, a second input, a third input, and an output. The processor further includes a scheduler coupled to the multiplexer first input, an execution unit coupled to the multiplexer output, and a replay system that has an input coupled to the multiplexer output. The replay system includes a first checker coupled to the replay system input and the second multiplexer input, and a second checker coupled to the first checker and the third multiplexer input.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.
摘要:
Breaking replay dependency loops in a processor using a rescheduled replay queue. The processor comprises a replay queue to receive a plurality of instructions, and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and increments a counter for each of the plurality of instructions to reflect the number of times each of the plurality of instructions has been executed. The scheduler also dispatches each instruction to the execution unit either when the counter does not exceed a maximum number of replays or, if the counter exceeds the maximum number of replays, when the instruction is safe to execute. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
Rescheduling multiple micro-operations in a processor using a replay queue. The processor comprises a replay queue to receive a plurality of instructions and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and dispatches each instruction to the execution unit. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.