摘要:
An apparatus for enforcing that selected instructions are executed in a correct order, comprising a first content addressable memory for storing load addresses of data read from the memory by the selected instructions. The first content addressable memory comparing the store addresses with the load addresses of data to be written to the memory. The first content addressable memory generating a first signal, if one of the load addresses is identical to a subsequently compared one of the store addresses. The apparatus further including a second content addressable memory for storing and comparing states of the data read and written by the selected instructions. The second content addressable memory generating a second signal, if one of the stored states is identical to one of said compared states. The stored states including a program counter to repeat the execution of the selected instructions upon detecting the first and second signals.
摘要:
A compiler groups instructions into sets. The sets of instructions are related by data and control dependencies which are unresolvable by the compiler. Sets of instructions having unresolved dependencies are executed in a speculative state of the computer system under the assumption that an exception condition will not occur. However, if an exception condition does occur while executing a set of instructions in the speculative state, that exception condition is detected and the set of instructions is re-executed in a real state of the computer system to resolve the exception condition.
摘要:
A mechanism for executing computer instructions in parallel includes a compiler for generating and grouping instructions into a plurality of sets of instructions to be executed in parallel, each set having a unique identification. A computer system having a real state and a speculative state executes the sets in parallel, the computer system executing a particular set of instructions in the speculative state if the instructions of the particular set have dependencies which can not be resolved until the instructions are actually executed. The computer system generates speculative data while executing instructions in the speculative state. Logic circuits are provided to detect any exception conditions which occur while executing the particular set in the speculative state. If the particular set is subject to an exception condition, the instructions of the set are re-executed to resolve the exception condition, and to incorporate the speculative data in the real state of the computer system.
摘要:
There is provided a mechanism for propagating exception conditions in a computer system when instructions are subject to exception conditions. The apparatus includes a set of data registers for storing data manipulated by the instructions of the computer system, and a set of state registers for storing speculative states of data manipulated by the instructions, there being one state register associated with each data register. Furthermore, the apparatus includes a logic circuit, coupled to the set of state registers, for propagating the states from a source one of the state registers to a destination one of the state registers, if data stored in an associated source one of the data registers are used as a source for an associated destination one of data registers, and if data stored in the source data register were manipulated by a particular instruction subject to an exception condition.
摘要:
Breaking replay dependency loops in a processor using a rescheduled replay queue. The processor comprises a replay queue to receive a plurality of instructions, and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and increments a counter for each of the plurality of instructions to reflect the number of times each of the plurality of instructions has been executed. The scheduler also dispatches each instruction to the execution unit either when the counter does not exceed a maximum number of replays or, if the counter exceeds the maximum number of replays, when the instruction is safe to execute. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
Rescheduling multiple micro-operations in a processor using a replay queue. The processor comprises a replay queue to receive a plurality of instructions and an execution unit to execute the plurality of instructions. A scheduler is coupled between the replay queue and the execution unit. The scheduler speculatively schedules instructions for execution and dispatches each instruction to the execution unit. A checker is coupled to the execution unit to determine whether each instruction has executed successfully. The checker is also coupled to the replay queue to communicate to the replay queue each instruction that has not executed successfully.
摘要:
A mechanism for tracking the control flow of instructions in an application and performing one or more optimizations of a processing device, based on the control flow of the instructions in the application, is disclosed. Control flow data is generated to indicate the control flow of blocks of instructions in the application. The control flow data may include annotations that indicate whether optimizations may be performed for different blocks of instructions. The control flow data may also be used to track the execution of the instructions to determine whether an instruction in a block of instructions is assigned to a thread, a process, and/or an execution core of a processor, and to determine whether errors have occurred during the execution of the instructions.
摘要:
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.
摘要:
Methods and apparatus relating to mechanisms to handle free physical register identifiers for SMT (Simultaneous Multi-Threading) out-of-order processors are described. In some embodiments, a physical register file stores both speculative data and architectural data corresponding to a plurality of registers. A free list logic may maintain free physical register identifiers corresponding to the plurality of registers. An instruction may read the architectural data from the physical register file at dispatch. Other embodiments are also described and claimed.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.