摘要:
A decoder that includes a micro-alias register to store information from a micro-operation for use by later micro-operations in the micro-operation flow. The decoder includes one or more XLAT PLAs that produces PLA control micro-operations ("Cuops"), a microcode sequencing unit that produces microcode Cuops, and an aliasing mechanism that extracts fields and stores them in macro-alias registers. A multiplexer is provided to select the appropriate Cuop to be stored in a Cuop register. Multiple Cuops may issue each cycle. A multiplexer is coupled to select one of the Cuops and to store predetermined fields in the micro-alias register for use by subsequent Cuops. Micro-alias data and macro-alias data can be utilized simultaneously with a Cuop to form an Auop.
摘要:
A state recovery and restart method that simplifies assist handling. The recovery and restart method also handles micro-branch mispredictions. An assist sequence is executed in microcode to assist an error-causing macroinstruction. If data is required from an error-causing macroinstruction, it is fetched, decoded, and macro-alias registers are restored with macro-alias data. To recover the state of the micro-alias registers, micro-alias data from a micro-operation of the flow may be loaded into the micro-alias register. Subsequently, control returns to the Micro-operation Sequence (MS) unit to issue further error correction Control micro-operations (Cuops). In order to simplify restart, the Cuops originating from the error-causing macroinstruction supplied by the translate programmable logic arrays (XLAT PLAs) are loaded into the Cuop registers, with their valid bits unasserted. If microcode requests a restart beginning at one of the Cuops stored in the Cuop register, then the bits for that Cuop and subsequent Cuops are marked valid. Thus, the instruction can be restarted anywhere within the microcode sequence.
摘要:
Decoding circuitry and a method supplying an immediate field that is issued from a decoder. A macroinstruction is supplied to the decoding circuit, which generates a first micro-operation that includes a first aliasing field and a first immediate field. The first aliasing field indicates the source of the micro-operation that will eventually be issued from the decoder. If the source is the first immediate field, then the alias field is further examined to determine the interpretation to be placed upon the data. The data may be interpreted literally, or as an address into a constant ROM, thereby providing an ability to output wide, 32-bit immediate data from a narrower, 9-bit input addresses. Additional sources for immediate data include macro-alias registers, macro-branch information, and micro-branch information.
摘要:
A method and apparatus for decoding a macroinstruction, the macroinstruction including an operational code (opcode) and a specification of an operand, is described. The method includes two primary steps, which are performed either serially or in parallel. When performed serially, the steps may be performed in any order. The first primary step involves the generation of a first micro-instruction, specifying a first micro-operation, the first micro-instruction being derived from the specification of the operand of the macroinstruction. The second primary step involves the generation of a second micro-instruction, specifying a second micro-operation, the second micro-instruction being derived from the opcode of macroinstruction. The specification of the operand may specify the operand to be either a memory operand or a register operand in a manner that necessitates data processing or manipulation prior to a memory access or to execution of the second micro-instruction. More specifically, the specification of the operand in the macroinstruction may require alignment of operands retrieved from registers, prior to execution of the second instruction. In this case, the first micro-instruction may require a shift operation after retrieval of the operands.
摘要:
A microprocessor including memory storage into which ISA customer code routines can be stored after having been decoded into their machine-native microinstructions. The customer code store is not subject to eviction and the like, as a cache memory would be. ISA level code can specify a routine for storage into the customer code store, at a time prior to its execution. The customer code store thus serves as a write-once execute-many library of pre-decoded routines which ISA level applications can subsequently use, permitting a system manufacturer to create a highly customized and optimized system.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a replay queue coupled to the checker for temporarily storing one or more instructions for replay. The replay queue may be used to store a long latency instruction, such as a load in which data must be retrieved from an external memory device. The long latency instruction and possibly one or more dependent instruction are stored in the replay queue until the long latency instruction is ready to be executed (e.g., data for the load instruction has been retrieved from external memory). Once the long latency instruction is ready to be executed, (e.g., the data is available), the long latency instruction may then be unloaded from the replay queue for re-execution.
摘要:
Method, apparatus, and system for a programmable event driven yield mechanism that may activate other threads. The yield mechanism may allow triggering of a service thread that may execute currently with a main thread upon occurrence of an architecturally-defined condition. The service thread may be activated, in response to the condition, with limited intervention of an operating system. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect an architecturally-defined condition. The apparatus may include an event handler to handle a yield event generated when the architecturally-defined condition has been detected. An architectural mechanism, including processor instructions and channel registers, may be utilized to allow user-level code to enable the yield event mechanism. Other embodiments are also described and claimed.
摘要:
A method for maintaining an instruction in a pipelined processor using inuse fields. The method involves receiving a read request for an instruction, sending the instruction in response to the read request and setting an inuse field associated with the instruction to inuse. Alternate embodiments of the method involve transmitting the instruction in response to the read request, receiving a notification of instruction retirement and resetting the inuse field in the ITLB. The method can also be used in the ICACHE in which inuse fields are associated with each instruction stored in the ICACHE. Other embodiments of the method can be used concurrently in the ITLB and the ICACHE as a resource tracking mechanism to maintain resources.
摘要:
Instructions are fetched and issued by an instruction fetch and issue circuit with the instructions' sizes in program order. An allocate circuit allocates reservation station entries in a reservation station circuit, and reorder buffer entries in a reorder circuit, for the issued instructions in order, storing the instructions' sizes in the allocated reorder buffer entries. The reservation and dispatch circuit dispatches the issued instructions to the execution circuits for execution when they are ready. The execution circuits store the result data including target addresses of branch instructions into the corresponding reorder buffer entries. During each retirement operation, a retire circuit reads the instruction sizes and the target addresses for a predetermined number of issued instructions from their allocated reorder buffer entries. The retire circuit determines two or more speculative next instruction pointers for each of the issued instructions, factoring into consideration whether the issued instructions are branch instructions or not, and their relative positions to each other. Each of the speculative next instruction pointers indicates what the next instruction pointer for the processor should be for retiring a particular combination of the result data values of the issued instructions under consideration. The retire circuit conditionally updates the next instruction pointer with one of the speculative next instruction pointers, depending on how many, if any, of the instructions can actually retire, and whether any of the actually retiring instructions are branch instructions.
摘要:
A processor is provided that includes an execution unit for executing instructions and a replay system for replaying instructions which have not executed properly. The replay system is coupled to the execution unit and includes a checker for determining whether each instruction has executed properly and a plurality of replay queues or replay queue sections coupled to the checker for temporarily storing one or more instructions for replay. In one embodiment, thread-specific replay queue sections may each be used to store a long latency instruction for each thread until the long latency instruction is ready to be executed (e.g., data for a load instruction has been retrieved from external memory). By storing the long latency instruction and its dependents in a replay queue section for one thread which has stalled, execution resources are made available for improving the speed of execution of other threads which have not stalled.