摘要:
A method and circuitry for coordinating exceptions in a processor. The processor generates a result data value and an exception data value for each instruction wherein the exception data value specifies whether the corresponding instruction causes an exception. The processor commits the result data values to an architectural state of the processor in the sequential program order, and fetches an exception handler to processes the exception if the exception is indicated by one of the exception data values. The processor fetches an asynchronous event handler to processes an asynchronous event if the asynchronous event is detected while the result data values are committed to the architectural state of the processor.
摘要:
A method and apparatus for performing operations with a processor in a computer system. Load operations are performed by use of a dispatch pipeline and a memory execution pipeline. The dispatch pipeline dispatches the load operation for execution by the processor, while the memory execution pipeline controls the execution of the load operation to memory. The present invention reduces the latency involved in executing a load operation by coupling the execution of the two pipelines during execution of the load operation.
摘要:
A method and apparatus for performing store operations that includes calculating the address and obtaining the data for the store operation. The address represents the memory location to which the data is to be stored. Once the address is calculated and the data obtained, the store operation is committed to processor state. The store operation may be dispatched to memory to complete the execution of the store operation.
摘要:
A speculative execution out of order processor comprising a reorder circuit containing a plurality of physical registers that buffer speculative execution results for integer and floating-point operations, and a real register circuit containing a plurality of committed state registers that buffer committed execution results for either integer or floating-point operations, depending on the register. The reorder and real register circuits read the speculative and committed source data values for incoming micro-ops, and transfer the speculative and committed source data values over to a micro-op dispatch circuit over a common data path. A retire logic circuit commits the speculative execution results to an architectural state by transferring the speculative execution results from the reorder circuit to the real register circuit.
摘要:
Pipeline lengthening in functional units likely to be involved in a writeback conflict is implemented to avoid conflicts. Logic circuitry is provided for comparing the depths of two concurrently executing execution unit pipelines to determine if a conflict will develop. When it appears that two execution units will attempt to write back at the same time, the execution unit having a shorter pipeline will be instructed to add a stage to its pipeline, storing its result in a delaying buffer for one clock cycle. After the conflict has been resolved, the instruction to lengthen the pipeline of a given functional unit will be rescinded. Multistage execution units are designed to signal a reservation station to delay the dispatch of various instructions to avoid conflicts between execution units.
摘要:
An allocator assigns entries for a circular buffer. The allocator receives requests for storing data in entries of the circular buffer, and generates a head pointer to identify a starting entry in the circular buffer for which circular buffer entries are not allocated. In addition to pointing to an entry in the circular buffer, the head pointer includes a wrap bit. The allocator toggles the wrap bit each time the allocator traverses the linear queue of the circular buffer. A tail pointer is generated, including the wrap bit, to identify an ending entry in the circular buffer for which circular buffer entries are allocated. In response to the request for entries, the allocator sequentially assigns entries for the requests located between the head pointer and the tail pointer. The allocator has application for use in a microprocessor performing out-of-order dispatch and speculative execution. The allocator is coupled to a reorder buffer, configured as a circular buffer, to permit allocation of entries. The allocator utilizes an all or nothing allocation policy, such that either all or no incoming instructions are allocated during an allocation period.
摘要:
A non-blocking translation lookaside buffer is described for use in a microprocessor capable of processing speculative and out-of-order instructions. Upon the detection of a fault, either during a translation lookaside buffer hit or a page table walk performed in response to a translation lookaside buffer miss, information associated with the faulting instruction is stored within a fault register within the translation lookaside buffer. The stored information includes the linear address of the instruction and information identifying the age of instruction. In addition to storing the information within the fault register, a portion of the information is transmitted to a reordering buffer of the microprocessor for storage therein pending retirement of the faulting instruction. Prior to retirement of the faulting instruction, the translation lookaside buffer continues to process further instructions. Upon retirement of each instruction, the reordering buffer determines whether a fault had been detected for that instruction and, if so, the microprocessor is flushed. Then, a branch is taken into microcode. The microcode accesses the linear address and other information stored within the fault register of the translation lookaside buffer and handles the fault. The system is flushed and the microcode is executed only for faulting instructions which actually retire. As such, faults detected while processing speculative instructions based upon mispredicted branches do not prevent further address translations and do not cause the system to be flushed. Method and apparatus implementations are described herein.
摘要:
Instructions are fetched and issued by an instruction fetch and issue circuit with the instructions' sizes in program order. An allocate circuit allocates reservation station entries in a reservation station circuit, and reorder buffer entries in a reorder circuit, for the issued instructions in order, storing the instructions' sizes in the allocated reorder buffer entries. The reservation and dispatch circuit dispatches the issued instructions to the execution circuits for execution when they are ready. The execution circuits store the result data including target addresses of branch instructions into the corresponding reorder buffer entries. During each retirement operation, a retire circuit reads the instruction sizes and the target addresses for a predetermined number of issued instructions from their allocated reorder buffer entries. The retire circuit determines two or more speculative next instruction pointers for each of the issued instructions, factoring into consideration whether the issued instructions are branch instructions or not, and their relative positions to each other. Each of the speculative next instruction pointers indicates what the next instruction pointer for the processor should be for retiring a particular combination of the result data values of the issued instructions under consideration. The retire circuit conditionally updates the next instruction pointer with one of the speculative next instruction pointers, depending on how many, if any, of the instructions can actually retire, and whether any of the actually retiring instructions are branch instructions.
摘要:
In a pipelined processor, an apparatus for handling string operations. When a string operation is received by the processor, the length of the string as specified by the programmer is stored in a register. Next, an instruction sequencer issues an instruction that computes the register value minus a pre-determined number of iterations to be issued into the pipeline. Following the instruction, the pre-determined number of iterations are issued to the pipeline. When the instruction returns with the calculated number, the instruction sequencer then knows exactly how many iterations should be executed. Any extra iterations that had initially been issued are canceled by the execution unit, and additional iterations are issued as necessary. A loop counter in the instruction sequencer is used to track the number of iterations.
摘要:
A mechanism and method for renaming flags within a register alias table ("RAT") to increase processor parallelism and also providing and using flag masks associated with individual instructions. In order to reduce the amount of data dependencies between instructions that are concurrently processed, the flags used by these instructions are renamed. In general, a RAT unit provides register renaming to provide a larger physical register set than would ordinarily be available within a given macroarchitecture's logical register set (such as the Intel architecture or PowerPC or Alpha designs, for instance) to eliminate false data dependencies between instructions that reduce overall superscalar processing performance for the microprocessor. The renamed flag registers contain several flag bits and various flag bits may be updated or read by different instructions. Also, static and dynamic flag masks are associated with particular instructions and indicate which flags are capable of being updated by a particular instruction and also indicate which flags are actually updated by the instruction. Static flag masks are used in flag renaming and dynamic flag masks are used at retirement. The invention also discovers cases in which a flag register is required that is a superset of the previously renamed flag register portion.