摘要:
In a microprocessor, an apparatus and method for performing memory functions and issuing bus cycles. Special microinstructions are stored in microcode ROM. These microinstructions are used to perform the memory functions and to generate the special bus cycles. Initially, an address corresponding to a requested operation to be performed is generated for one of these special microinstructions. That special microinstruction, along with its address, is then transmitted over the bus to the various units of the microprocessor. When each of the units receives the microinstruction, it determines whether that microinstruction is to be ignored based on the address. If a particular unit ignores the microinstruction, the microinstruction is forwarded to subsequent units in the pipeline for processing. Otherwise, if that particular unit performs the requested operation as specified by the microinstruction's address.
摘要:
A method and apparatus for dispatching load operations in a computer system. The present invention includes a method and apparatus for determining when the load operation is ready for dispatched to memory. The load operation is then scheduled to dispatch from memory and then dispatched to memory. In the present invention, a load is determined ready when it is no longer blocked, such that there is no condition which produces a resource or address dependency causing the load to be blocked.
摘要:
The present invention provides for executing store instructions with a processor. The present invention executes each of the store instructions by producing the data that is to be stored and by calculating the destination address to which the data is to be stored. In the present invention, the store instructions are executed to produce the destination address of the store instruction earlier than the prior art.
摘要:
A method and apparatus for performing load operations in a computer system. The present invention includes a method and apparatus for dispatching the load operation to be executed. The present invention halts the execution of the load operation when a dependency exists between the load operation and another memory operation currently pending in the system. When the dependency no longer exists, the present invention redispatches the load operation so that it completes.
摘要:
A data cache and a plurality of companion fill buffers having corresponding tag matching circuitry are provided to a computer system. Each fill buffer independently stores and tracks a replacement cache line being filled with data returning from main memory in response to a cache miss. When the cache fill is completed, the replacement cache line is output for the cache tag and data arrays of the data cache if the memory locations are cacheable and the cache line has not been snoop hit while the cache fill was in progress. Additionally, the fill buffers are organized and provided with sufficient address and data ports as well as selectors to allow the fill buffers to respond to subsequent processor loads and stores, and external snoops that hit their cache lines while the cache fills are in progress. As a result, the cache tag and data arrays of the data cache can continue to serve subsequent processor loads and stores, and external snoops, while one or more cache fills are in progress, without ever having to stall the processor.
摘要:
A computer system having a mechanism for maintaining processor ordering during out-of-order instruction execution is disclosed wherein load memory instructions are accessed according to program order and executed out-of-order in relation to the program order where appropriate. Processors in the system snoop an external bus for bus transactions that conflict with completed load memory instructions before committing results of the completed load memory instructions to an architectural state.
摘要:
A data cache and a plurality of companion fill buffers having corresponding tag matching circuitry are provided to a computer system. Each fill buffer independently stores and tracks a replacement cache line being filled with data returning from main memory in response to a cache miss. When the cache fill is completed, the replacement cache line is output for the cache tag and data arrays of the data cache if the memory locations are cacheable and the cache line has not been snoop hit while the cache fill was in progress. Additionally, the fill buffers are organized and provided with sufficient address and data ports as well as selectors to allow the fill buffers to respond to subsequent processor loads and stores, and external snoops that hit their cache lines while the cache fills are in progress. As a result, the cache tag and data arrays of the data cache can continue to serve subsequent processor loads and stores, and external snoops, while one or more cache fills are in progress, without ever having to stall the processor.
摘要:
An apparatus for maintaining processor ordering in a multi-processor computer system wherein loads are performed speculatively. Speculative loads of each processor are temporarily stored in their respective processors' load buffer. When one of the processors performs a store, a snoop operation is performed on the other processors' load buffers. If the snoop results in a hit, a determination is made as to whether that load buffer contains any prior conflicting speculative loads which have been completed. If the load buffer does contain a prior conflicting load, a processor ordering violation signal is generated. In response to this signal, the violating load and all subsequent operations are canceled and re-executed at a later time.
摘要:
A macro instruction is provided for a microprocessor which allows a programmer to specify a base value, index, scale factor and displacement value for calculating an effective address and returning that result in a single clock cycle. The macro instruction is converted into a micro operation which is provided to the single-cycle execution unit with the required source operands for performing the calculation. Within the single-cycle execution unit, the index and scale factor are provided to a left shifter for multiplying the two values. The result of the left shift operation is added to the sum of the base and displacement. This results in the effective address which is then returned from the single-cycle execution unit to a predetermined destination. This provides for the calculation of an effective address in a single cycle pipeline execution unit that is independent of the memory system execution units.
摘要:
A method and apparatus for performing operations with a processor in a computer system. Load operations are performed by use of a dispatch pipeline and a memory execution pipeline. The dispatch pipeline dispatches the load operation for execution by the processor, while the memory execution pipeline controls the execution of the load operation to memory. The present invention reduces the latency involved in executing a load operation by coupling the execution of the two pipelines during execution of the load operation.