摘要:
To support load instructions which execute out-of-order with respect to store instructions, a mechanism is implemented to detect (and correct) the occurrences where a load instruction executed prior to a logically prior store instruction, and where the load instruction received data for the location prior to being modified by the store instruction, and the correct data for the load instruction included bytes from the store instruction. Additionally, to execute store instructions out-of-order with respect to load instructions, a mechanism is implemented to keep a store instruction from destroying data that will be used by a logically earlier load instruction. Further, to support load instructions that are executed out-of-order with respect to each other, a mechanism is implemented to insure that any pair of load instructions (which access at least one byte in common) return data consistent with executing the load instructions in order.
摘要:
An apparatus and method for processing multiple cache misses to a single cache line in an information handling system which includes a miss queue for storing requests for data not located in a level one cache and a comparator for comparing requests for data stored in the miss queue to determine if there are multiple requests for data located in the same cache line of a level two cache. Each new request for data from the same cache line of the level two cache as an older original request for data in the miss queue is marked as a load hit reload. The requests marked as load hit reloads are then grouped together with the matching original request and forwarded together to the level two cache wherein the original request requests the data from level two cache. The load hit reload requests do not access level two cache but instead bypass access of level two cache by extracting data from the cache line outputted from level two cache for the matching original request. The present invention reduces the number of accesses to the level two cache and allows data requests to be satisfied in parallel versus serially when multiple successive level one cache misses occur.
摘要:
An age-based arbitration scheme for enforcing data coherency in an information handling system is disclosed. As loads and stores access a cache, if a cache miss occurs, a miss request is generated and tagged with the cycle or age in which the miss is detected. If a castout is required, it is also tagged with the cycle in which the load or store access occurred, and the line being replaced or cast out is marked as being invalid in that level of hierarchy. The arbitration rules for the next level of memory hierarchy are defined such that all requests that are generated during a particular cycle are given priority over all of the requests generated during any subsequent cycle. As a result, if a load miss occurs for a cache line which is present in the castout buffer, the castout request tagged with an earlier age will be arbitrated into the next memory hierarchy level prior to the arrival of the newly generated miss requests. The age-based arbitration scheme can also be used for multiple cache accesses occurring in parallel.
摘要:
A system and method for allowing operation of a storage array after a failure within a set of an n-way set associative cache includes determining that there is a failure in a bit line in the storage array, setting a flag to inhibit access to the portion of the array accessed by the failing entity and storing and retrieving data from remaining portions of the array. The present invention is well adapted for use with n-way set associative cache storage arrays.
摘要:
In a method and apparatus for allocating processor resources in a data processing system, instructions are dispatched and tagged for processing. A processor resource snoops to obtain execution results for the tagged instructions. Such an instruction is logically "finished" in response to determining that it will not cause an interrupt (which includes not changing the sequence of completing instructions), and "completed" in response to finishing all earlier dispatched instructions. Information is entered for such an instructions in rename buffer in response to the instruction targeting an architected register, and such a rename buffer entry is released in response to completing the entry's instruction. The rename buffer may comprise a history buffer. Also, information for the instructions is entered in a completion queue in response to dispatching the instructions, and the queue entry for such an instruction is released in response to completion of the instruction. Also, the instructions are grouped, a group having solely a single interruptible instruction, and further including non-interruptible instructions dispatched following the interruptible instruction. Thus, there may be numerous non-interruptible instructions in such a group. Such an interruptible instruction is logically "finished" in response to determining that it will not cause an interrupt, and "completed" in response to finishing all earlier dispatched instructions. Such a non-interruptible instruction is logically "finished" and "completed" in response to completion of its associated interruptible instruction, so that such a non-interruptible instruction may complete before it is dispatched.
摘要:
The present invention is directed to a method and apparatus for dispatching instructions in an information handling system. A pre-execution queue stores instructions, and at least one execution cluster is operably coupled to the pre-execution queue. An execution cluster comprises an early execution unit for executing a first instruction dispatched from the pre-execution queue to generate and forward a first result and a late execution unit for executing a second instruction dispatched from the pre-execution queue to generate and forward a second result after the first execution unit forwards the first result. The invention further includes circuitry operably associated with the pre-execution queue, and a method for prioritizing the order in which the instructions in the pre-execution queue are dispatched to the execution cluster.
摘要:
A method and apparatus for maintaining content of registers of a processor which uses the registers for processing instructions. Entries are stored in a buffer for restoring register content in response to an interruption by an interruptible instruction. Entries include information for reducing the number of entries selected for the restoring. A set of the buffer entries is selected, in response to the interruption and the information, for restoring register content. The set includes only entries which are necessary for restoring the content in response to the interruption so that the content of the processor registers may be restored in a single processor cycle, even if multiple entries are stored for a first one of the registers and multiple entries are stored for a second one of the registers.
摘要:
A method and system are disclosed for reducing run-time delay during conditional branch instruction execution in a pipelined processor system. A series of queued sequential instructions and conditional branch instructions are processed wherein each conditional branch instruction specifies an associated conditional branch to be taken in response to a selected outcome of processing one or more sequential instructions. Upon detection of a conditional branch instruction within the queue, a group of target instructions are fetched based upon a prediction that an associated conditional branch will be taken. Sequential instructions within the queue following the conditional branch instruction are then purged and the target instructions loaded into the queue only in response to a successful a retrieval of the target instructions, such that the sequential instructions may be processed without delay if the prediction that the conditional branch is taken proves invalid prior to retrieval of the target instructions. Alternately, the purged sequential instructions may be refetched after loading the target instructions such that the sequential instructions may be executed with minimal delay if the prediction that the conditional branch is taken proves invalid after loading the target instructions. In yet another embodiment, the sequential instructions within the queue following the conditional branch instruction are purged only in response to a successful retrieval of the target instructions and an imminent execution of the conditional branch instruction.
摘要:
A mechanism and method for a store data queue are implemented. Address translation operations for store instructions in a data processor are decoupled from data operations by initiating address translation before source data operands are available. A store data queue snoops the finish buses of execution units for the source operand. A data entry in the data queue that is allocated at dispatch of the store instruction snoops for the source operand required by its corresponding instruction. The data operand is then communicated to a memory device in instruction order thereby simplifying the detection of conflicting stores.
摘要:
In a superscalar processor implementing out-of-order dispatching and execution of load and store instructions, when a store instruction has already been translated, the load address range of a load instruction is contained within the address range of the store instruction, and the data associated with the store instruction is available, then the data associated with the store instruction is forwarded to the load instruction so that the load instruction may continue execution without having to be stalled or flushed.