摘要:
A method and system of facilitating storage accesses within a multiprocessor system subsequent to a synchronization instruction by a local processor consists of determining if data for the storage accesses is cacheable and if there is a “hit” in a cache. If both conditions are met, the storage accesses return the data to the local processor. The storage accesses have an entry on an interrupt table which is used to discard the returned data if a snoop kills the line before the synchronization instruction completes. After the cache returns data, a return data bit is set in the interrupt table. A snoop killing the line sets a snooped bit in the interrupt table. Upon completion of the synchronization instruction, any entries in the interrupt table subsequent to the synchronization instruction that have the return data bit and snooped bit set are flushed. The flush occurs because the data returned to the local processor due to a “cacheable hit” subsequent to the synchronization instruction was out of order with the snoop and the processor must flush the data and go back out to the system bus for the new data.
摘要:
An apparatus and method for processing multiple cache misses to a single cache line in an information handling system which includes a miss queue for storing requests for data not located in a level one cache and a comparator for comparing requests for data stored in the miss queue to determine if there are multiple requests for data located in the same cache line of a level two cache. Each new request for data from the same cache line of the level two cache as an older original request for data in the miss queue is marked as a load hit reload. The requests marked as load hit reloads are then grouped together with the matching original request and forwarded together to the level two cache wherein the original request requests the data from level two cache. The load hit reload requests do not access level two cache but instead bypass access of level two cache by extracting data from the cache line outputted from level two cache for the matching original request. The present invention reduces the number of accesses to the level two cache and allows data requests to be satisfied in parallel versus serially when multiple successive level one cache misses occur.
摘要:
A data processor assigns a unique identifier to each instruction. As there are a finite number of unique identifiers, the identifiers are reused during execution of a program within the data processing system. To maintain an age relationship between instructions executing in the pipeline processor, a methodology is developed to ensure that reused identifiers are properly designated as being younger than their older but larger in magnitude, counterparts. To resolve this issue, assume that the identifier assigned to each instruction has N bits, and therefore, there are 2.sup.N identifiers to be assigned to instructions in the program. The 2.sup.N identifiers are separated into 2.sup.m banks. In addition to assigning identifiers to each instruction, an identifier assignment logic circuit within the pipeline processor provides a global signal that indicates which bank is a youngest bank from which the identifiers are assigned to a remaining portion of the pipeline processor. The global signal preconditions portions of the two identifiers being compared. Subsequently, a result of this conditioning is concatenated with a remaining portion of a selected identifier. The modification of the upper bits of the identifier maintains a relative age position for the identifiers and their associated instructions in the pipelined processor.
摘要:
An age-based arbitration scheme for enforcing data coherency in an information handling system is disclosed. As loads and stores access a cache, if a cache miss occurs, a miss request is generated and tagged with the cycle or age in which the miss is detected. If a castout is required, it is also tagged with the cycle in which the load or store access occurred, and the line being replaced or cast out is marked as being invalid in that level of hierarchy. The arbitration rules for the next level of memory hierarchy are defined such that all requests that are generated during a particular cycle are given priority over all of the requests generated during any subsequent cycle. As a result, if a load miss occurs for a cache line which is present in the castout buffer, the castout request tagged with an earlier age will be arbitrated into the next memory hierarchy level prior to the arrival of the newly generated miss requests. The age-based arbitration scheme can also be used for multiple cache accesses occurring in parallel.
摘要:
In an example, a synchronization signal can be transmitted to a plurality of synchronizers. The plurality of synchronizers can include a plurality of upstream synchronizers and a downstream synchronizer. Each synchronizer of the plurality of upstream synchronizers can be caused to count from a respective count value until a predetermined end count sequence value in response to receiving the synchronization signal. The respective count value stored at each synchronizer can be representative of a difference in time between a respective upstream synchronizer of the plurality of upstream synchronizers receiving the synchronization signal and the downstream synchronizer receiving the synchronization signal. A respective processing element of a plurality of processing elements can be caused to start a respective function or operation in response to a respective upstream synchronizer reaching the predetermined end count sequence value.
摘要:
A pointer is for pointing to a next-to-read location within a stack of information. For pushing information onto the stack: a value is saved of the pointer, which points to a first location within the stack as being the next-to-read location; the pointer is updated so that it points to a second location within the stack as being the next-to-read location; and the information is written for storage at the second location. For popping the information from the stack: in response to the pointer, the information is read from the second location as the next-to-read location; and the pointer is restored to equal the saved value so that it points to the first location as being the next-to-read location.
摘要:
Methods for storing branch information in an address table of a processor are disclosed. A processor of the disclosed embodiments may generally include an instruction fetch unit connected to an instruction cache, a branch execution unit, and an address table being connected to the instruction fetch unit and the branch execution unit. The address table may generally be adapted to store a plurality of entries with each entry of the address table being adapted to store a base address and a base instruction tag. In a further embodiment, the branch execution unit may be adapted to determine the address of a branch instruction having an instruction tag based on the base address and the base instruction tag of an entry of the address table associated with the instruction tag. In some embodiments, the address table may further be adapted to store branch information.
摘要:
In a branch instruction target address cache, an entry associated with a fetched block of instructions includes a target address of a branch instruction residing in the next sequential block of instructions. The entry will include a sequential address associated with the branch instruction and a prediction of whether the target address is taken or not taken.
摘要:
Methods for storing branch information in an address table of a processor are disclosed. A processor of the disclosed embodiments may generally include an instruction fetch unit connected to an instruction cache, a branch execution unit, and an address table being connected to the instruction fetch unit and the branch execution unit. The address table may generally be adapted to store a plurality of entries with each entry of the address table being adapted to store a base address and a base instruction tag. In a further embodiment, the branch execution unit may be adapted to determine the address of a branch instruction having an instruction tag based on the base address and the base instruction tag of an entry of the address table associated with the instruction tag. In some embodiments, the address table may further be adapted to store branch information.
摘要:
Method, system and computer program product for determining the targets of branches in a data processing system. A method for determining the target of a branch in a data processing system includes performing at least one pre-calculation relating to determining the target of the branch prior to writing the branch into a Level 1 (L1) cache to provide a pre-decoded branch, and then writing the pre-decoded branch into the L1 cache. By pre-calculating matters relating to the targets of branches before the branches are written into the L1 cache, for example, by re-encoding relative branches as absolute branches, a reduction in branch redirect delay can be achieved, thus providing a substantial improvement in overall processor performance.