摘要:
A system comprises a first node including data having an associated D-state and a second node operative to provide a source broadcast requesting the data. The first node is operative in response to the source broadcast to provide the data to the second node and transition the state associated with the data at the first node from the D-state to an O-state without concurrently updating memory. An S-state is associated with the data at the second node.
摘要:
A system for adaptively bypassing one or more higher cache levels following a miss in a lower level of a cache hierarchy is described. Each cache level preferably includes a tag store containing address and state information for each cache line resident in the respective cache. When an invalidate request is received at a given cache hierarchy, each cache level is searched for the address specified by the invalidate request. When an address match is detected, the state of the respective cache line is changed to the invalid state, although the address of the cache line is left in the tag store. Thereafter, if the processor or entity associated with this cache hierarchy issues its own request for this same cache line, the cache hierarchy begins searching the tag store of each level starting with the lowest cache level. Since the address of the invalidated cache line was left in the respective tag store, a match will be detected at one of the cache levels, although the corresponding state of this cache line is invalid. This condition is specifically detected and is considered to be an “inval_miss” occurrence. In response, to an inval_miss, the cache hierarchy calls off searching any higher levels, and instead, issues a memory reference request for the desired cache line. In a further embodiment, the entity that sourced an invalidate request is stored, and a subsequent memory reference request for the same cache line is sent directly to the source entity.
摘要:
A next line prediction mechanism for predicting a next instruction index to an instruction cache of a computer pipeline, has a latency equal to the cycle time of the instruction cache to maximize the instruction bandwidth out of the instruction cache. The instruction cache outputs a block of instructions with each fetch initiated by a next instruction index provided by the line prediction mechanism. The instructions of the block are processed in parallel for instruction decode and branch prediction to maintain a high rate of instruction flow through the pipeline.
摘要:
A register map having a free list of available physical locations in a register file, a log containing a sequential listing of logical registers changed during a predetermined number of cycles, a back-up map associating the logical registers with corresponding physical homes at a back-up point in a computer pipeline operation and a predicted map associating the logical registers with corresponding physical homes at a current point in the computer pipeline operation. A set of valid bits is associated with the maps to indicate whether a particular logical register is to be taken from the back-up map or the predicted map indication of a corresponding physical home. The valid bits can be "flash cleared" in a single cycle to back-up the computer pipeline to the back-up point during a trap event.
摘要:
A method and arrangement for producing a predicted subroutine return address in response to entry of a subroutine return instruction in a computer pipeline that has a ring pointer counter and a ring buffer coupled to the ring pointer counter. The ring pointer counter contains a ring pointer that is changed when either a subroutine call instruction or return instruction enters the computer pipeline. The ring buffer has buffer locations which store a value present at its input into the buffer location pointed to by the ring pointer when a subroutine call instruction enters the pipeline. The ring buffer provides a value from the buffer location pointed to by the ring pointer when a subroutine return instruction enters the computer pipeline, this provided value being the predicted subroutine return address.
摘要:
Multi-processor systems and methods are disclosed. One embodiment may comprise a multi-processor system including a processor that executes program instructions across at least one memory barrier. A request engine may provide an updated data fill corresponding to an invalid cache line. The invalid cache line may be associated with at least one executed load instruction. A load compare component may compare the invalid cache line to the updated data fill to evaluate the consistency of the at least one executed load instruction.
摘要:
A system comprises a first node that employs a source broadcast protocol to initiate a transaction. The first node employs a forward progress protocol to resolve the transaction if the source broadcast protocol cannot provide a deterministic resolution of the transaction.
摘要:
Multi-processor systems and methods are disclosed. One embodiment may comprise a multi-processor system comprising a processor having a processor pipeline that executes program instructions with data from a speculative fill that is provided in response to a source request, and a backup system that retains information associated with a previous processor execution state corresponding to an instruction associated with the speculative fill. The backup system may initiate a backup of the processor pipeline to the previous processor execution state if the speculative fill is determined to be non-coherent, and the processor pipeline may continue execution of program instructions if the speculative fill is determined to be coherent.
摘要:
Systems and methods are disclosed for interaction between different cache coherency protocols. One system may comprise a home node that receives a request for data from a first node in a first cache coherency protocol. A second node provides a conflict response to a request for the data from the home node. The conflict response indicates that an ordering point for the data is migrating according to a second cache coherency protocol, which is different from the first cache coherency protocol.
摘要:
Multi-processor systems and methods are disclosed. One embodiment may comprise a multi-processor system comprising at least one data fill provided to a source processor in response to a source request by the source processor, and a coherent signal generated by the multi-processor system that provides an indication of which data fill of the at least one data fill is a coherent data fill.