摘要:
A multiprocessor data processing system requires careful management to maintain cache coherency. Conventional systems using a MESI approach sacrifice some performance with inefficient lock-acquisition and lock-retention techniques. The disclosed system provides additional cache states, indicator bits, and lock-acquisition routines to improve cache performance. In particular, as multiple processors compete for the same cache line, a significant amount of processor time is lost determining if another processor's cache line lock has been released and attempting to reserve that cache line while it is still owned by the other processor. The preferred embodiment provides an indicator bit with the cache store command which specifically indicates whether the store also acts as a lock-release.
摘要:
A novel cache coherency protocol provides a modified-unsolicited (Mu) cache state to indicate that a value held in a cache line has been modified (i.e., is not currently consistent with system memory), but was modified by another processing unit, not by the processing unit associated with the cache that currently contains the value in the Mu state, and that the value is held exclusive of any other horizontally adjacent caches. Because the value is exclusively held, it may be modified in that cache without the necessity of issuing a bus transaction to other horizontal caches in the memory hierarchy. The Mu state may be applied as a result of a snoop response to a read request. The read request can include a flag to indicate that the requesting cache is capable of utilizing the Mu state. Alternatively, a flag may be provided with intervention data to indicate that the requesting cache should utilize the modified-unsolicited state.
摘要:
A novel cache coherency protocol provides a modified-unsolicited (MU) cache state to indicate that a value held in a cache line has been modified (i.e., is not currently consistent with system memory), but was modified by another processing unit, not by the processing unit associated with the cache that currently contains the value in the MU state, and that the value is held exclusive of any other horizontally adjacent caches. Because the value is exclusively held, it may be modified in that cache without the necessity of issuing a bus transaction to other horizontal caches in the memory hierarchy. The MU state may be applied as a result of a snoop response to a read request. The read request can include a flag to indicate that the requesting cache is capable of utilizing the MU state. Alternatively, a flag may be provided with intervention data to indicate that the requesting cache should utilize the modified-unsolicited state.
摘要:
A technique for maintaining input/output (I/O) command ordering on a bus includes assigning a channel identifier to I/O commands of an I/O stream. In this case, the channel identifier indicates the I/O commands belong to the I/O stream. A command location indicator is assigned to each of the I/O commands. The command location indicator provides an indication of which one of the I/O commands is a start command in the I/O stream and which of the I/O commands are continue commands in the I/O stream. The I/O commands are issued in a desired completion order. When a first one of the I/O commands does not complete successfully, the I/O commands in the I/O stream are reissued on the bus starting at the first one of the I/O commands that did not complete successfully.
摘要:
A computer implemented method, a processor chip, a data processing system, and computer program product in a data processing system process information in a store cache of a data processing system. The store cache receives a first entry that includes a first address indicating a first segment of a cache line. The store cache then receives a second entry including a second address indicating a second segment of the cache line. Responsive to the first segment not being equal to the second segment, the first entry is chained to the second entry.
摘要:
An information handling system (IHS) includes a processor with a cache memory system. The processor includes a processor core with an L1 cache memory that couples to an L2 cache memory. The processor includes an arbitration mechanism that controls load and store requests to the L2 cache memory. The arbitration mechanism includes control logic that enables a load request to interrupt a store request that the L2 cache memory is currently servicing. When the L2 cache memory finishes servicing the interrupting load request, the L2 cache memory may return to servicing the interrupted store request at the point of interruption. The control logic determines the size requirement of each load operation or store operation. When the cache memory system performs a store operation or load operation, the memory system accesses the portion of a cache line it needs to perform the operation instead of accessing an entire cache line.
摘要:
An information handling system (IHS) includes a processor with a cache memory system. The processor includes a processor core with an L1 cache memory that couples to an L2 cache memory. The processor includes an arbitration mechanism that controls load and store requests to the L2 cache memory. The arbitration mechanism includes control logic that enables a load request to interrupt a store request that the L2 cache memory is currently servicing. The L2 cache memory includes dual data banks so that one bank may perform a load operation while the other bank performs a store operation. The cache system provides dual dispatch points into the data flow to the dual cache banks of the L2 cache memory.
摘要:
A system and method for tracking core load requests and providing arbitration and ordering of requests. When a core interface unit (CIU) receives a load operation from the processor core, a new entry in allocated in a queue of the CIU. In response to allocating the new entry in the queue, the CIU detects contention between the load request and another memory access request. In response to detecting contention, the load request may be suspended until the contention is resolved. Received load requests may be stored in the queue and tracked using a least recently used (LRU) mechanism. The load request may then be processed when the load request resides in a least recently used entry in the load request queue. CIU may also suspend issuing an instruction unless a read claim (RC) machine is available. In another embodiment, CIU may issue stored load requests in a specific priority order.
摘要:
A cache memory logically associates a cache line with at least two cache sectors of a cache array wherein different sectors have different output latencies and, for a load hit, selectively enables the cache sectors based on their latency to output the cache line over successive clock cycles. Larger wires having a higher transmission speed are preferably used to output the cache line corresponding to the requested memory block. In the illustrative embodiment the cache is arranged with rows and columns of the cache sectors, and a given cache line is spread across sectors in different columns, with at least one portion of the given cache line being located in a first column having a first latency, and another portion of the given cache line being located in a second column having a second latency greater than the first latency. One set of wires oriented along a horizontal direction may be used to output the cache line, while another set of wires oriented along a vertical direction may be used for maintenance of the cache sectors. A given cache line is further preferably spread across sectors in different rows or cache ways. For example, a cache line can be 128 bytes and spread across four sectors in four different columns, each sector containing 32 bytes of the cache line, and the cache line is output over four successive clock cycles with one sector being transmitted during each of the four cycles.
摘要:
A system and method for cache management in a data processing system having a memory hierarchy of upper memory and lower memory cache. A lower memory cache controller accesses a coherency state table to determine replacement policies of coherency states for cache lines present in the lower memory cache when receiving a cast-in request from one of the upper memory caches. The coherency state table implements a replacement policy that retains the more valuable cache coherency state information between the upper and lower memory caches for a particular cache line contained in both levels of memory at the time of cast-out from the upper memory cache.