摘要:
A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line.
摘要:
In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing.
摘要:
A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core, the lower level cache victim determines whether the memory access request hits or misses in the directory of the lower level victim cache, and the upper level cache determines whether a castout from the upper level cache is to be performed and selects a victim coherency granule for eviction from the upper level cache. In response to determining that a castout from the upper level cache is to be performed, the upper level cache evicts the selected victim coherency granule. In the eviction, the upper level cache reads out the victim coherency granule from the data array of the upper level cache only in response to an indication that the memory access request misses in the directory of the lower level victim cache.
摘要:
In response to executing a deallocate instruction, a deallocation request specifying a target address of a target cache line is sent from a processor core to a lower level cache. In response, a determination is made if the target address hits in the lower level cache. If so, the target cache line is retained in a data array of the lower level cache, and a replacement order field of the lower level cache is updated such that the target cache line is more likely to be evicted in response to a subsequent cache miss in a congruence class including the target cache line. In response to the subsequent cache miss, the target cache line is cast out to the lower level cache with an indication that the target cache line was a target of a previous deallocation request of the processor core.
摘要:
A data processing system includes a processor core supported by upper and lower level caches. In response to executing a deallocate instruction in the processor core, a deallocation request is sent from the processor core to the lower level cache, the deallocation request specifying a target address associated with a target cache line. In response to receipt of the deallocation request at the lower level cache, a determination is made if the target address hits in the lower level cache. In response to determining that the target address hits in the lower level cache, the target cache line is retained in a data array of the lower level cache and a replacement order field in a directory of the lower level cache is updated such that the target cache line is more likely to be evicted from the lower level cache in response to a subsequent cache miss.
摘要:
A victim cache memory includes a cache array, a cache directory of contents of the cache array, and a cache controller that controls operation of the victim cache memory. The cache controller, responsive to receiving a castout command identifying a victim cache line castout from another cache memory, causes the victim cache line to be held in the cache array. If the other cache memory is a higher level cache in the cache hierarchy of the processor core, the cache controller marks the victim cache line in the cache directory so that it is less likely to be evicted by a replacement policy of the victim cache, and otherwise, marks the victim cache line in the cache directory so that it is more likely to be evicted by the replacement policy of the victim cache.
摘要:
A victim cache line having a data-invalid coherence state is selected for castout from a first lower level cache of a first processing unit. The first processing unit issues on an interconnect fabric a lateral castout (LCO) command identifying the victim cache line to be castout from the first lower level cache, indicating the data-invalid coherence state, and indicating that a lower level cache is an intended destination of the victim cache line. In response to a coherence response to the LCO command indicating success of the LCO command, the victim cache line is removed from the first lower level cache and held in a second lower level cache of a second processing unit in the data-invalid coherence state.
摘要:
An LBIST captures pseudo-random values from a pseudo-random pattern generator. Next, the LBIST stabilizes an untimed logic path by inputting the captured pseudo-random value into the untimed logic path. In turn, the LBIST tests one or more timed signal transitions that are dependent upon the stabilized untimed logic path.
摘要:
A multiprocessor data processing system includes at least first and second coherency domains, where the first coherency domain includes a system memory and a cache memory. According to a method of data processing, a cache line is buffered in a data array of the cache memory and a state field in a cache directory of the cache memory is set to a coherency state to indicate that the cache line is valid in the data array, that the cache line is held in the cache memory non-exclusively, and that another cache in said second coherency domain may hold a copy of the cache line.
摘要:
A method and computer system for reducing the wiring congestion, required real estate, and access latency in a cache subsystem with a sectored and sliced lower cache by re-configuring sector-to-slice allocation and the lower cache addressing scheme. With this allocation, sectors having discontiguous addresses are placed within the same slice, and a reduced-wiring scheme is possible between two levels of lower caches based on this re-assignment of the addressable sectors within the cache slices. Additionally, the lower cache effective address tag is re-configured such that the address fields previously allocated to identifying the sector and the slice are switched relative to each other's location within the address tag. This re-allocation of the address bits enables direct slice addressing based on the indicated sector.