摘要:
An aggregate symmetric multiprocessor (SMP) data processing system includes a first SMP computer including at least first and second processing units and a first system memory pool and a second SMP computer including at least third and fourth processing units and second and third system memory pools. The second system memory pool is a restricted access memory pool inaccessible to the fourth processing unit and accessible to at least the second and third processing units, and the third system memory pool is accessible to both the third and fourth processing units. An interconnect couples the second processing unit in the first SMP computer for load-store coherent, ordered access to the second system memory pool in the second SMP computer, such that the second processing unit in the first SMP computer and the second system memory pool in the second SMP computer form a synthetic third SMP computer.
摘要:
A processing unit for a data processing system includes a processor core having one or more execution units for processing instructions and a register file for storing data accessed in processing of the instructions. The processing unit also includes a multi-level cache hierarchy coupled to and supporting the processor core. The multi-level cache hierarchy includes at least one upper level of cache memory having a lower access latency and at least one lower level of cache memory having a higher access latency. The lower level of cache memory, responsive to receipt of a memory access request that hits only a partial cache line in the lower level cache memory, sources the partial cache line to the at least one upper level cache memory to service the memory access request. The at least one upper level cache memory services the memory access request without caching the partial cache line.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. Information, such as the address, of the snoop request is stored in a queue of the stall/reorder unit. The stall/reorder unit forwards the snoop request to a selector which also receives a request from a processor. An arbitration mechanism selects either the snoop request or the request from the processor. If the snoop request is denied by the arbitration mechanism, information, e.g., address, about the snoop request may be maintained in the stall/reorder unit. The request may be later resent to the selector. This process may be repeated up to “n” clock cycles. By providing the snoop request additional opportunities (n clock cycles) to be accepted by the arbitration mechanism, fewer snoop requests may ultimately be denied.
摘要:
In response to a data request of a first of a plurality of processing units, the first processing unit selects a victim cache line to be castout from the lower level cache of the first processing unit and determines whether a mode is set. If not, the first processing unit issues on the interconnect fabric an LCO command identifying the victim cache line and indicating that a lower level cache is the intended destination. If the mode is set, the first processing unit issues a castout command with an alternative intended destination. In response to a coherence response to the LCO command indicating success of the LCO command, the first processing unit removes the victim cache line from its lower level cache, and the victim cache line is held elsewhere in the data processing system. The mode can be set to inhibit castouts to system memory, for example, for testing.
摘要:
A cache coherent data processing system includes at least first and second coherency domains. In a first cache memory within the first coherency domain of the data processing system, a coherency state field associated with a storage location and an address tag is set to a first data-invalid coherency state that indicates that the address tag is valid and that the storage location does not contain valid data. In response to snooping an exclusive access operation, the exclusive access request specifying a target address matching the address tag and indicating a relative domain location of a requestor that initiated the exclusive access operation, the first cache memory updates the coherency state field from the first data-invalid coherency state to a second data-invalid coherency state that indicates that the address tag is valid, that the storage location does not contain valid data, and whether a target memory block associated with the address tag is cached within the first coherency domain upon successful completion of the exclusive access operation based upon the relative location of the requestor.
摘要:
A processing unit for a multiprocessor data processing system includes a processor core and a cache hierarchy coupled to the processor core to provide low latency data access. The cache hierarchy includes an upper level cache coupled to the processor core and a lower level victim cache coupled to the upper level cache. In response to a prefetch request of the processor core that misses in the upper level cache, the lower level victim cache determines whether the prefetch request misses in the directory of the lower level victim cache and, if so, allocates a state machine in the lower level victim cache that services the prefetch request by issuing the prefetch request to at least one other processing unit of the multiprocessor data processing system.
摘要:
A data processing system includes a processor core having an associated upper level cache and a lower level victim cache. In response to a memory access request of the processor core, the lower level cache victim determines whether the memory access request hits or misses in the directory of the lower level victim cache, and the upper level cache determines whether a castout from the upper level cache is to be performed and selects a victim coherency granule for eviction from the upper level cache. In response to determining that a castout from the upper level cache is to be performed, the upper level cache evicts the selected victim coherency granule. In the eviction, the upper level cache reads out the victim coherency granule from the data array of the upper level cache only in response to an indication that the memory access request misses in the directory of the lower level victim cache.
摘要:
Methods, systems, and computer program products are provided for dynamic selective memory mirroring in solid state devices. An amount of memory is reserved. Sections of the memory to select for mirroring in the reserved memory are dynamically determined. The selected sections of the memory contain critical areas. The selected sections of the memory are mirrored in the reserved memory.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. Information, such as the address, of the snoop request is stored in a queue of the stall/reorder unit. The stall/reorder unit forwards the snoop request to a selector which also receives a request from a processor. An arbitration mechanism selects either the snoop request or the request from the processor. If the snoop request is denied by the arbitration mechanism, information, e.g., address, about the snoop request may be maintained in the stall/reorder unit. The request may be later resent to the selector. This process may be repeated up to “n” clock cycles. By providing the snoop request additional opportunities (n clock cycles) to be accepted by the arbitration mechanism, fewer snoop requests may ultimately be denied.
摘要:
Scrubbing logic in a local coherency domain issues a domain query request to at least one cache hierarchy in a remote coherency domain. The domain query request is a non-destructive probe of a coherency state associated with a target memory block by the at least one cache hierarchy. A coherency response to the domain query request is received. In response to the coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.