摘要:
In a cache coherent data processing system including at least first and second coherency domains, a memory block is stored in a system memory in association with a domain indicator indicating whether or not the memory block is cached, if at all, only within the first coherency domain. A master in the first coherency domain determines whether or not a scope of broadcast transmission of an operation should extend beyond the first coherency domain by reference to the domain indicator stored in the cache and then performs a broadcast of the operation within the cache coherent data processing system in accordance with the determination.
摘要:
A data processing system includes at least first and second coherency domains coupled by an interconnect fabric. A memory coupled to the interconnect fabric includes an address translation table having a translation table entry utilized to translate virtual memory addresses to real memory addresses. The translation table entry also includes scope information for broadcast operations targeting addresses within a memory region associated with the translation table entry. Scope prediction logic within the first coherency domain predictively selects a scope of broadcast of an operation on an interconnect fabric of the data processing system by reference to the scope information within the address translation table entry.
摘要:
A data processing system includes at least a first processing node having an input/output (I/O) controller and a second processing including a memory controller for a memory. The memory controller receives, in order, pipelined first and second DMA write operations from the I/O controller, where the first and second DMA write operations target first and second addresses, respectively. In response to the second DMA write operation, the memory controller establishes a state of a domain indicator associated with the second address to indicate an operation scope including the first processing node. In response to the memory controller receiving a data access request specifying the second address and having a scope excluding the first processing node, the memory controller forces the data access request to be reissued with a scope including the first processing node based upon the state of the domain indicator associated with the second address.
摘要:
In response to execution of program code, a control register within scrubbing logic in a local coherency domain is initialized with at least a target address of a target memory block. In response to the initialization, the scrubbing logic issues to at least one cache hierarchy in a remote coherency domain a domain indication scrubbing request targeting a target memory block that may be cached by the at least one cache hierarchy. In response to receipt of a coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.
摘要:
In response to a master receiving a memory access request indicating a target address, the master accesses a first cache directory of an upper level cache of a cache hierarchy. In response to the target address being associated in the first cache directory with an entry having a valid address tag and a first invalid coherency state, the master issues a request specifying the target address on an interconnect fabric without regard to a coherency state associated with the target address in a second cache directory of a lower level cache of the cache hierarchy. In response to the target address having a second invalid coherency state with respect to the first cache directory, the master issues a request specifying the target address on an interconnect fabric after determining a coherency state associated with the target address in the second cache directory of the lower level cache of the cache hierarchy.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. An incoming snoop request is entered in the first available latch in a pipeline of latches in a stall/reorder unit if the stall/reorder unit is not full. The entered snoop request is dispatched to a selector upon entering a bottom latch in the pipeline. The stall/reorder unit is not informed as to whether the dispatched snoop request is accepted by an arbitration mechanism for several clock cycles after the dispatch occurred. A copy of the dispatched snoop request is stored in a top latch in an overrun pipeline of latches in the first unit upon dispatching the snoop request. By maintaining information about the snoop request, the snoop request may be dispatched again to the selector in case the dispatched snoop request was rejected thereby increasing the chance that the snoop request will ultimately be accepted.
摘要:
A cache coherent data processing system includes a memory and at least first and second coherency domains that each include a respective one of first and second cache memories. A master in the first coherency domain selects a scope of an initial broadcast of an operation targeting a request address allocated to the memory from among a first scope including only the first coherency domain and a second scope including both the first and second coherency domains. The master selects the scope based, at least in part, upon whether the memory belongs to the first coherency domain and performs an initial broadcast of the operation within the cache coherent data processing system utilizing the selected scope.
摘要:
A data processing system includes a plurality of processing units each having a respective point-to-point communication link with each of multiple others of the plurality of processing units but fewer than all of the plurality of processing units. Each of the plurality of processing units includes interconnect logic, coupled to each point-to-point communication link of that processing unit, that broadcasts operations received from one of the multiple others of the plurality of processing units to one or more of the plurality of processing units.
摘要:
A processing unit for a multiprocessor data processing system includes a store-through upper level cache, an instruction sequencing unit that fetches instructions for execution, at least one instruction execution unit that executes a store-conditional instruction to determine a store target address, a store queue that, following execution of the store-conditional instruction, buffers a corresponding store operation, sequencer logic associated with the store queue. The sequencer logic, responsive to receipt of a latency indication indicating that resolution of the store-conditional operation as passing or failing is subject to significant latency, invalidates, prior to resolution of the store-conditional operation, a cache line in the store-through upper level cache to which a load-reserve operation previously bound.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit and a cache memory. The cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is possibly cached outside of the first coherency domain.