摘要:
According to one embodiment, a method of coherency management in a data processing system includes holding a cache line in an upper level cache memory in an exclusive ownership coherency state and thereafter removing the cache line from the upper level cache memory and transmitting a castout request for the cache line from the upper level cache memory to a lower level cache memory. The castout request includes an indication of a shared ownership coherency state. In response to the castout request, the cache line is placed in the lower level cache memory in a coherency state determined in accordance with the castout request.
摘要:
Scrubbing logic in a local coherency domain issues to at least one cache hierarchy in a remote coherency domain a domain reset request that forces invalidation of any cached copy of a target memory block then held in said remote coherency domain. A coherency response to said domain reset request is received. In response to said coherency response indicating that said target memory block is not cached in said remote coherency domain, a domain indication of said local coherency domain is updated to indicate that said target memory block is cached, if at all, only within said local coherency domain.
摘要:
A cache coherent data processing system includes at least first and second coherency domains. The first coherency domain includes a system memory controller for a system memory and a first processing unit having a first cache memory. The second coherency domain includes a second processing unit having a second cache memory. In the first cache memory, a coherency state field associated with a storage location and an address tag is set to a first coherency state. In response to snooping an exclusive access request specifying a target address matching the address tag, the first cache memory provides a first partial response to the exclusive access request based at least in part upon the first coherency state. In response to snooping the exclusive access request, the memory controller determines whether it is responsible for the target address and provides a second partial response to the exclusive access request based at least in part upon an outcome of the determination. At least the first and second partial responses are accumulated to obtain a combined response for the exclusive access request. The combined response includes an indication of whether or not a highest point of coherency and a memory controller of a home system memory for the target address reside within a same coherency domain. The first cache memory updates the coherency state field from the first coherency state to a second coherency state in response to the indication in the combined response.
摘要:
In response to execution of program code, a control register within scrubbing logic in a local coherency domain is initialized with at least a target address of a target memory block. In response to the initialization, the scrubbing logic issues to at least one cache hierarchy in a remote coherency domain a domain indication scrubbing request targeting a target memory block that may be cached by the at least one cache hierarchy. In response to receipt of a coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.
摘要:
In a data processing system, a plurality of agents communicate operations therebetween. Each operation includes a request and a combined response representing a system-wide response to the request. Latencies of requests and combined responses between the plurality of agents are observed. Each of the plurality of agents is configured with a respective duration of a protection window extension by reference to the observed latencies. Each protection window extension is a period following receipt of a combined response during winch an associated one of the plurality of agents protects transfer of coherency ownership of a data granule between agents. The plurality of agents employing protection window extensions in accordance with the configuration, and at least two of the agents have protection window extensions of differing durations.
摘要:
Scrubbing logic in a local coherency domain issues a domain query request to at least one cache hierarchy in a remote coherency domain. The domain query request is a non-destructive probe of a coherency state associated with a target memory block by the at least one cache hierarchy. A coherency response to the domain query request is received. In response to the coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.
摘要:
A data processing system includes a plurality of communication links and a plurality of processing units including a local master processing unit. The local master processing unit includes interconnect logic that couples the processing unit to one or more of the plurality of communication links and an originating master coupled to the interconnect logic. The originating master originates an operation by issuing a write-type request on at least one of the one or more communication links, receives from a snooper in the data processing system a destination tag identifying a route to the snooper, and, responsive to receipt of the combined response and the destination tag, initiates a data transfer including a data payload and a data tag identifying the route provided within the destination tag.
摘要:
A data processing system includes a plurality of communication links and a plurality of processing units including a local master processing unit. The local master processing unit includes interconnect logic that couples the processing unit to one or more of the plurality of communication links and an originating master coupled to the interconnect logic. The originating master originates an operation by issuing a write-type request on at least one of the one or more communication links, receives from a snooper in the data processing system a destination tag identifying a route to the snooper, and, responsive to receipt of the combined response and the destination tag, initiates a data transfer including a data payload and a data tag identifying the route provided within the destination tag.
摘要:
A data processing system includes at least a first processing node having an input/output (I/O) controller and a second processing including a memory controller for a memory. The memory controller receives, in order, pipelined first and second DMA write operations from the I/O controller, where the first and second DMA write operations target first and second addresses, respectively. In response to the second DMA write operation, the memory controller establishes a state of a domain indicator associated with the second address to indicate an operation scope including the first processing node. In response to the memory controller receiving a data access request specifying the second address and having a scope excluding the first processing node, the memory controller forces the data access request to be reissued with a scope including the first processing node based upon the state of the domain indicator associated with the second address.
摘要:
A data processing system includes a plurality of processing units each having a respective point-to-point communication link with each of multiple others of the plurality of processing units but fewer than all of the plurality of processing units. Each of the plurality of processing units includes interconnect logic, coupled to each point-to-point communication link of that processing unit, that broadcasts requests received from one of the multiple others of the plurality of processing units to one or more of the plurality of processing units. The interconnect logic includes a partial response data structure including a plurality of entries each associating a partial response field with a plurality of flags respectively associated with each processing unit containing a snooper from which that processing unit will receive a partial response. The interconnect logic accumulates partial responses of processing units by reference to the partial response field to obtain an accumulated partial response, and when the plurality of flags indicate that all processing units from which partial responses are expected have returned a partial response, outputs the accumulated partial response.