摘要:
A cache coherent data processing system includes a memory and at least first and second coherency domains that each include a respective one of first and second cache memories. A master in the first coherency domain selects a scope of an initial broadcast of an operation targeting a request address allocated to the memory from among a first scope including only the first coherency domain and a second scope including both the first and second coherency domains. The master selects the scope based, at least in part, upon whether the memory belongs to the first coherency domain and performs an initial broadcast of the operation within the cache coherent data processing system utilizing the selected scope.
摘要:
A method and apparatus for performing data prefetch in a multiprocessor system are disclosed. The multiprocessor system includes multiple processors, each having a cache memory. The cache memory is subdivided into multiple slices. A group of prefetch requests is initially issued by a requesting processor in the multiprocessor system. Each prefetch request is intended for one of the respective slices of the cache memory of the requesting processor. In response to the prefetch requests being missed in the cache memory of the requesting processor, the prefetch requests are merged into one combined prefetch request. The combined prefetch request is then sent to the cache memories of all the non-requesting processors within the multiprocessor system. In response to a combined clean response from the cache memories of all the non-requesting processors, data are then obtained for the combined prefetch request from a system memory.
摘要:
A data processing system includes a plurality of processing units coupled by a plurality of communication links for point-to-point communication such that at least some of the communication between multiple different ones of the processing units is transmitted via intermediate processing units among the plurality of processing units. The communication includes operations having a request and a combined response representing a system response to the request. At least each intermediate processing unit includes one or more masters that initiate first operations, a snooper that receives at least second operations initiated by at least one other of the plurality of processing units, a physical queue that stores master tags of first operations initiated by the one or more masters within that processing unit, and a ticketing mechanism that assigns to second operations observed at the intermediate processing unit a ticket number indicating an order of observation with respect to other second operations observed by the intermediate processing unit. The ticketing mechanism provides the ticket number assigned to an operation to the snooper for processing with a combined response of the operation.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory and a second cache memory, and the second coherency domain includes a remote coherent cache memory. The first cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the memory block is possibly shared with the second cache memory in the first coherency domain and cached only within the first coherency domain.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory and a second cache memory, and the second coherency domain includes a remote coherent cache memory. The first cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the memory block is possibly shared with the second cache memory in the first coherency domain and cached only within the first coherency domain.
摘要:
A data processing system includes at least a first processing node having an input/output (I/O) controller and a second processing including a memory controller for a memory. The memory controller receives, in order, pipelined first and second DMA write operations from the I/O controller, where the first and second DMA write operations target first and second addresses, respectively. In response to the second DMA write operation, the memory controller establishes a state of a domain indicator associated with the second address to indicate an operation scope including the first processing node. In response to the memory controller receiving a data access request specifying the second address and having a scope excluding the first processing node, the memory controller forces the data access request to be reissued with a scope including the first processing node based upon the state of the domain indicator associated with the second address.
摘要:
In a cache coherent data processing system including at least first and second coherency domains, a memory block is stored in a system memory in association with a domain indicator indicating whether or not the memory block is cached, if at all, only within the first coherency domain. A master in the first coherency domain determines whether or not a scope of broadcast transmission of an operation should extend beyond the first coherency domain by reference to the domain indicator stored in the cache and then performs a broadcast of the operation within the cache coherent data processing system in accordance with the determination.
摘要:
In a cache coherent data processing system including at least first and second coherency domains, a master performs a first broadcast of an operation within the cache coherent data processing system that is limited in scope of transmission to the first coherency domain. The master receives a response of the first coherency domain to the first broadcast of the operation. If the response indicates the operation cannot be serviced in the first coherency domain alone, the master increases the scope of transmission by performing a second broadcast of the operation in both the first and second coherency domains. If the response indicates the operation can be serviced in the first coherency domain, the master refrains from performing the second broadcast, so that communication bandwidth utilized to service the operation is reduced.
摘要:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit and a cache memory. The cache memory includes a cache controller, a data array including a data storage location for caching a memory block, and a cache directory. The cache directory includes a tag field for storing an address tag in association with the memory block and a coherency state field associated with the tag field and the data storage location. The coherency state field has a plurality of possible states including a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is possibly cached outside of the first coherency domain.
摘要:
A data processing system includes a plurality of processing units coupled for communication by a communication link and a configuration register. The configuration register has a plurality of different settings each corresponding to a respective one of a plurality of different link information allocations. Information is communicated over the communication link in accordance with a particular link information allocation among the plurality of link information allocations determined by a respective setting of the configuration register.