Abstract:
A system, apparatus and method for ordering a sequence of processing transactions for a plurality of peripheral units. The sequence of transactions is accomplished by mapping an incoming address to a target endpoint. The ordering of the transactions is agnostic to the type of endpoint being targeted and only considers an identifier of the transaction for ordering purposes.
Abstract:
A memory controller comprises memory access circuitry configured to initiate a data access of data stored in a memory in response to a data access hint message received from another node in data communication with the memory controller; to access data stored in the memory in response to a data access request received from another node in data communication with the memory controller and to provide the accessed data as a data access response to the data access request.
Abstract:
An interconnect has coherency control circuitry for performing coherency control operations and a snoop filter for identifying which devices coupled to the interconnect have cached data from a given address. When an address is looked up in the snoop filter and misses, and there is no spare snoop filter entry available, then the snoop filter selects a victim entry corresponding to a victim address, and issues an invalidate transaction for invalidating locally cached copies of the data identified by the victim. The coherency control circuitry for performing coherency checking operations for data access transactions is reused for performing coherency control operations for the invalidate transaction issued by the snoop filter. This greatly reduces the circuitry complexity of the snoop filter.
Abstract:
A data processing apparatus includes one or more cache configuration data stores, a coherence manager, and a shared cache. The coherence manager is configured to track and maintain coherency of cache lines accessed by local caching agents and one or more remote caching agents. The cache lines include local cache lines accessed from a local memory region and remote cache lines accessed from a remote memory region. The shared cache is configured to store local cache lines in a first partition and to store remote cache lines in a second partition. The sizes of the first and second partitions are determined based on values in the one or more cache configuration data stores and may or not overlap. The cache configuration data stores may be programmable by a user or dynamically programmed in response to local memory and remote memory access patterns.
Abstract:
Efficient data transfer between caching domains of a data processing system is achieved by a local coherency node (LCN) of a first caching domain receiving a read request for data associated with a second caching domain, from a requesting node of the first caching domain. The LCN requests the data from the second caching domain via a transfer agent. In response to receiving a cache line containing the data from the second caching domain, the transfer agent sends the cache line to the requesting node, bypassing the LCN and, optionally, sends a read-receipt indicating the state of the cache line to the LCN. The LCN updates a coherency state for the cache line in response to receiving the read-receipt from the transfer agent and a completion acknowledgement from the requesting node. Optionally, the transfer agent may send the cache line via the LCN when congestion is detected in a response channel of the data processing system.
Abstract:
Various implementations described herein are directed to a device with a multi-layered logic structure with multiple layers including a first layer and a second layer arranged vertically in a stacked configuration. The device may have a first cache memory with first interconnect logic disposed in the first layer. The device may have a second cache memory with second interconnect logic disposed in the second layer, wherein the second interconnect logic in the second layer is linked to the first interconnect logic in the first layer.
Abstract:
An apparatus comprises snoop filter storage circuitry to store snoop filter entries corresponding to addresses and comprising sharer information. Control circuitry selects which sharers, among a plurality of sharers capable of holding cached data, should be issued with snoop requests corresponding to a target address, based on the sharer information of the snoop filter entry corresponding to the target address. The control circuitry is capable of setting a given snoop filter entry corresponding to a given address to an imprecise encoding in which the sharer information provides an imprecise description of which sharers hold cached data corresponding to the given address, and the given snoop filter entry comprises at least one sharer count value indicative of a number of sharers holding cached data corresponding to the given address.
Abstract:
An improved protocol for data transfer between a request node and a home node of a data processing network that includes a number of devices coupled via an interconnect fabric is provided that minimizes the number of response messages transported through the interconnect fabric. When congestion is detected in the interconnect fabric, a home node sends a combined response to a write request from a request node. The response is delayed until a data buffer is available at the home node and home node has completed an associated coherence action. When the request node receives a combined response, the data to be written and the acknowledgment are coalesced in the data message.
Abstract:
Apparatus and a corresponding method of operating a hub device, and a target device, in a coherent interconnect system are presented. A cache pre-population request of a set of coherency protocol transactions in the system is received from a requesting master device specifying at least one data item and the hub device responds by cause a cache pre-population trigger of the set of coherency protocol transactions specifying the at least one data item to be transmitted to a target device. This trigger can cause the target device to request that the specified at least one data item is retrieved and brought into cache. Since the target device can therefore decide whether to respond to the trigger or not, it does not receive cached data unsolicited, simplifying its configuration, whilst still allowing some data to be pre-cached.
Abstract:
The present disclosure advantageously provides a system and method for protocol layer tunneling for a data processing system. A system includes an interconnect, a request node coupled to the interconnect, and a home node coupled to the interconnect. The request node includes a request node processor, and the home node includes a home node processor. The request node processor is configured to send, to the home node, a sequence of dynamic requests, receive a sequence of retry requests associated with the sequence of dynamic requests, and send a sequence of static requests associated with the sequence of dynamic requests in response to receiving credit grants from the home node. The home node processor is configured to send the sequence of retry requests in response to receiving the sequence of dynamic requests, determine the credit grants, and send the credit grants.