Abstract:
A cache coherent data processing system includes at least first and second coherency domains. The first coherency domain contains a memory controller, an associated system memory having a target memory block identified by a target address, and a domain indicator indicating whether the target memory block is cached outside the first coherency domain. During operation, the first coherency domain receives a flush operation broadcast to the first and second coherency domains, where the flush operation specifies the target address of the target memory block. The first coherency domain also receives a combined response for the flush operation representing a system-wide response to the flush operation. In response to receipt of the combined response in the first coherency domain, a determination is made whether the combined response indicates that a cached copy of the target memory block may remain within the data processing system. In response to a determination that the combined response indicates that a cached copy of the target memory block may remain in the data processing system, the domain indicator is updated to indicate that the target memory block is cached outside of the first coherency domain.
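The domain indicator update described above can be illustrated with a minimal sketch. The names `CombinedResponse`, `DomainIndicator`, and `MemoryController` are hypothetical, as are the two response values; the abstract does not enumerate response types or say what happens when no cached copy can remain, so this sketch leaves the indicator unchanged in that case.

```python
from dataclasses import dataclass
from enum import Enum


class CombinedResponse(Enum):
    # Hypothetical response names; the abstract does not list them.
    SUCCESS = "success"  # no cached copy can remain in the system
    RETRY = "retry"      # a cached copy may still remain somewhere


class DomainIndicator(Enum):
    LOCAL = "local"    # block is not cached outside the home domain
    GLOBAL = "global"  # block may be cached outside the home domain


@dataclass
class MemoryController:
    """Home memory controller keeping one domain indicator per block address."""
    indicators: dict

    def on_flush_combined_response(self, target_address: int,
                                   cresp: CombinedResponse) -> None:
        # If a cached copy may remain, conservatively mark the block as
        # possibly cached outside the home domain.
        if cresp is CombinedResponse.RETRY:
            self.indicators[target_address] = DomainIndicator.GLOBAL
        # Assumption: any other response leaves the indicator untouched.
```

Marking the block as globally cached is the conservative choice: a later operation on the same address would then not assume that a local-only broadcast is sufficient.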
Abstract:
A data processing system includes at least first and second coherency domains, each including at least one processor core and a memory. In response to an initialization operation by a processor core that indicates a target memory block to be initialized, a cache memory in the first coherency domain determines a coherency state of the target memory block with respect to the cache memory. In response to the determination, the cache memory selects a scope of broadcast of an initialization request identifying the target memory block. A narrower scope including the first coherency domain and excluding the second coherency domain is selected in response to a determination of a first coherency state, and a broader scope including the first coherency domain and the second coherency domain is selected in response to a determination of a second coherency state. The cache memory then broadcasts an initialization request with the selected scope. In response to the initialization request, the target memory block is initialized within a memory of the data processing system to an initialization value.
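As a rough sketch of the scope selection above, the following models the two coherency states and the two broadcast scopes. The state names `FIRST` and `SECOND` simply stand in for the abstract's unnamed states, and `initialize_block` is a hypothetical helper that models memory as a `bytearray`; none of these names come from the source.

```python
from enum import Enum


class Scope(Enum):
    LOCAL = "first domain only"          # narrower scope
    GLOBAL = "first and second domains"  # broader scope


class CoherencyState(Enum):
    # Placeholders for the abstract's unnamed first and second states.
    FIRST = "first coherency state"
    SECOND = "second coherency state"


def select_init_scope(state: CoherencyState) -> Scope:
    """Pick the broadcast scope from the cache's state for the target block."""
    return Scope.LOCAL if state is CoherencyState.FIRST else Scope.GLOBAL


def initialize_block(memory: bytearray, address: int, block_size: int,
                     state: CoherencyState, init_value: int = 0) -> Scope:
    """Select a scope, then set the target block to the initialization value."""
    scope = select_init_scope(state)
    memory[address:address + block_size] = bytes([init_value]) * block_size
    return scope
```

The sketch returns the chosen scope so a caller can see which broadcast would have been issued; the broadcast itself is not modeled.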
Abstract:
A data processing system includes a processor core and a memory subsystem. The memory subsystem includes a store queue having a plurality of entries, where each entry includes an address field for holding the target address of a store operation, a data field for holding data for the store operation, and a virtual sync field indicating a presence or absence of a synchronizing operation associated with the entry. The memory subsystem further includes a store queue controller that, responsive to receipt at the memory subsystem of a sequence of operations including a synchronizing operation and a particular store operation, places a target address and data of the particular store operation within the address field and data field, respectively, of an entry in the store queue and sets the virtual sync field of the entry to represent the synchronizing operation, such that a number of store queue entries utilized is reduced.
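The entry layout and the saving in queue entries can be sketched as follows. `StoreQueueEntry` and `StoreQueue` are hypothetical names, and the sketch assumes the synchronizing operation arrives before the store it is folded into; the abstract does not state the order.

```python
from dataclasses import dataclass


@dataclass
class StoreQueueEntry:
    address: int                # target address of the store
    data: bytes                 # data to be stored
    virtual_sync: bool = False  # True if a sync is folded into this entry


class StoreQueue:
    def __init__(self) -> None:
        self.entries = []
        self._pending_sync = False

    def receive_sync(self) -> None:
        # Do not allocate an entry for the sync; remember it instead.
        self._pending_sync = True

    def receive_store(self, address: int, data: bytes) -> None:
        # Assumption: the remembered sync precedes this store, so the
        # store's entry carries it in the virtual sync field.
        entry = StoreQueueEntry(address, data, self._pending_sync)
        self.entries.append(entry)
        self._pending_sync = False
```

A store, a sync, and a second store then occupy two entries rather than three, with the second entry's `virtual_sync` set.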
Abstract:
A cache coherent data processing system includes a memory and at least first and second coherency domains that each include a respective one of first and second cache memories. A master in the first coherency domain selects a scope of an initial broadcast of an operation targeting a request address allocated to the memory from among a first scope including only the first coherency domain and a second scope including both the first and second coherency domains. The master selects the scope based, at least in part, upon whether the memory belongs to the first coherency domain and performs an initial broadcast of the operation within the cache coherent data processing system utilizing the selected scope.
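A minimal sketch of this selection, assuming the master can look up which domain's memory a request address is allocated to. The address map format and the function names are hypothetical, and the sketch models only the home-memory criterion, whereas the abstract says the choice depends on it "at least in part".

```python
from enum import Enum


class Scope(Enum):
    LOCAL = "first domain only"
    GLOBAL = "first and second domains"


def home_domain(address: int, address_map) -> int:
    """Return the domain whose memory the address is allocated to.

    address_map is a hypothetical list of (start, end, domain) tuples
    with half-open ranges [start, end).
    """
    for start, end, domain in address_map:
        if start <= address < end:
            return domain
    raise ValueError(f"address {address:#x} is not mapped")


def select_initial_scope(address: int, master_domain: int, address_map) -> Scope:
    # Use the narrower scope only when the memory belongs to the master's domain.
    if home_domain(address, address_map) == master_domain:
        return Scope.LOCAL
    return Scope.GLOBAL
```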
Abstract:
A cache coherent data processing system includes at least first and second coherency domains. In a first cache memory within the first coherency domain of the data processing system, a coherency state field associated with a storage location and an address tag is set to a first data-invalid coherency state that indicates that the address tag is valid and that the storage location does not contain valid data. In response to snooping an exclusive access operation, the exclusive access operation specifying a target address matching the address tag and indicating a relative domain location of a requester that initiated the exclusive access operation, the first cache memory updates the coherency state field from the first data-invalid coherency state to a second data-invalid coherency state that indicates that the address tag is valid, that the storage location does not contain valid data, and whether a target memory block associated with the address tag is cached within the first coherency domain upon successful completion of the exclusive access operation based upon the relative location of the requester.
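The state update can be sketched as a small transition function. The state names below are hypothetical, and the second data-invalid state is modeled as two variants, one per requester location; the abstract describes what the state encodes, not how it is encoded.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    # Hypothetical names; in every state the tag is valid and the data is not.
    DATA_INVALID = "first data-invalid state"
    DATA_INVALID_LOCAL = "block will be cached within this domain"
    DATA_INVALID_REMOTE = "block will be cached outside this domain"


@dataclass
class DirectoryEntry:
    tag: int
    state: State


def snoop_exclusive_access(entry: DirectoryEntry, target_address: int,
                           requester_is_local: bool) -> None:
    """Update a data-invalid entry when a matching exclusive access is snooped."""
    if entry.tag != target_address or entry.state is not State.DATA_INVALID:
        return  # only the first data-invalid state is modeled here
    # The requester will hold the block once the operation succeeds, so its
    # relative domain location determines which variant is recorded.
    entry.state = (State.DATA_INVALID_LOCAL if requester_is_local
                   else State.DATA_INVALID_REMOTE)
```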
Abstract:
A processing unit includes a local processor core and a cache memory coupled to the local processor core. The cache memory includes a data array and a directory of contents of the data array. The cache memory further includes one or more state machines that service a first set of memory access requests, an arbiter that directs servicing of a second set of memory access requests by reference to the data array and the directory on a fixed schedule, address collision logic that protects memory access requests in the second set by detecting and signaling address conflicts between active memory access requests in the second set and subsequent memory access requests, and dispatch logic coupled to the address collision logic. The dispatch logic dispatches memory access requests in the first set to the one or more state machines for servicing and signals the arbiter to direct servicing of memory access requests in the second set according to the fixed schedule.
Abstract:
According to a method of data processing, a predictor is maintained that indicates a historical scope of broadcast for one or more previous operations transmitted on an interconnect of a data processing system. A scope of broadcast of a subsequent operation is predictively selected by reference to the predictor.
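One possible form of such a predictor is a small saturating counter. This is an assumption made for illustration only: the abstract does not say how the history is stored, and `ScopePredictor` and its methods are hypothetical names.

```python
from enum import Enum


class Scope(Enum):
    LOCAL = "local"
    GLOBAL = "global"


class ScopePredictor:
    """Saturating counter tracking the scope needed by previous operations."""

    def __init__(self, bits: int = 2) -> None:
        self.max_count = (1 << bits) - 1
        self.counter = 0  # start out predicting the narrower scope

    def record(self, scope_needed: Scope) -> None:
        # Move toward GLOBAL or LOCAL, saturating at both ends.
        if scope_needed is Scope.GLOBAL:
            self.counter = min(self.max_count, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)

    def predict(self) -> Scope:
        # Predict GLOBAL once the counter is in the upper half of its range.
        return Scope.GLOBAL if self.counter > self.max_count // 2 else Scope.LOCAL
```

With the default two-bit counter, two consecutive operations that needed the broader scope are enough to switch the prediction to `Scope.GLOBAL`.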
Abstract:
A data processing system includes a first plane including a first plurality of processing nodes, each including multiple processing units, and a second plane including a second plurality of processing nodes, each including multiple processing units. The data processing system also includes a plurality of point-to-point first tier links. Each of the first plurality and second plurality of processing nodes includes one or more first tier links among the plurality of first tier links, where the first tier link(s) within each processing node connect a pair of processing units in the same processing node for communication. The data processing system further includes a plurality of point-to-point second tier links. At least a first of the plurality of second tier links connects processing units in different ones of the first plurality of processing nodes, at least a second of the plurality of second tier links connects processing units in different ones of the second plurality of processing nodes, and at least a third of the plurality of second tier links connects a processing unit in the first plane to a processing unit in the second plane.
Abstract:
A cache coherent data processing system includes at least first and second coherency domains each including at least one processing unit. The first coherency domain includes a first cache memory, and the second coherency domain includes a coherent second cache memory. The first cache memory within the first coherency domain of the data processing system holds a memory block in a storage location associated with an address tag and a coherency state field. The coherency state field is set to a state that indicates that the address tag is valid, that the storage location does not contain valid data, and that the memory block is likely cached only within the first coherency domain.
Abstract:
A cache, system, and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full, in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors, and the arbitration mechanism selects snoop requests over processor requests unless it issues a “stall request” to the stall/reorder unit. Because snoop requests have a higher priority than processor requests, fewer snoop requests are rejected. Because the arbitration mechanism can issue a stall request, the processor is not starved.
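The latch allocation and the arbitration priority can be sketched as below. `StallReorderUnit` and `arbitrate` are hypothetical names, and selecting the lowest-numbered occupied latch is an assumption; the abstract does not describe how buffered snoops are ordered.

```python
from collections import deque


class StallReorderUnit:
    def __init__(self, num_latches: int) -> None:
        self.latches = [None] * num_latches

    def accept(self, snoop) -> bool:
        """Place the snoop in the first available latch.

        Returns False when the unit is full, in which case the caller would
        hand the snoop to the unit that asks the sender to retry.
        """
        for i, slot in enumerate(self.latches):
            if slot is None:
                self.latches[i] = snoop
                return True
        return False

    def take(self):
        # Assumption: serve the lowest-numbered occupied latch first.
        for i, slot in enumerate(self.latches):
            if slot is not None:
                self.latches[i] = None
                return slot
        return None


def arbitrate(unit: StallReorderUnit, processor_requests: deque,
              stall_request: bool):
    """Select the next request; snoops win unless a stall is requested."""
    if stall_request and processor_requests:
        return ("processor", processor_requests.popleft())
    snoop = unit.take()
    if snoop is not None:
        return ("snoop", snoop)
    if processor_requests:
        return ("processor", processor_requests.popleft())
    return None
```

Buffered snoops stay in their latches during a stalled cycle, so letting a processor request through does not by itself cause a snoop to be rejected.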