摘要:
A cache, system and method for reducing the number of rejected snoop requests. A “stall/reorder unit” in a cache receives a snoop request from an interconnect. The snoop request is entered in the first available latch of the stall/reorder unit unless the stall/reorder unit is full in which case the new snoop request is transmitted to a second unit configured to transmit a request to retry resending the new snoop request. Snoop requests have a higher priority than requests from processors and snoop requests are selected by the arbitration mechanism over processor requests unless the arbitration mechanism requests otherwise (“stall request”) to the stall/reorder unit. By snoop requests having a higher priority than processor requests, the number of snoop requests rejected is reduced. By having the arbitration mechanism issue a stall request, the processor will not be starved.
摘要:
A cache, system and method for reducing the number of rejected snoop requests. An incoming snoop request is entered in the first available latch in a pipeline of latches in a stall/reorder unit if the stall/reorder unit is not full. The entered snoop request is dispatched to a selector upon entering a bottom latch in the pipeline. The stall/reorder unit is not informed as to whether the dispatched snoop request is accepted by an arbitration mechanism for several clock cycles after the dispatch occurred. A copy of the dispatched snoop request is stored in a top latch in an overrun pipeline of latches in the first unit upon dispatching the snoop request. By maintaining information about the snoop request, the snoop request may be dispatched again to the selector in case the dispatched snoop request was rejected thereby increasing the chance that the snoop request will ultimately be accepted.
摘要:
A system and method for re-ordering store operations from a processor core to a store queue. When a store queue receives a new processor-issued store operation from the processor core, a store queue controller allocates a new entry in the store queue. In response to allocating the new entry in the store queue, the store queue controller determines whether or not the new entry is dependent on at least one other valid entry in the store queue. In response to determining the new entry is dependent on at least one other valid entry in the store queue, the store queue controller inhibits requesting of the new entry to the RC dispatch logic until each valid entry on which the new entry is dependent has been successfully dispatched to an RC machine by the RC dispatch logic.
摘要:
A data processing system includes a processor core and a memory subsystem. The memory subsystem includes a store queue having a plurality of entries, where each entry includes an address field for holding the target address of store operation, a data field for holding data for the store operation, and a virtual sync field indicating a presence or absence of a synchronizing operation associated with the entry. The memory subsystem further includes a store queue controller that, responsive to receipt at the memory subsystem of a sequence of operations including a synchronizing operation and a particular store operation, places a target address and data of the particular store operation within the address field and data field, respectively, of an entry in the store queue and sets the virtual sync field of the entry to represent the synchronizing operation, such that a number of store queue entries utilized is reduced.
摘要:
A data processing system includes a memory subsystem and an execution unit, coupled to the memory subsystem, which executes store instructions to determine target memory addresses of store operations to be performed by the memory subsystem. The data processing system further includes a mode field having a first setting indicating strong ordering between store operations and a second setting indicating weak ordering between store operations. Store operations accessing the memory subsystem are associated with either the first setting or the second setting. The data processing system also includes logic that, based upon settings of the mode field, inserts a synchronizing operation between a store operation associated with the first setting and a store operation associated with the second setting, such that all store operations preceding the synchronizing operation complete before store operations subsequent to the synchronizing operation.
摘要:
In response to execution of program code, a control register within scrubbing logic in a local coherency domain is initialized with at least a target address of a target memory block. In response to the initialization, the scrubbing logic issues to at least one cache hierarchy in a remote coherency domain a domain indication scrubbing request targeting a target memory block that may be cached by the at least one cache hierarchy. In response to receipt of a coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.
摘要:
A processing unit for a multiprocessor data processing system includes a store-through upper level cache, an instruction sequencing unit that fetches instructions for execution, at least one instruction execution unit that executes a store-conditional instruction to determine a store target address, a store queue that, following execution of the store-conditional instruction, buffers a corresponding store operation, sequencer logic associated with the store queue. The sequencer logic, responsive to receipt of a latency indication indicating that resolution of the store-conditional operation as passing or failing is subject to significant latency, invalidates, prior to resolution of the store-conditional operation, a cache line in the store-through upper level cache to which a load-reserve operation previously bound.
摘要:
According to a method of data processing, a predictor is maintained that indicates a historical scope of broadcast for one or more previous operations transmitted on an interconnect of a data processing system. A scope of broadcast of a subsequent operation is predictively selected by reference to the predictor.
摘要:
Scrubbing logic in a local coherency domain issues a domain query request to at least one cache hierarchy in a remote coherency domain. The domain query request is a non-destructive probe of a coherency state associated with a target memory block by the at least one cache hierarchy. A coherency response to the domain query request is received. In response to the coherency response indicating that the target memory block is not cached in the remote coherency domain, a domain indication in the local coherency domain is updated to indicate that the target memory block is cached, if at all, only within the local coherency domain.
摘要:
A method, system, and device for enabling intervention across same-level cache memories. In a preferred embodiment, responsive to a cache miss in a first cache memory a direct intervention request is sent from the first cache memory to a second cache memory requesting a direct intervention that satisfies the cache miss.