摘要:
A cache coherent data processing system includes at least first and second coherency domains coupled for communication. The first and second coherency domains each include a respective one of first and second cache memories. A master in the first coherency domain selects a scope of an initial broadcast of an operation from among a first scope including only the first coherency domain and a second scope including both the first and second coherency domains based, at least in part, upon a type of the operation. The master then performs an initial broadcast of the operation within the cache coherent data processing system utilizing the selected scope.
摘要:
A cache memory logically associates a cache line with at least two cache sectors of a cache array wherein different sectors have different output latencies and, for a load hit, selectively enables the cache sectors based on their latency to output the cache line over successive clock cycles. Larger wires having a higher transmission speed are preferably used to output the cache line corresponding to the requested memory block. In the illustrative embodiment the cache is arranged with rows and columns of the cache sectors, and a given cache line is spread across sectors in different columns, with at least one portion of the given cache line being located in a first column having a first latency, and another portion of the given cache line being located in a second column having a second latency greater than the first latency. One set of wires oriented along a horizontal direction may be used to output the cache line, while another set of wires oriented along a vertical direction may be used for maintenance of the cache sectors. A given cache line is further preferably spread across sectors in different rows or cache ways. For example, a cache line can be 128 bytes and spread across four sectors in four different columns, each sector containing 32 bytes of the cache line, and the cache line is output over four successive clock cycles with one sector being transmitted during each of the four cycles.
摘要:
A method and processor system that substantially eliminates data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates connecting individual bits of the byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when data goes state before write permission is obtained by the RC machine.
摘要:
A data processing system includes a plurality of requestors and a memory controller for a system memory. In response to receiving from the requestor a read-type request targeting a memory block in the system memory, the memory controller protects the memory block from modification, and in response to an indication that the memory controller is responsible for servicing the read-type request, the memory controller transmits the memory block to the requestor. Prior to receipt of the memory block by the requestor, the memory controller ends protection of the memory block from modification, and the requestor begins protection of the memory block from modification. In response to receipt of the memory block, the requestor ends its protection of the memory block from modification.
摘要:
A processing unit for a multiprocessor data processing system includes a processor core and a lower level cache including a reservation logic that records reservations of the processor core. The reservation logic passes or fails store-conditional operations received from the processor core based upon whether the processor core has reservations for target store addresses of the store-conditional operations. The processor core includes a store-through upper level cache, a reservation register, and sequencer logic that, by reference to the reservation register, fails a store-conditional operation without communication with said reservation logic.
摘要:
A method, system, and processor chip design for reducing the latency between completing a LARX operation and receiving the associated STCX operation to complete the update to the cache line. Each entry of the store queue of the issuing processor is provided an additional tracking bit (priority bit). The priority bit is set whenever a STCX operation is placed within the entry. During selection of an entry for dispatch by the arbitration logic, the arbitration logic scans the value of the priority bits of each eligible entry. An entry with the priority bit set is given priority in the selection process within architectural rules. That entry is then selected for dispatch as early as is possible within the established rules.
摘要:
A data processing system includes a memory controller of a system memory that receives first and second castout operations both specifying a same address. In response to receiving said first and second castout operations, the memory controller performs a single update to the system memory.
摘要:
A data processing system includes a memory controller of a system memory that receives first and second castout operations both specifying a same address. In response to receiving said first and second castout operations, the memory controller performs a single update to the system memory.
摘要:
A method and processor chip design for enabling a processor core to continue sending store operations speculatively to the store queue after the core receives indication that the store queue is full. The processor core is configured with speculative store logic that enables the processor core to continue issuing store operations while the store queue full signal is asserted. A copy of the speculatively issued store operation is placed within a speculative store buffer. The core waits for a signal from the store queue indicating the store operation was accepted into the store queue. When the speculatively-issued store operation is accepted within the store queue, the copy is discarded from the buffer. However, when the store operation is rejected, the speculative store logic re-issues the store operation ahead of normal store operations.
摘要:
A method for informing a processor of a selected order of transmission of data to the processor. The method comprises the steps of coupling system components via a data bus to the processor to effectuate data transfer, determining at the system component logic the order in which to transmit data to the processor, and issuing to the data bus a selected order bit concurrent with the data, wherein the selected order bit alerts the processor of the order and the data is transmitted in that order. In a preferred embodiment, the system component is the cache and the method may involve receiving at the cache a preference of ordering for a read address/request from the processor. The preference order logic of the cache controller or a preference order logic component evaluates the preference of ordering desired by comparing the processor preference with other preferences, including cache order preference. One preference order is selected and the data is then retrieved from a cache line of the cache in the order selected.