摘要:
A method of synchronizing an initiating processing unit in a multi-processor computer system with other processing units in the system, by assigning a unique tag for each processing unit, and issuing synchronization messages which include the unique tag of an initiating processing unit. The processing units each have a snoop queue for receiving snoop operations and corresponding tags associated with instructions issued by an initiating processing unit, and the processors examine their respective snoop queues to determine whether any snoop operation in those queues has a tag which is the unique tag of the initiating processing unit. A retry message is sent to the initiating processing unit from any of the other processing units which determine that a snoop operation in a snoop queue has a tag which is the unique tag of the initiating processing unit. In response to the retry message, the initiating processing unit re-issues the synchronization message, and the other processors re-examine their respective snoop queues, in response to the re-issuing of the synchronization message, to determine whether any snoop operation in those queues still has a tag which is the unique tag of the initiating processing unit.
摘要:
A method and apparatus for ordering operations and data received by a first bus having a first ordering policy according to a second ordering policy which is different from the first ordering policy, and for transferring the ordered data on a second bus having the second ordering policy. The system includes a plurality of execution units for storing operations and executing the transfer of data between the first and second buses. Each one of the execution units are assigned to a group which represent a class of operations. The apparatus further includes intra prioritizing means, for each group, for prioritizing the stored operations according to the second ordering policy exclusive of the operation stored in the other group. The system also includes inter prioritizing means for determining which one of the prioritized operations can proceed to execute according to the second ordering policy.
摘要:
Cache and architectural specific functions within a cache controller are layered to permit complex operations to be split into equivalent simple operations. Architectural variants of basic operations may thus be devolved into distinct cache and architectural operations and handled separately. The logic supporting the complex operations may thus be simplified and run faster.
摘要:
A method and processor system that substantially eliminates data bus operations when completing updates of an entire cache line with a full store queue entry. The store queue within a processor chip is designed with a series of AND gates connecting individual bits of the byte enable bits of a corresponding entry. The AND output is fed to the STQ controller and signals when the entry is full. When full entries are selected for dispatch to the RC machines, the RC machine is signaled that the entry updates the entire cache line. The RC machine obtains write permission to the line, and then the RC machine overwrites the entire cache line. Because the entire cache line is overwritten, the data of the cache line is not retrieved when the request for the cache line misses at the cache or when data goes state before write permission is obtained by the RC machine.
摘要:
A multiprocessor data processing system includes first and second processors coupled to an interconnect and to a global promotion facility containing a plurality of promotion bit fields. The first processor executes a single acquisition instruction to concurrently acquire a plurality of promotion bit fields exclusive of at least the second processor. In response to execution of the acquisition instruction, the first processor receives an indication of success or failure of the acquisition instruction, wherein the indication indicates success of the acquisition instruction if all of the plurality of promotion bit fields were concurrently acquired by the first processor and indicates failure of the acquisition instruction if fewer than all of the plurality of promotion bit fields were acquired by the first processor.
摘要:
A data processing system includes a global promotion facility and a plurality of processors coupled by an interconnect. In response to execution of an acquisition instruction by a first processor among the plurality of processors, the first processor transmits an address-only operation on the interconnect to acquire a promotion bit field within the global promotion facility exclusive of at least a second processor among the plurality of processors. In response to receipt of a combined response for the address-only operation representing a collective response of others of the plurality of processors to the address-only operation, the first processor determines whether or not acquisition of the promotion bit field was successful by reference to the combined response.
摘要:
A multiprocessor data processing system comprising, in addition to a first and second processor having an respective first and second cache and a main cache directory affiliated with the first processor's cache, a secondary cache directory of the first cache, which contains a subset of cache line addresses from the main cache directory corresponding to cache lines that are in a first or second coherency state, where the second coherency state indicates to the first processor that requests issued from the first processor for a cache line whose address is within the secondary directory should utilize super-coherent data currently available in the first cache and should not be issued on the system interconnect. Additionally, the cache controller logic includes a clear on barrier flag (COBF) associated with the secondary directory, which is set whenever an operation of the first processor is issued to said system interconnect. If a barrier instruction is received by the first processor while the COBF is set, the contents of the secondary directory are immediately flushed and the cache lines are tagged with an invalid state.
摘要:
A method for improving performance of a multiprocessor data processing system comprising snooping a request for data held within a shared cache line on a system bus of the data processing system whose cache contains an updated copy of the shared cache line, and responsive to a snoop of the request by the second processor, issuing a first response on the system bus indicating to the requesting processor that the requesting processor may utilize data currently stored within the shared cache line of a cache of the requesting processor. When the request is snooped by the second processor and the second processor decides to release a lock on the cache line to the requesting processor, the second processor issues a second response on the system bus indicating that the first processor should utilize new/coherent data and then the second processor releases the lock to the first processor.
摘要:
Disclosed herein are a data processing system and method of operating a data processing system that arbitrate between conflicting requests to modify data cached in a shared state and that protect ownership of the cache line granted during such arbitration until modification of the data is complete. The data processing system includes a plurality of agents coupled to an interconnect that supports pipelined transactions. While data associated with a target address are cached at a first agent among the plurality of agents in a shared state, the first agent issues a transaction on the interconnect. In response to snooping the transaction, a second agent provides a snoop response indicating that the second agent has a pending conflicting request and a coherency decision point provides a snoop response granting the first agent ownership of the data. In response to the snoop responses, the first agent is provided with a combined response representing a collective response to the transaction of all of the agents that grants the first agent ownership of the data. In response to the combined response, the first agent is permitted to modify the data.
摘要:
A method of operation within a processor that permits load instructions following barrier instructions in an instruction sequence to be issued speculatively. The barrier instruction is executed and while the barrier operation is pending, a load request associated with the load instruction is speculatively issued. A speculation flag is set to indicate the load instruction was speculatively issued. The flag is reset when an acknowledgment of the barrier operation is received. Data that is returned before the acknowledgment is received is temporarily held, and the data is forwarded to the register and/or execution unit of the processor only after the acknowledgment is received. If a snoop invalidate is detected for the speculatively issued load request before the barrier operation completes, the data is discarded and the load request is re-issued.