Abstract:
Disclosed is a processor that reduces barrier operations during instruction processing. An instruction sequence includes a first barrier instruction and a second barrier instruction, with a store instruction between them. A store request associated with the store instruction is issued prior to a barrier operation associated with the first barrier instruction. A determination is made that the store request completes before the barrier operation for the first barrier instruction has issued. In response, only a single barrier operation is issued for both the first and second barrier instructions. The single barrier operation is issued after the store request has been issued, at the time the barrier operation for the second barrier instruction is scheduled to issue.
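For illustration, a minimal C sketch of the coalescing decision follows. The names (seq_state_t, barriers_to_issue) are hypothetical, and the patent describes hardware control logic rather than software; this only models the decision it makes.

```c
/* Minimal sketch of the barrier-coalescing decision for the pattern
 * SYNC1; ST; SYNC2. All names are illustrative assumptions. */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool store_complete;  /* store request finished on the interconnect */
    bool barrier1_issued; /* barrier op for the first SYNC already sent */
} seq_state_t;

/* Returns how many barrier operations must go on the interconnect. */
static int barriers_to_issue(const seq_state_t *s)
{
    /* If the store completed before the first barrier operation went
     * out, one barrier issued at SYNC2's slot covers both SYNCs. */
    if (s->store_complete && !s->barrier1_issued)
        return 1;
    return 2; /* otherwise each SYNC needs its own barrier operation */
}

int main(void)
{
    seq_state_t s = { .store_complete = true, .barrier1_issued = false };
    printf("barrier ops issued: %d\n", barriers_to_issue(&s)); /* 1 */
    return 0;
}
```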
Abstract:
A method for improving performance in a multiprocessor data processing system. A number of predetermined thresholds are provided within system controller logic and utilized to trigger specific bandwidth-utilization responses. Both address bus and data bus bandwidth utilization are monitored. When the percentage of data bus bandwidth utilization falls below a first predetermined threshold value, the system controller provides a particular response to a request for a cache line at a snooping processor holding the cache line, where the response indicates to the requesting processor that the cache line will be provided. Conversely, when the percentage of data bus bandwidth utilization rises above a second predetermined threshold value, the system controller provides a response to the request indicating that the requesting processor should utilize the super-coherent data currently within its local cache. Similar monitoring of the address bus permits the system controller to trigger the issuing of Z1 Read requests for modified data in a shared cache line by processors that still hold super-coherent data. The method also comprises enabling a load instruction with a plurality of bits that (1) indicate whether a resulting load request may receive super-coherent data and (2) override a coherency state calling for super-coherent data when the bits indicate that the load request may not utilize it. Specialized store instructions with appended bits and related functionality are also provided.
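The threshold logic might be modeled as below. The function names, response enum, and threshold values are illustrative assumptions, not the patent's actual signal encodings; the address bus would be watched the same way to trigger Z1 Reads.

```c
/* Hypothetical sketch of choosing a snoop response from data-bus load. */
#include <stdio.h>

typedef enum {
    RESP_WILL_PROVIDE,       /* snooper will source the cache line        */
    RESP_USE_SUPER_COHERENT, /* requester should use its local stale copy */
    RESP_NORMAL
} snoop_resp_t;

static snoop_resp_t data_bus_response(int util_pct, int low, int high)
{
    if (util_pct < low)
        return RESP_WILL_PROVIDE;      /* bus idle: promise the line     */
    if (util_pct > high)
        return RESP_USE_SUPER_COHERENT;/* bus saturated: stay local      */
    return RESP_NORMAL;
}

int main(void)
{
    /* Assumed thresholds of 25% and 75%, purely for demonstration. */
    printf("%d\n", data_bus_response(10, 25, 75)); /* RESP_WILL_PROVIDE */
    printf("%d\n", data_bus_response(90, 25, 75)); /* RESP_USE_SUPER_COHERENT */
    return 0;
}
```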
Abstract:
Cache and architectural functions within a cache controller are layered and provided with generic interfaces. Layering cache and architectural operations allows the definition of generic interfaces between controller logic and bus interface units within the controller. The generic interfaces are defined by extracting the essence of supported operations into a generic protocol. The interfaces themselves may be pulsed or held interfaces, depending on the character of the operation. Because the controller logic is isolated from the specific protocols required by a processor or bus architecture, the design may be directly transferred to new controllers for different protocols or processors by modifying the bus interface units appropriately.
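A software analogy of the generic interface follows: controller logic programs against a table of callbacks, and supporting a new bus means supplying a new table rather than changing the controller. The bus name and function names are hypothetical, and "pulsed" versus "held" interfaces are modeled as a one-shot call and a level poll.

```c
/* Controller logic isolated from bus specifics behind a generic table. */
#include <stdio.h>

typedef struct {
    const char *name;
    void (*request)(unsigned addr); /* pulsed interface: one-shot event */
    int  (*busy)(void);             /* held interface: level-sensitive  */
} biu_ops_t;

/* One bus interface unit (BIU) implementing the generic protocol. */
static void bus6xx_request(unsigned addr) { printf("6xx REQ %#x\n", addr); }
static int  bus6xx_busy(void)             { return 0; }
static const biu_ops_t bus6xx = { "6xx", bus6xx_request, bus6xx_busy };

/* Controller logic written only against the generic interface. */
static void controller_read(const biu_ops_t *biu, unsigned addr)
{
    while (biu->busy())
        ;                   /* wait on the held (level) signal */
    biu->request(addr);     /* fire the pulsed (event) signal  */
}

int main(void) { controller_read(&bus6xx, 0x1000); return 0; }
```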
Abstract:
Cache and architectural functions within a cache controller are layered so that architectural operations may be symmetrically treated regardless of whether initiated by a local processor or by a horizontal processor. The same cache controller logic which handles architectural operations initiated by a horizontal device also handles architectural operations initiated by a local processor. Architectural operations initiated by a local processor are passed to the system bus and self-snooped by the controller. If necessary, the architectural controller changes the operation protocol to conform to the system bus architecture.
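The symmetric handling might be sketched as below, where a single handler serves both self-snooped local operations and operations snooped from a horizontal processor. The operation name dcbf and all function names are illustrative assumptions.

```c
/* Local architectural ops are driven onto the system bus and then
 * self-snooped through the same handler used for horizontal ops. */
#include <stdio.h>

typedef enum { SRC_LOCAL, SRC_HORIZONTAL } source_t;

/* One handler serves both local and horizontal architectural ops. */
static void handle_arch_op(const char *op, source_t src)
{
    printf("arch ctrl: %s (from %s)\n", op,
           src == SRC_LOCAL ? "self-snoop" : "horizontal processor");
}

/* Local op: convert to the bus protocol if necessary, drive it onto
 * the system bus, then treat our own snoop of it like any other. */
static void issue_local_op(const char *op)
{
    const char *bus_op = op;           /* protocol conversion goes here */
    printf("system bus: %s\n", bus_op);
    handle_arch_op(bus_op, SRC_LOCAL); /* self-snooped off the bus      */
}

int main(void)
{
    issue_local_op("dcbf");            /* assumed local cache-flush op  */
    handle_arch_op("dcbf", SRC_HORIZONTAL);
    return 0;
}
```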
Abstract:
A method and apparatus for ordering operations and data received by a first bus having a first ordering policy according to a second, different ordering policy, and for transferring the ordered data on a second bus having the second ordering policy. The system includes a plurality of execution units for storing operations and executing the transfer of data between the first and second buses. Each execution unit is assigned to a group representing a class of operations. The apparatus further includes intra-prioritizing means, for each group, for prioritizing the stored operations according to the second ordering policy exclusive of the operations stored in the other groups. The system also includes inter-prioritizing means for determining which one of the prioritized operations may proceed to execute according to the second ordering policy.
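A minimal model of the two-level prioritization follows, assuming two groups (a load class and a store class) and a second-bus policy that drains stores first; the class split and policy are illustrative assumptions, not the patent's.

```c
/* Intra-group then inter-group prioritization over two op classes. */
#include <stdio.h>

#define QDEPTH 4

typedef struct { int tag[QDEPTH]; int head, tail; } group_q_t;

static void push(group_q_t *g, int tag) { g->tag[g->tail++] = tag; }

/* Intra-group: order only within the group (oldest-first here). */
static int intra_pick(const group_q_t *g)
{
    return (g->head == g->tail) ? -1 : g->tag[g->head];
}

/* Inter-group: choose which group's candidate executes next under the
 * second bus's policy (here, the store class drains before loads). */
static int inter_pick(group_q_t *loads, group_q_t *stores)
{
    int t = intra_pick(stores);
    if (t >= 0) { stores->head++; return t; }
    t = intra_pick(loads);
    if (t >= 0) { loads->head++; return t; }
    return -1;
}

int main(void)
{
    group_q_t loads = {0}, stores = {0};
    push(&loads, 1); push(&stores, 2); push(&loads, 3);
    for (int t; (t = inter_pick(&loads, &stores)) >= 0; )
        printf("execute op %d\n", t);   /* prints 2, then 1, then 3 */
    return 0;
}
```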
Abstract:
A method and system for controlling access to a shared resource in a data processing system are described. According to the method, a number of requests for access to the resource are generated by a number of requesters that share the resource. Each requester is assigned a current priority, at least the highest current priority being determined substantially randomly with respect to the requesters' previous priorities. In response to the current priorities of the requesters, a request for access to the resource is granted. In one embodiment, a requester corresponding to a granted request is signaled that its request has been granted, and a requester corresponding to a rejected request is signaled that its request was not granted.
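A sketch of one arbitration cycle follows, using rand() as a software stand-in for the hardware's substantially random priority source; the requester count and names are assumptions.

```c
/* Random-priority arbitration: fresh priorities each cycle prevent a
 * fixed ordering from starving any requester. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NREQ 4

static int arbitrate(const int requesting[NREQ])
{
    int winner = -1, best = -1;
    for (int i = 0; i < NREQ; i++) {
        if (!requesting[i])
            continue;
        int prio = rand();  /* current priority, randomized per cycle */
        if (prio > best) { best = prio; winner = i; }
    }
    return winner;          /* -1 if nobody is requesting */
}

int main(void)
{
    srand((unsigned)time(NULL));
    int req[NREQ] = { 1, 0, 1, 1 };
    int g = arbitrate(req);
    for (int i = 0; i < NREQ; i++)  /* signal grant or rejection */
        if (req[i])
            printf("requester %d: %s\n", i, i == g ? "granted" : "rejected");
    return 0;
}
```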
Abstract:
Cache and architectural functions within a cache controller are layered and provided with generic interfaces, isolating controller logic from specific architectural complexities. Controller logic may thus be readily duplicated to extend a nonshared cache controller design to a shared cache controller design, with only straightforward modifications required. Throttling of processor-initiated operations handled by the same controller logic resolves operation flow rate issues with acceptable performance trade-offs.
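Throttling of processor-initiated operations could be modeled as a simple credit scheme, sketched below under that assumption; the abstract does not specify this particular mechanism, so the scheme and all names are illustrative.

```c
/* Credit throttle: each processor port gets a budget of outstanding
 * operations; an exhausted port is stalled until a completion returns. */
#include <stdbool.h>
#include <stdio.h>

typedef struct { int credits; } port_t;

static bool try_issue(port_t *p)
{
    if (p->credits == 0)
        return false;   /* throttled: retry later */
    p->credits--;
    return true;
}

static void complete(port_t *p) { p->credits++; }

int main(void)
{
    port_t cpu0 = { .credits = 2 };
    for (int i = 0; i < 3; i++)     /* third issue is throttled */
        printf("issue %d: %s\n", i, try_issue(&cpu0) ? "ok" : "throttled");
    complete(&cpu0);
    printf("after completion: %s\n", try_issue(&cpu0) ? "ok" : "throttled");
    return 0;
}
```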
Abstract:
A method of providing instructions and data values to a processing unit in a multiprocessor computer system, by expanding the prior-art MESI cache-coherency protocol to include an additional cache-entry state corresponding to a most recently accessed state. Each cache of the processing units has at least one cache line with a block for storing the instruction or data value, and an indication is provided that a cache line having a block which contains the instruction or data value is in a "recently read" state. Each cache entry has three bits to indicate the current state of the entry (one of five possible states). A processing unit that desires to access a shared instruction or data value detects transmission of the indication from the cache having the most recently accessed copy, and the instruction or data value is sourced from that cache. Upon sourcing the instruction or data value, the cache that originally contained the most recently accessed copy changes its indication to show that its copy is now shared, and the cache of the processing unit that accessed the instruction or data value is thereafter indicated as containing the most recently accessed copy. This protocol allows instructions and data values which are shared among several caches to be sourced directly (intervened) by the cache having the most recently accessed copy, without retrieval from system memory (RAM), significantly improving the processing speed of the computer system.
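The R-state intervention can be sketched as a small state machine. This shows only the snooped-read transition described above; the state encoding and names are illustrative, and the full protocol covers more transitions.

```c
/* Five states: MESI plus R ("recently read"), encodable in three bits. */
#include <stdio.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED, RECENT } mesi_r_t;

/* Snooped read hit: only the R copy intervenes (sources the line); it
 * then demotes itself to Shared, and the requester becomes the R copy. */
static void snoop_read(mesi_r_t *holder, mesi_r_t *requester)
{
    if (*holder == RECENT) {
        *holder = SHARED;     /* most-recent copy sourced the data */
        *requester = RECENT;  /* requester now holds the R copy    */
    }
}

int main(void)
{
    static const char *name[] = { "I", "S", "E", "M", "R" };
    mesi_r_t p0 = RECENT, p1 = INVALID;
    snoop_read(&p0, &p1);     /* data sourced cache-to-cache, no RAM read */
    printf("p0=%s p1=%s\n", name[p0], name[p1]); /* p0=S p1=R */
    return 0;
}
```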
Abstract:
A system and method are provided that use a determination of bad data parity together with the state of an error signal (Derr_) as a functional signal indicating a specific type of error in a particular system component. If the Derr_ signal is active, the parity error recognized by the CPU was caused by a correctable condition in a data-providing device; in this instance, the processor reads the corrected data from a buffer without reissuing a fetch request. When the CPU finds a parity error but Derr_ is not active, a more serious fault condition is identified (a bus error or an uncorrectable multibit error), requiring a machine-level interrupt or the like. And when no parity error is found by the CPU and Derr_ is not active, the data is known to be valid and the parity/ECC latency is eliminated, thereby saving processing cycle time.
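The (parity error, Derr_) decision reduces to a small truth table, sketched below. Names are illustrative, and the combination of no parity error with Derr_ active is not described by the abstract, so it is left uncovered here.

```c
/* Classify a data transfer from the CPU's parity check and Derr_. */
#include <stdio.h>

typedef enum { DATA_VALID, READ_CORRECTED, MACHINE_CHECK } cpu_action_t;

static cpu_action_t classify(int parity_err, int derr_active)
{
    if (parity_err && derr_active)
        return READ_CORRECTED; /* correctable: reread corrected data from
                                  the buffer, no fetch reissued          */
    if (parity_err)
        return MACHINE_CHECK;  /* bus error or uncorrectable multibit
                                  error: machine-level interrupt         */
    return DATA_VALID;         /* no parity error: use the data and skip
                                  the parity/ECC latency                 */
}

int main(void)
{
    printf("%d\n", classify(1, 1)); /* READ_CORRECTED */
    printf("%d\n", classify(1, 0)); /* MACHINE_CHECK  */
    printf("%d\n", classify(0, 0)); /* DATA_VALID     */
    return 0;
}
```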
Abstract:
A queued arbitration mechanism transfers all queued processor bus requests to a centralized system controller/arbiter in a descriptive and pipelined manner. Transferring these descriptive and pipelined bus requests to the system controller allows the system controller to optimize the system bus utilization via prioritization of all of the requested bus operations and pipelining appropriate bus grants. Intelligent bus request information is transferred to the system controller via encoding and serialization techniques.
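A sketch of the encode-and-serialize idea follows, under assumed field widths; the actual request descriptor format and bit order are not specified by the abstract.

```c
/* Pack a queued request's description into a small word, then shift it
 * to the arbiter one bit per clock. Field widths are assumptions. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    unsigned type   : 3;  /* requested bus operation            */
    unsigned qdepth : 2;  /* how many requests are queued behind */
    unsigned pri    : 2;  /* requested priority                 */
} bus_req_t;

static uint8_t encode(bus_req_t r)
{
    return (uint8_t)(r.type | (r.qdepth << 3) | (r.pri << 5));
}

static void serialize(uint8_t word)
{
    for (int bit = 6; bit >= 0; bit--)  /* one bit per bus clock */
        printf("%u", (word >> bit) & 1u);
    printf("\n");
}

int main(void)
{
    bus_req_t read_req = { .type = 2, .qdepth = 3, .pri = 1 };
    serialize(encode(read_req));        /* prints 0111010 */
    return 0;
}
```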