摘要:
In a multiprocessor data processing system including: a memory, first and second shared caches, a system bus coupling the memory and the shared caches, first, second, third and fourth processors having, respectively, first, second, third and fourth private caches with the first and second private caches being coupled to the first shared cache, and the third and fourth private caches being coupled to the second shared cache, gateword hogging is prevented by providing a gate control flag in each processor. Priority is established for a processor to next acquire ownership of the gate control word by: broadcasting a “set gate control flag” command to all processors such that setting the gate control flags establishes delays during which ownership of the gate control word will not be requested by another processor for predetermined periods established in each processor. Optionally, the processor so acquiring ownership broadcasts a “reset gate control flag” command to all processors when it has acquired ownership of the gate control word.
摘要:
In a multiprocessor data processing system including: a main memory; at least first and second shared caches; a system bus coupling the main memory and the first and second shared caches; at least four processors having respective private caches with the first and second private caches being coupled to the first shared cache and to one another via a first internal bus, and the third and fourth private caches being coupled to the second shared cache and to one another via a second internal bus; method and apparatus for preventing hogging of ownership of a gateword stored in the main memory and which governs access to common code/data shared by processes running in at least three of the processors. Each processor includes a gate control flag. A gateword CLOSE command, establishes ownership of the gateword in one processor and prevents other processors from accessing the code/data guarded until the one processor has completed its use. A gateword OPEN command then broadcasts a gateword interrupt to set the flag in each processor, delays long enough to ensure that the flags have all been set, writes an OPEN value into the gateword and flushes the gateword to main memory. A gateword access command executed by a requesting processor checks its gate control flag, and if set, starts a fixed time delay after which normal execution continues.
摘要:
A multiprocessor write-into-cache data processing system includes a feature for preventing hogging of ownership of a first gateword stored in the memory which governs access to a first common code/data set shared by processes running in the processors by imposing first delays on all other processors in the system while, at the same time, mitigating any adverse effect on performance of processors attempting to access a gateword other than the first gateword. This is achieved by starting a second delay in any processor which is seeking ownership of a gateword other than the first gateword and truncating the first delay in all such processors by subtracting the elapsed time indicated by the second delay from the elapsed time indicated by the first delay.
摘要:
In a multiprocessor write-into-cache data processing system including: a memory; at least first and second shared caches; a system bus coupling the memory and the shared caches; at least one processor having a private cache coupled, respectively, to each shared cache; method and apparatus for preventing hogging of ownership of a gateword stored in the memory which governs access to common code/data shared by processes running in the processors by which a read copy of the gateword is obtained by a given processor by performing successive swap operations between the memory and the given processor's shared cache, and the given processor's shared cache and private cache. If the gateword is found to be OPEN, it is CLOSEd by the given processor, and successive swap operations are performed between the given processor's private cache and shared cache and shared cache and memory to write the gateword CLOSEd in memory such that the given processor obtains exclusive access to the governed common code/data. When the given processor completes use of the common code/data, it writes the gateword OPEN in its private cache, and successive swap operations are performed between the given processor's private cache and shared cache and shared cache and memory to write the gateword OPEN in memory.
摘要:
For a data processing system which employs a cache memory, the disclosure includes both a method for lowering the cache miss ratio for requested operands and an example of special purpose apparatus for practicing the method. Recent cache misses are stored in a first in, first out miss stack, and the stored addresses are searched for displacement patterns thereamong. Any detected pattern is then employed to predict a succeeding cache miss by prefetching from main memory the signal identified by the predictive address. The apparatus for performing this task is preferably hard wired for speed purposes and includes subtraction circuits for evaluating variously displaced addresses in the miss stack and comparator circuits for determining if the outputs from at least two subtraction circuits are the same, indicating a pattern which yields information which can be combined with an address in the stack to develop a predictive address. The efficiency of the apparatus is improved by placing a series of "select pattern" values representing the search order for trying patterns into a register stack and providing logic circuitry by which the most recently found "select pattern" value is placed at the top of the stack with the remaining "select pattern" values pushed down accordingly.
摘要:
A cache unit includes a cache store organized into a number of levels to provide a fast access to instructions and data words. Directory circuits, associated with the cache store, contain address information identifying those instructions and data words stored in the cache store. The cache unit has at least one instruction register for storing address and level signals for specifying the location of the next instruction to be fetched and transferred to the processing unit. Replacement circuits are included which, during normal operation, assign cache locations sequentially for replacing old information with new information. The cache unit further includes detection apparatus for detecting a conflict condition resulting in an improper assignment. The detection apparatus, upon detecting such a condition, advances the relacement circuits forward for assigning the next sequential group of locations or level inhibiting it from making its normal location assignment. It also inhibits the directory circuits from writing the necessary information therein required for making the location assignment and prevents the information which produced the conflict from being written into cache store when received from memory.
摘要:
Interactions among multiple processors (92) are exhaustively tested. A master processor (92) retrieves test information for a set of tests from a test table (148). It then enters a series of embedded loops, with one loop for each of the tested processors (92). A cycle delay count for each of the tested processors (92) is incremented (152, 162, 172) through a range specified in the test table entry. For each combination of cycle delay count loop indices, a single test is executed (176). In each such test (176), the master processor (92) sets up (182) each of the other processors (92) being tested. This setup (182) specifies the delay count and the code for that processor (92) to execute. When each processor (92) is setup (182), it waits (192) for a synchronize interrupt (278). When all processors (92) have been setup (182), the master processor (92) issues (191) the synchronize interrupt signal (276). Each processor (92) then starts traces (193) and delays (194) the specified number of cycles. After the delay, the processor (92) executes its test code (195).
摘要:
A processor (92) in a data processing system (80) provides a DELAY instruction. Executing the DELAY instruction causes the processor (92) to a specified integral number of clock (98) cycles before continuing. Delays are guaranteed to have a linear relationship with a constant slope with the specified number of clock cycles. Incrementing the specified delay through a range allows exhaustive testing of interactions among multiple processors.
摘要:
In a data processing system which employs a cache memory feature, a method and exemplary special purpose apparatus for practicing the method are disclosed to lower the cache miss ratio for called operands. Recent cache misses are stored in a first in, first out miss stack, and the stored addresses are searched for displacement patterns thereamong. Any detected pattern is then employed to predict a succeeding cache miss by prefetching from main memory the signal identified by the predictive address. The apparatus for performing this task is preferably hard wired for speed purposes and includes subtraction circuits for evaluating variously displaced addresses in the miss stack and comparator circuits for determining if the outputs from at least two subtraction circuits are the same indicating a pattern yielding information which can be combined with an address in the stack to develop a predictive address. The cache miss prediction mechanism is only selectively enabled during cache "in-rush" following a process change to increase the recovery rate; thereafter, it is disabled, based upon timing-out a timer or reaching a hit ratio threshold, in order that normal procedures allow the hit ratio to stabilize at a higher percentage than if the cache miss prediction mechanism were operated continuously.
摘要:
In a data processing system which employs a cache memory feature, a method and exemplary special purpose apparatus for practicing the method are disclosed to lower the cache miss ratio for called operands. Recent cache misses are stored in a first in, first out miss stack, and the stored addresses are searched for displacement patterns thereamong. Any detected pattern is then employed to predict a succeeding cache miss by prefetching from main memory the signal identified by the predictive address. The apparatus for performing this task is preferably hard wired for speed purposes and includes subtraction circuits for evaluating variously displaced addresses in the miss stack and comparator circuits for determining if the outputs from at least two subtraction circuits are the same indicating a pattern yielding information which can be combined with an address in the stack to develop a predictive address. The efficiency of the apparatus operating in an environment incorporating a paged main memory is improved, according to the invention, by the addition of logic circuitry which serves to inhibit prefetch if a page boundary would be encountered.