摘要:
A double cache snoop mechanism in uniprocessor computer systems having a cache and coherent I/O and multiprocessor computer systems reduces the number of cycles that a processor is stalled during a coherency check. The snoop mechanism splits each coherency check, such that a read-only check is first sent to the cache subsystem., and a read-write check is sent thereafter only if there is a cache hit during the read-only check, and there is the need to modify the cache. Average processor pipeline stall time is reduced even though a cache hit results in an additional coherency check because most coherency checks do not result in a cache hit.
摘要:
A multiple round-robin arbitration scheme for a shared bus system that ensures forward progress by each component utilizing the shared bus. In the shared bus system, component modules arbitrate for control of the bus for one or more cycles, and send transactions on the bus during cycles in which they control the bus. The transactions are divided into a set of transaction classes. Certain classes of transactions cannot be issued during certain bus cycles. In certain other cycles, transactions of any class may be issued. The multiple round-robin arbitration scheme ensures forward progress by ensuring that each module seeking to issue a transaction of a given class obtains control of the bus during a cycle when transactions of that class can be issued.
摘要:
A coherency scheme of use with a system having a bus, a main memory, a main memory controller for accessing main memory in response to transactions received on the bus, and a set of processor modules coupled to the bus. Each processor module has a cache memory and is capable of transmitting coherent transactions on the bus to other processor modules and to the main memory controller. Each processor module detects coherent transactions issued on the bus and performs cache coherency checks for each of the coherent transactions. Each processor module has a coherency queue for storing all transactions issued on the bus and for performing coherency checks for the transactions in first-in, first-out order. When a module transmits a coherent transaction on a bus, it places its own transaction into its own coherency queue.
摘要:
A bus system having a bus arbitration scheme. The bus system includes a bus and a plurality of client modules coupled to the bus. Each of the client modules is capable of transmitting information on the bus to another of client module, and only one client module is entitled to transmit information on the bus at any time. A module entitled to transmit information on the bus has control of the bus for a minimum period of time defining a cycle. To determine which module is entitled to use the bus, each client module generates an arbitration signal when it seeks to transmit information on the bus. Each client module has an arbitration signal processor responsive to the arbitration signals for determining whether the module is entitled to transmit information on said bus. The system preferably also contains a host module that informs the client modules what types of transactions allowed on the bus in a given cycle. Each arbitration signal processor preferably is also responsive to the client option signals sent by the host module during an earlier cycle.
摘要:
A cache system buffers data stored in a main memory and utilized by a processor. The cache system includes a first cache, a second cache, a first transfer channel, a second transfer channel and a third transfer channel. The first cache is fully associative. The second cache is directly mapped. The first transfer channel transfers data lines from the main memory to the first cache. The second transfer channel transfers data lines from the first cache to the second cache. The third transfer channel transfers data lines from the second cache to the main memory. Accesses of data lines from the first cache and the second cache are performed in parallel.
摘要:
A distributed-memory multi-processor system includes a plurality of cells communicatively coupled to each other and collectively including a plurality of processors, caches, main memories, and cell controllers. Each of the cells includes at least one of the processors, at least one of the caches, one of the main memories, and one of the cell controllers. Each of the cells is configured to perform memory migration functions for migrating memory from a first one of the main memories to a second one of the main memories in a manner that is invisible to an operating system of the system.
摘要:
A method of accessing a plurality of memories in an interleaved manner using a contiguous logical address space includes providing at least one map table. The at least one map table includes a plurality of entries. Each entry includes a plurality of entry items. Each entry item identifies one of the memories. A first logical address is received. The first logical address includes a plurality of address bits. The plurality of address bits includes a first set of address bits corresponding to a first set of entries in the at least one map table. A first entry in the first set of entries is identified based on the first set and a second set of the address bits. A first entry item in the first entry is identified based on a third set of the address bits. The memory identified by the first entry item is accessed.
摘要:
A shared bus system having a bus and a set of client modules coupled to the bus. Each client module is capable of sending transactions on the bus to other client modules and receiving transactions on the bus from other client modules for processing. Each module has a queue for storing transactions received by the module for processing. A bus controller limits the types of transactions that can be sent on the bus to prevent any module's queue from overflowing.
摘要:
Apparatus and method for efficiently sharing data in support of hardware he coherency and coordinated in software with semaphore instructions. Accordingly, a new instruction called "Load-Bias" which, in addition to normal load operations, requests a private copy of the data, and hints to the hardware cache to try to maintain ownership until the next memory reference from that processor. When used with the Cmpxchg instruction semaphore operation, the Load-Bias instruction will reduce coherency traffic, and minimize the possibility of coherency ping-ponging or system deadlock that causes the condition in which no processor is getting useful work done.
摘要:
A new instruction that ensures that the effects of a control register write will be observed at a well defined time is introduced. Specifically, the present invention introduces the concept of a serialization fence instruction. The serialization fence instruction ensures that after a control register in a computer has been modified, all subsequent instructions will observe the effects of the control register modification. Two different serialization fence instructions are illustrated: a data memory reference serialization fence instruction (SRLZ.d) and an instruction fetch serialization fence instruction (SRLZ.i). The data memory reference serialization fence instruction ensures that subsequent instruction executions and data memory references will observe the effects of the control register write. The instruction fetch serialization fence instruction ensures that the entire machine pipeline, starting at the initial instruction fetch stage, will observe the effects of the control register write.