摘要:
A computing system includes a main memory and an input/output adapter. The input/output adapter accesses a translation map. The translation map maps input/output page numbers to memory address page numbers. Entries to the translation map are generated so that each entry includes an address of a data page in the main memory and transaction configuration information. The transaction configuration information is utilized by the input/output adapter during data transactions to and from the data page.
摘要:
A coherency scheme of use with a system having a bus, a main memory, a main memory controller for accessing main memory in response to transactions received on the bus, and a set of processor modules coupled to the bus. Each processor module has a cache memory and is capable of transmitting coherent transactions on the bus to other processor modules and to the main memory controller. Each processor module detects coherent transactions issued on the bus and performs cache coherency checks for each of the coherent transactions. Each processor module has a coherency queue for storing all transactions issued on the bus and for performing coherency checks for the transactions in first-in, first-out order. When a module transmits a coherent transaction on a bus, it places its own transaction into its own coherency queue.
摘要:
A multiple round-robin arbitration scheme for a shared bus system that ensures forward progress by each component utilizing the shared bus. In the shared bus system, component modules arbitrate for control of the bus for one or more cycles, and send transactions on the bus during cycles in which they control the bus. The transactions are divided into a set of transaction classes. Certain classes of transactions cannot be issued during certain bus cycles. In certain other cycles, transactions of any class may be issued. The multiple round-robin arbitration scheme ensures forward progress by ensuring that each module seeking to issue a transaction of a given class obtains control of the bus during a cycle when transactions of that class can be issued.
摘要:
A basic instruction for moving a string of bytes in a word has been devised. Because the operations in the instruction are basic, very few variations are necessary to accommodate diversity of lengths and variables. These operations are imbedded in a single code sequence; the compiler can therefore generate exactly the minimum sequence necessary to perform the operations and can precompute many of the operands at compile time, typically completing the instruction within a single cycle time. The control necessary to optimize the operations is then in the compiler instead of the hardware.
摘要:
A method and apparatus calculate a page table index from a virtual address. Employs a combined hash algorithm that supports two different hash page table configurations. A “short format” page table is provided for each virtual region, is linear, has a linear entry for each translation in the region, and does not store tags or chain links. A single “long format” page table is provided for the entire system, supports chained segments, and includes hash tag fields. The method of the present invention forms an entry address from a virtual address, with the entry address referencing an entry of the page table. To form the entry address, first a hash page number is formed from the virtual address by shifting the virtual address right based on the page size of the region of the virtual address. If the computer system is operating with long format page tables, the next step is to form a hash index by combining the hash page number and the region identifier referenced by the region portion of the virtual address, and to form a table offset by shifting the hash index left by K bits, wherein each long format page table entry is 2K bytes long. However, if the computer system is operating with short format page tables, the next step is to form a hash index by setting the hash index equal to the hash page number, and to form a table offset by shifting the hash index left by L bits, wherein each short format page table entry is 2L bytes long. Next, a mask is formed based on the size of the page table. A first address portion is then formed using the base address of the page table and the mask, and a second address portion is formed using the table offset and the mask. Finally, the entry address is formed by combining the first and second address portions. By providing a single algorithm capable of generating a page table entry for both long and short format page tables, the present invention reduces the amount of logic required to access both page table formats, without significantly affecting execution speed.
摘要:
A method and apparatus pre-validate regions in a virtual addressing scheme by storing both the virtual region number (VRN) bits and region identifiers (RIDs) in translation lookaside buffer (TLB) entries. By storing both the VRN bits and RIDs in TLB entries, the region registers can be bypassed when performing most TLB accesses, thereby removing region registers the critical path of the TLB look-up process and enhancing system performance. A TLB in accordance with the present invention includes entries having a valid field, a region pre-validation valid (rpV) field, a virtual region number (VRN) field, a virtual page number (VPN) field, a region identifier (RID) field, a protection and access attributes field, and a physical page number (PPN) field. In addition, a set of region registers contains the RIDs that are active at any given time. When a virtual-to-physical entry is established for a page in a region having an RID stored in a region register, the RID and VRN are stored in the appropriate fields of the TLB entry. In addition, the valid field is set and the rpV field is set to indicate that the TLB entry contains an active VRN-to-RID mapping, thereby pre-validating the region. When a physical address is translated into a virtual address, a VRN and a VPN are extracted from the virtual address and provided to the TLB. The TLB is searched to find an entry having a set valid field, a set rpV field, and VRN and VPN fields containing entries matching the VRN and VPN extracted from the virtual address. If such an entry is found, the protection and access attributes field is used to determine whether the requested access is allowed. If the requested access is allowed, the PPN from the PPN field of the TLB entry is combined with an offset from the virtual address to produce a physical address that is used to complete the memory access.
摘要:
In a method and apparatus that ensures data consistency between an I/O channel and a processor, system software issues an instruction which causes the issuance of a transaction when notification of a DMA completion is received. The transaction instructs the I/O channel to enforce coherency and then responds back only after coherency has been ensured. Specifically, a DMA.sub.-- SYNC transaction is broadcast to all I/O channels in the system. Responsive thereto, each I/O channel writes back to memory any modified lines in its cache that might contain DMA data for a DMA sequence that was reported by the system as completed. The I/O channels have a reporting means to indicate when this transaction is completed, so that the DMA.sub.-- SYNC transaction does not have to complete in pipeline order. Thus, the I/O channel can issue new transactions before responding to the DMA.sub.-- SYNC transaction.
摘要:
The present invention relates to the design of computer systems incorporating virtual memory where a virtual page number is longer than the inherent basic data width of the designed computer system. Instead of storing an entire tag in page table entries, a reduced tag is stored. The reduced tag is sized to be no greater in length than the basic computer data width and therefore a single compare operation will ascertain whether there is a match between the reduced tag and the tag stored in a page table entry. To maintain uniqueness of the page table entries, any bits removed from the virtual address to form the reduced tag are used to form an index into the page table.
摘要:
A cache system buffers data stored in a main memory and utilized by a processor. The cache system includes a first cache, a second cache, a first transfer channel, a second transfer channel and a third transfer channel. The first cache is fully associative. The second cache is directly mapped. The first transfer channel transfers data lines from the main memory to the first cache. The second transfer channel transfers data lines from the first cache to the second cache. The third transfer channel transfers data lines from the second cache to the main memory. Accesses of data lines from the first cache and the second cache are performed in parallel.
摘要:
A method of retrieving data from a multi-set cache memory in a computer system. An address, which includes an index, is presented by the processor to the cache memory. The index is utilized to access the cache to generate an output which includes a block corresponding to the index from each set of the cache. Each block includes an address tag and data. A portion of the address tag for all but one of the blocks is compared with a corresponding portion of the address. If the comparison results in a match, then the data from the block associated with match is provided to the processor. If the comparison does not result in a match, then the data from the remaining block is provided to the processor. A full address tag comparison is done in parallel with the "lookaside tag" comparison to confirm a "hit."