摘要:
A system and method is disclosed for performing error recovery in a data processing system that supports multiple processing partitions. One or more processors and I/O modules, as well as a portion of the address space of a main memory, is allocated to each partition. In this type of configuration, requests generated by units of multiple partitions are processed by the same queue and state logic of the main memory. When a failure occurs within one processing partition, one or more units are identified as being directly affected by the fault. All requests and responses from, and to, the affected units, as well as any logical residue of these requests and responses are removed from the shared memory queue and state logic in a manner that allows the other partition to continue issuing requests and responses to the memory in a normal manner that does not involve recovery operations.
摘要:
A system and method for testing and/or initializing a Directory Store in a directory-based coherent memory. In one illustrative embodiment, the directory-based coherent memory includes a Main Store for storing a number of data entries, a Directory Store for storing the directory state for at least some of the data entries in the Main Store, and a next state block for determining a next directory state for a requested data entry in response to a memory request. To provide access to the Directory Store, and in one illustrative embodiment, a selector is provided for selecting either the next directory state value provided by the next state block or another predetermined value. The other predetermined value may be, for example, a fixed data pattern, a variable data pattern, a specified value, or any other value suitable for initializing and/or testing the Directory Store. The output of the selector may be written to the Directory Store.
摘要:
Poisoning of specific memory locations as a process when a part of a multiprocessor computer system becomes faulty leads to ability to isolate specific data owned by individual failing units even in a shared memory area. Also continuous processing by non-failing units is allowable. A support processor handles non-immediate problems and allows resetting of memory locations formerly owned by failed units.
摘要:
A directory-based cache coherency system is disclosed for use in a data processing system having multiple Instruction Processors (IP) and multiple Input/Output (I/O) units coupled through a shared main memory. The system includes one or more IP cache memories, each coupled to one or more IPs and to the shared main memory for caching units of data referred to as cache lines. The system further includes one or more I/O memories within ones of the I/O units, each I/O memory being coupled to the shared main memory for storing cache lines retrieved from the shared main memory. Coherency is maintained through the use of a central directory which stores status for each of the cache lines in the system. The status indicates the identity of the IP caches and the I/O memories having valid copies of a given cache line, and further identifies a set of access privileges, that is, the cache line “state”, associated with the cache line. The cache line states are used to implement a state machine which tracks the cache lines and ensures only valid copies of are maintained within the memory system. According to another aspect of the system, the main memory performs continuous tracking and control functions for all cache lines residing in the IP caches. In contrast, the system maintains tracking and control functions for only predetermined cache lines provided to the I/O units so that system overhead may be reduced. The coherency system further supports multiple heterogeneous instruction processors which operate on cache lines of different sizes.
摘要:
A data by-pass system for a hierarchical, multi-level, memory is disclosed. The by-pass system provides by-pass interfaces between storage devices located at predetermined levels within the memory hierarchy. The hierarchical memory system of the preferred embodiment includes a main memory coupled to multiple first storage devices that each stores addressable portions of data signals retrieved from the main memory. To facilitate a more efficient transfer of data between the various storage devices in the memory system, at least one by-pass interface coupling associated ones of the first storage devices is provided. Data retrieved from a target one of the first storage devices in response to a main memory request can be routed to a different requesting one of the first storage devices via the by-pass system without requiring the use of the main memory data interfaces.
摘要:
A programmable address translation system for a modular main memory is provided. The system is implemented using one or more General Register Arrays (GRAs), wherein each GRA performs logical-to-physical address translation for a predetermined address range within the system. Predetermined bits of a logical address are used to address a GRA associated with the logical address range. Data bits read from the GRA are then substituted for the predetermined bits of the logical address to form the physical address. In this manner, non-contiguous addressable banks of physical memory may be mapped to a selectable contiguous address range. By including within the GRA Address a number N of logical address bits used to address contiguous logical addresses, an address translation mechanism is provided which may be programmed to perform between 2-way and 2N-way address interleaving. Each GRA may be re-programmed dynamically to accommodate changing memory conditions as may occur, for example, when a range of memory is logically removed from a system because of errors. Furthermore, GRA reprogramming may occur while memory operations continue within other non-associated address ranges. Additionally, address interleaving may be selected for certain ones of the address ranges, whereas a non-interleaving scheme may be selected for other address ranges.
摘要:
Method and apparatus for a computer system to efficiently operate with multiple instruction processors and input/output subsystem in a symmetrical multi-processing environment. The computer system uses a new storage controller having a high performance interconnect scheme that scales in system performance as additional common storage controller modules are added. The interconnect scheme has the cost advantage of a bus connected system while achieving the performance characteristics of a crossbar connected system.
摘要:
Systems and methods for testing and validation of translated memory banks used in an emulated system are disclosed. One method includes translating one or more banks of non-native instructions into one or more banks of native instructions executable in a computing system having a native instruction set architecture. The one or more banks of non-native instructions define one or more tests of execution of a non-native instruction set architecture. The method also includes loading a memory with instructions and data defined according to the non-native instruction set architecture and addressed by the one or more tests, and triggering, by an emulator, execution of the translated one or more banks of native instructions. The method further includes, upon detection of an error during execution of the translated one or more banks of native instructions, identifying an error in execution of the non-native instruction set architecture by the computing system.
摘要:
A dual-channel memory system and accompanying coherency mechanism is disclosed. The memory includes both a request and a response channel. The memory provides data to a requester such as an instruction processor via the response channel. If this data is provided for update purposes, other read-only copies of the data must be invalidated. This invalidation may occur after the data is provided for update purposes, and is accomplished by issuing one or more invalidation requests via one of the memory request or the response channel. Memory coherency is maintained by preventing a requester from storing any data back to memory until all invalidation activities that may be directly or indirectly associated with that data have been completed.
摘要:
An apparatus for and method of improving the efficiency of a level two cache memory. In response to a level one cache miss, a request is made to the level two cache. A signal sent with the request identifies when the requester does not anticipate a near term subsequent use for the requested data element. If a level two cache hit occurs, the requested data element is marked as least recently used in response to the signal. If a level two cache miss occurs, a request is made to level three storage. When the level three storage request is honored, the requested data element is immediately flushed from the level two cache memory in response to the signal.