Abstract:
A parallel processing architecture for a digital processor capable of alternately operating in a single-threaded mode, a SIMD (single instruction, multiple data) mode and a MIMD (multiple instructions, multiple data) mode. The instruction set for the processor includes instructions for switching between modes and for exchanging data between the parallel processing paths. The hardware in any instruction path, or portion of an instruction path, that is not being used is deactivated to save power.
Abstract:
This disclosure describes a snooping coherency protocol for a multiprocessor network wherein every processor has its own private cache and bus interface means and the network is connected via a common system bus. Each processor has its own cache directory and image directory that duplicate each other non-atomically. The snooping protocol utilizes the duality of the directories, coupled with the non-atomicity of directory updates, to maximize processor-cache availability and minimize processor-cache access times, thus supporting high-performance architectures.
Abstract:
In a conventional modified Harvard architecture, read operations are performed in the same cycle only when the different read operations access different memory banks. However, when different sublines in the same memory bank are being accessed, cycles may be saved by accessing both sublines in the same cycle.
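
The pairing rule described above can be illustrated with a minimal sketch of a cycle-count model. The `Addr` decomposition into bank, line and subline, the function name `cycles_for_pair`, and the requirement that the two sublines lie in the same line of the bank are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import NamedTuple


class Addr(NamedTuple):
    """Hypothetical address decomposition (not taken from the abstract)."""
    bank: int
    line: int
    subline: int


def cycles_for_pair(a: Addr, b: Addr, subline_pairing: bool) -> int:
    """Return the cycles needed to service two reads issued together."""
    if a.bank != b.bank:
        # Conventional case: different banks can be read in one cycle.
        return 1
    if subline_pairing and a.line == b.line and a.subline != b.subline:
        # Assumed case: same bank, same line, different sublines.
        return 1
    # Simplified model: any other same-bank pair is serialized.
    return 2


# Two reads to different sublines of the same bank and line.
pair = (Addr(bank=0, line=5, subline=0), Addr(bank=0, line=5, subline=1))
conventional = cycles_for_pair(*pair, subline_pairing=False)
paired = cycles_for_pair(*pair, subline_pairing=True)
```

Under these assumptions the same pair of reads costs two cycles in the conventional scheme and one cycle when subline pairing is enabled.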
Abstract:
Most recently accessed frames are locked in a cache memory. The most recently accessed frames are likely to be accessed by a task again in the near future and may be locked at the beginning of a task switch or interrupt to improve cache performance. The list of most recently used frames is updated as a task executes and may be embodied as a list of frame addresses or a flag associated with each frame. The list of most recently used frames may be separately maintained for each task if multiple tasks may interrupt each other. An adaptive frame unlocking mechanism is also disclosed that automatically unlocks frames that may cause a significant performance degradation for a task. The adaptive frame unlocking mechanism monitors a number of times a task experiences a frame miss and unlocks a given frame if the number of frame misses exceeds a predefined threshold.
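
The locking and adaptive unlocking behavior described above might be sketched as follows. This is a toy model, not the disclosed mechanism: the class name `FrameLockingCache`, its parameters, the LRU replacement policy, the choice of which locked frame to release, and resetting the miss counter after an unlock are all assumptions introduced for illustration.

```python
from collections import OrderedDict


class FrameLockingCache:
    """Toy cache that locks most recently used frames at a task switch."""

    def __init__(self, num_frames: int, mru_size: int, miss_threshold: int):
        self.num_frames = num_frames
        self.mru_size = mru_size
        self.miss_threshold = miss_threshold
        self.frames = OrderedDict()  # frame address -> None, oldest first
        self.locked = set()          # addresses of locked frames
        self.misses = 0              # misses seen since the last lock/unlock

    def task_switch(self) -> None:
        """Lock the most recently used frames and restart miss counting."""
        recent = list(self.frames)[-self.mru_size:] if self.mru_size else []
        self.locked = set(recent)
        self.misses = 0

    def access(self, addr: int) -> bool:
        """Access a frame; return True on a hit and False on a miss."""
        if addr in self.frames:
            self.frames.move_to_end(addr)
            return True
        self.misses += 1
        if self.locked and self.misses > self.miss_threshold:
            # Adaptive unlocking (assumed policy): release the least
            # recently used locked frame and restart the miss count.
            victim = next(a for a in self.frames if a in self.locked)
            self.locked.discard(victim)
            self.misses = 0
        if len(self.frames) >= self.num_frames:
            # Evict the least recently used frame that is not locked.
            victim = next((a for a in self.frames if a not in self.locked), None)
            if victim is None:
                return False  # every frame is locked: bypass the cache
            del self.frames[victim]
        self.frames[addr] = None
        return False


cache = FrameLockingCache(num_frames=2, mru_size=1, miss_threshold=2)
cache.access(1)
cache.access(2)
cache.task_switch()      # frame 2 is the most recently used and gets locked
cache.access(3)
cache.access(4)          # frame 2 survives these two misses
survived = 2 in cache.frames
cache.access(5)          # third miss exceeds the threshold and unlocks frame 2
```

In this run the locked frame survives the first two misses of the interrupting task and is unlocked and evicted on the third, once the miss count exceeds the threshold.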
Abstract:
A first register stores a value that can be used as a pointer to indirectly address a second register. The first register is referred to as a pointer register and the pointer as a register pointer. The second register may be a conventional register that stores a conventional register value (i.e., a data value or a pointer to a data value stored in external memory) or another pointer register. In certain embodiments, a pointer register can also be used to store conventional register values. Pointer registers of the present invention can be used to efficiently implement certain types of digital processing, such as circular buffers, vector processing, convolutional processing, and partitioned processing, using data in registers rather than memory.
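
As a rough illustration of the circular-buffer use case, the sketch below models a register file as a Python list and a pointer register as an integer holding a register index. The 16-entry register file, the function name `moving_sum`, the window placement starting at register 4, and the wrap-around update rule are hypothetical choices, not details from the abstract.

```python
def moving_sum(samples, window=4, base=4):
    """Sliding-window sum kept entirely in a modeled register file."""
    regs = [0] * 16   # hypothetical 16-entry register file
    ptr = base        # pointer register: holds the index of another register
    out = []
    for s in samples:
        regs[ptr] = s                             # indirect write via the pointer
        ptr = base + (ptr - base + 1) % window    # circular post-increment
        out.append(sum(regs[base:base + window]))  # sum the buffer registers
    return out


result = moving_sum([1, 2, 3, 4, 5])
```

Here the newest sample overwrites the oldest one in registers 4 through 7, so the window slides without any load from or store to memory in the model.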
Abstract:
An apparatus and method that speed the processing of data vectors in a digital processor are disclosed. In accordance with the present invention, a vector zero overhead loop with parallel issue processes multiple data elements at the same time, yet is programmed with readable assembly language and requires neither vector registers nor many extra registers to implement.