Abstract:
A system and method for register renaming and allocation in an out-of-order processing system is described that allows the use of a minimum number of physical registers. A linked list chains the physical register representing a certain instance of a logical register to the physical register representing the next instance of the same logical register. By adding and removing links in this list, the assignment of physical registers to logical registers can be managed dynamically. The physical registers representing speculative instances and those representing in-order instances are managed together. This is done by means of an in-order list, which indicates the physical registers that actually represent the architected state of the machine.
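The linking scheme can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class name `RenameTable`, the `latest` bookkeeping array, and the simple free list are assumptions made for illustration. Only the per-register link and the in-order list come from the abstract.

```python
class RenameTable:
    """Hypothetical model: physical registers chained per logical register."""

    def __init__(self, num_logical, num_physical):
        # Link from one instance of a logical register to its next instance.
        self.next_link = [None] * num_physical
        # In-order list: physical register holding the architected value.
        self.in_order = list(range(num_logical))
        # Most recent (possibly speculative) instance per logical register.
        self.latest = list(range(num_logical))
        # Physical registers not currently representing any instance.
        self.free = list(range(num_logical, num_physical))

    def allocate(self, logical):
        """Rename a destination: append a new instance to the chain."""
        if not self.free:
            return None  # no physical register available; caller must stall
        phys = self.free.pop(0)
        self.next_link[self.latest[logical]] = phys
        self.next_link[phys] = None
        self.latest[logical] = phys
        return phys

    def retire(self, logical):
        """Commit the oldest speculative instance and free its predecessor."""
        old = self.in_order[logical]
        new = self.next_link[old]
        if new is None:
            raise ValueError("no speculative instance to retire")
        self.next_link[old] = None      # remove the link
        self.in_order[logical] = new    # new architected instance
        self.free.append(old)
        return old
```

In this sketch a physical register returns to the free list as soon as the next instance of its logical register becomes the in-order instance, which is how the chain keeps the number of registers in use small.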
Abstract:
A process is disclosed for serializing instructions that must be processed serially in a multiprocessor system by means of a token. The token can be assigned on request to one of the processors, which thereupon has the right to execute the command. If the command consists of distributed tasks, the token remains blocked until the last dependent task belonging to the command has also been executed. Only then can the token be assigned to another instruction. Moreover, a device is described to manage this token, which features three states: a first state, in which the token is available; a second state, in which the token is assigned to one of the processors; and a third state, in which the token is blocked because dependent tasks still have to be carried out. A circuit is also disclosed with which the token principle can be implemented in a simple manner. The token is available only if none of the processors i is in possession of the token and no dependent task is pending at any of the processors. The OR chaining of signals to form a signal C, which is set if the token is not available, is the basic circuitry with which the serialization of commands consisting of distributed tasks is carried out. The invention applies particularly to commands such as IPTE (invalidate page-table entry) and SSKE (set storage key extended), which modify the address translation tables in memory that are shared by all processors.
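The three token states and the OR-chained signal C can be modeled in a few lines of Python. This is a behavioral sketch only: the names `TokenManager`, `request`, `release`, and `task_done` are hypothetical, and the per-processor "holds token" and "task pending" bits are an assumed representation of the signals that feed the OR chain.

```python
from enum import Enum


class TokenState(Enum):
    AVAILABLE = 1  # first state: token can be assigned
    ASSIGNED = 2   # second state: one processor holds the token
    BLOCKED = 3    # third state: dependent tasks still pending


class TokenManager:
    """Hypothetical behavioral model of the token device."""

    def __init__(self, num_processors):
        self.holds = [False] * num_processors    # processor i has the token
        self.pending = [False] * num_processors  # dependent task at processor i

    def signal_c(self):
        # OR chain over all processors: set if the token is NOT available.
        return any(h or p for h, p in zip(self.holds, self.pending))

    def state(self):
        if any(self.holds):
            return TokenState.ASSIGNED
        if any(self.pending):
            return TokenState.BLOCKED
        return TokenState.AVAILABLE

    def request(self, cpu):
        """Grant the token to cpu only if signal C is clear."""
        if self.signal_c():
            return False
        self.holds[cpu] = True
        return True

    def release(self, cpu, dependent_on=()):
        """cpu finishes its part; tasks remain pending on other processors."""
        self.holds[cpu] = False
        for i in dependent_on:
            self.pending[i] = True

    def task_done(self, cpu):
        self.pending[cpu] = False
```

For example, after `release(0, dependent_on=[1, 2])` the token stays blocked until both `task_done(1)` and `task_done(2)` have been called, and only then does `request` succeed for another processor.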
Abstract:
The present invention relates to a data processing apparatus which comprises a microprogrammable processor 1, a random access control store 4 and a read only control store 5 for storage of microinstructions. The random access control store includes a flag microinstruction (REPmark1) for indicating that another microinstruction (add W, 2, W1), stored in the read only control store 5, is faulty. The control stores are coupled to a multiplexer 8 and are adapted to output the microinstructions in parallel to the multiplexer 8 which is in turn coupled to the processor and which selectively provides output from either the random access control store or the read only control store to the processor 1. The apparatus also includes a decoder coupled to the random access control store for observing the microinstructions output therefrom. The decoder is further coupled to inhibiting logic in the processor and outputs a signal if the flag microinstruction is output from the random access control store. The signal causes the inhibiting logic in the processor to inhibit the processor from carrying out the faulty microinstruction.
Abstract:
A set of storage devices is presented, together with a method for storing data to and retrieving data from the storage devices. The set of storage devices provides the function of a multi-writeport cell through the use of a set of single-writeport cells. The storage devices allow for multiple write accesses. The information contained in the set of storage devices is represented by all of the devices together. The stored information may be retrieved via a read operation which accesses a subset of the set of storage devices. A write operation is a staged operation: first, the contents of all of the storage devices which are not to be modified are read. Next, the values to be written to a subset B of the set of storage devices are calculated such that the unmodified contents and the new values of subset B together represent the desired result.
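One way to picture the staged write is an XOR encoding, sketched below in Python. The abstract does not name the encoding, so bitwise XOR is an assumption chosen for illustration, as are the names `XorMultiPortCell`, `read`, and `write`. In this sketch subset B is the single cell owned by the writing port, and a read accesses every cell.

```python
class XorMultiPortCell:
    """Hypothetical model: one single-writeport cell per write port.

    The stored value is the XOR of all cells (an assumed encoding).
    """

    def __init__(self, num_ports, width=8):
        self.cells = [0] * num_ports
        self.mask = (1 << width) - 1

    def read(self):
        # The information is represented by all cells together.
        value = 0
        for cell in self.cells:
            value ^= cell
        return value

    def write(self, port, value):
        # Stage 1: read the contents of the cells that are not modified.
        others = 0
        for i, cell in enumerate(self.cells):
            if i != port:
                others ^= cell
        # Stage 2: compute the value for this port's cell so that all
        # cells together represent the desired result.
        self.cells[port] = (value ^ others) & self.mask
```

Each port only ever writes its own cell, so every physical cell needs just one write port while the logical cell accepts writes from any of them.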
Abstract:
Data caching for use in a computer system including a lower cache memory and a higher cache memory. The higher cache memory receives a fetch request and then determines the state of the entry to be replaced next. If that state indicates that the entry is exclusively owned or modified, the state is changed such that a following cache access is processed at a higher speed than it would be if the state had stayed unchanged.
Abstract:
Embodiments relate to controlling observability of transactional and non-transactional stores. An aspect includes receiving one or more store instructions. The one or more store instructions are initiated within an active transaction and include store data. The active transaction effectively delays committing stores to memory until successful completion of the active transaction. The store data is stored in a local storage buffer, causing alterations to the local storage buffer from a first state to a second state. A signal is received that the active transaction has terminated. If the active transaction has terminated abnormally, then the local storage buffer is reverted back to the first state if the store data was stored by a transactional store instruction, and the store data is propagated to a shared cache if the store instruction is non-transactional.
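The different treatment of the two kinds of store on an abort can be sketched in Python as follows. The class `LocalStoreBuffer`, its methods, and the use of a dictionary as the shared cache are hypothetical stand-ins, and the behavior on successful completion (all buffered stores are propagated) is inferred from the statement that commits are delayed until the transaction completes.

```python
class LocalStoreBuffer:
    """Hypothetical model of a local buffer for stores inside a transaction."""

    def __init__(self, shared_cache):
        self.shared_cache = shared_cache  # dict: address -> value (assumed)
        self.pending = []                 # (address, value, transactional)

    def store(self, address, value, transactional=True):
        # Buffer the store locally; nothing is visible in the shared cache yet.
        self.pending.append((address, value, transactional))

    def end_transaction(self, aborted):
        """Handle the signal that the active transaction has terminated."""
        for address, value, transactional in self.pending:
            if aborted and transactional:
                continue  # transactional store is discarded (buffer reverted)
            # Non-transactional stores always propagate; transactional
            # stores propagate only on successful completion.
            self.shared_cache[address] = value
        self.pending.clear()
```

With a shared cache of `{0x10: 1}`, a transactional store to `0x10` and a non-transactional store to `0x20`, an abort leaves `0x10` unchanged while the value written to `0x20` becomes visible.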
Abstract:
A method to verify an implemented coherency algorithm of a multi processor environment on a single processor model is described, comprising the steps of: generating a reference model reflecting a private cache hierarchy of a single processor within a multi processor environment; stimulating the private cache hierarchy with simulated requests and/or cross invalidations from a core side and/or from a nest side; and augmenting all data available in the private cache hierarchy with two construction dates and two expiration dates, set based on interface events, wherein multi processor coherency is not observed if the cache hierarchy ever returns data to the processor with an expiration date that is older than the latest construction date of all data used before. Further, a single processor model and a computer program product to execute said method are described.
Abstract:
A computer implemented method of cache bounded reference counting for computer languages having automated memory management in which, for example, a reference to an object “Z” initially stored in an object “O” is fetched and the cache hardware is queried whether the reference to the object “Z” is a valid reference, is in the cache, and has a continuity flag set to “on”. If so, the object “O” is locked for an update, a reference counter is decremented for the object “Z” if the object “Z” resides in the cache, and a return code is set to zero to indicate that the object “Z” is de-referenced and that its storage memory can be released and re-used if the reference counter for the object “Z” reaches zero. Thereafter, the cache hardware is similarly queried regarding an object “N” that will become a new reference of object “O”.
Abstract:
A crossbar (20) circuit with multiplexer (22A, 22B) circuits implemented in a polygonal form on a chip. The crossbar can be used for implementing a permutation of input bits (24A, 24B) controlled by a bit vector (25). Horizontal and vertical wiring lengths in the crossbar (20) are reduced by stacking the operand latches (24A, 24B, 25) and horizontal or vertical multiplexers (22A, 22B). This implementation decreases the latency of the crossbar and avoids latches to store intermediate results, thus reducing area and power consumption.
Abstract:
The present invention relates to data processing systems with built-in error recovery from a given checkpoint. In order to checkpoint more than one instruction per cycle, it is proposed to collect updates of a predetermined maximum number of register contents performed by a respective plurality of CISC/RISC instructions in a buffer (CSB) (60) for checkpoint states, whereby a checkpoint state comprises as many buffer slots as there are registers that can be updated by said plurality of CISC instructions and an entry for a Program Counter value associated with the youngest external instruction of said plurality, and to update an Architected Register Array (ARA) (64) with freshly collected register data after determining that no error was detected in the register data after completion of said youngest external instruction of said plurality of external instructions. Handshake synchronization for consistent updates between storage in an L2-cache (66) via a Store Buffer (65) and an Architected Register Array (ARA) (64) is provided, which is based on the youngest instruction ID (40) stored in the Checkpoint State Buffer (CSB) (60).
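The collect-then-commit flow can be sketched in Python under simplifying assumptions. The names `CheckpointState` and `CheckpointUnit`, the tuple format of the instructions, and the boolean error flag are hypothetical. The handshake with the Store Buffer and the L2-cache is left out of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointState:
    """Hypothetical CSB entry for one group of instructions."""
    register_updates: dict = field(default_factory=dict)  # register -> value
    program_counter: int = 0            # PC of the youngest instruction
    youngest_instruction_id: int = -1


class CheckpointUnit:
    def __init__(self, num_registers):
        self.ara = [0] * num_registers  # Architected Register Array
        self.checkpoint_pc = 0

    def collect(self, instructions):
        """Gather the updates of a group given in program order.

        Each item is assumed to be (instruction_id, pc, {register: value}).
        """
        state = CheckpointState()
        for instruction_id, pc, updates in instructions:
            state.register_updates.update(updates)  # younger value wins
            state.program_counter = pc
            state.youngest_instruction_id = instruction_id
        return state

    def commit(self, state, error_detected):
        """Update the ARA only if no error was detected for the group."""
        if error_detected:
            return False  # ARA keeps the previous checkpoint
        for register, value in state.register_updates.items():
            self.ara[register] = value
        self.checkpoint_pc = state.program_counter
        return True
```

Because the ARA is written only in `commit`, an error found before the youngest instruction of a group completes leaves the previous checkpoint intact for recovery.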