Abstract:
A data processor that executes instructions out of program control order comprises an issue unit, execution means, a floating point exception unit, a precise state unit, a floating point status register, and writing means. The issue unit issues instructions in program control order for execution. The issued instructions include floating point instructions and non-floating point instructions. The execution means executes the issued instructions such that at least the floating point instructions may be executed out of program control order. The floating point exception unit includes a data storage structure including storage elements. Each issued instruction corresponds to one of the storage elements. Each storage element has a floating point instruction identifying field and a floating point trap type field. The floating point exception unit also includes first logic to write, for each issued instruction, data in the floating point instruction identifying field of the corresponding storage element which indicates whether or not the corresponding issued instruction is a floating point instruction. It further includes second logic to write, for each issued floating point instruction which causes during execution one or more floating point execution exceptions that will result in a corresponding one of a plurality of predefined types of floating point execution traps, data in the floating point trap type field of the corresponding storage element which identifies the one of the predefined types of floating point execution traps that will result. The precise state unit retires each issued instruction which does not cause an execution exception during execution and for which all issued instructions preceding it in program control order have been retired.
When a first one of the predefined execution exceptions is caused by an issued instruction, the execution means continues execution of issued instructions and the precise state unit engages in execution trap sequencing by continuing to retire issued instructions until it encounters an issued instruction that cannot be retired. The issued instruction that cannot be retired is one of (a) the issued instruction that caused the first execution exception, or (b) an issued instruction that was issued earlier than the issued instruction that caused the first execution exception but which caused a second execution exception occurring later than the first execution exception. The floating point status register has a floating point trap type field. When the data in the floating point instruction identifying field of the storage element corresponding to the instruction that cannot be retired indicates that this instruction is a floating point instruction, the writing means writes to the floating point trap type field of the floating point status register data identifying the type of floating point execution trap recorded in the floating point trap type field of that storage element.
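The trap sequencing described above can be illustrated with a minimal sketch. It assumes a list of per-instruction storage elements held in program order; the `Entry` class, the `retire_in_order` function, and the trap type strings are hypothetical names chosen for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """One storage element per issued instruction (hypothetical layout)."""
    is_fp: bool = False               # floating point instruction identifying field
    trap_type: Optional[str] = None   # floating point trap type field
    done: bool = False                # execution has completed
    exception: bool = False           # execution caused an exception


def retire_in_order(window: List[Entry]) -> Tuple[int, Optional[str]]:
    """Retire from the oldest entry until one cannot be retired.

    Returns (number retired, value to write to the status register's
    trap type field). The second item is None unless the instruction
    that blocks retirement is a floating point instruction.
    """
    retired = 0
    for entry in window:              # window is in program control order
        if entry.exception:
            # Oldest excepting instruction wins, regardless of which
            # exception was raised first in time.
            return retired, (entry.trap_type if entry.is_fp else None)
        if not entry.done:
            break                     # still executing: stop retiring for now
        retired += 1
    return retired, None


# Illustrative trap type strings only (assumed, not from the abstract).
window = [
    Entry(is_fp=False, done=True),
    Entry(is_fp=True, trap_type="ieee_754_exception", done=True, exception=True),
    Entry(is_fp=True, trap_type="unfinished_fpop", done=True, exception=True),
]
result = retire_in_order(window)
```

In this sketch the younger instruction may have raised its exception first in time, yet the older excepting instruction is the one that stops retirement and supplies the trap type.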
Abstract:
A data processor and associated method for taking and returning from traps speculatively. The data processor supports a predefined number of trap levels for taking nested traps each having a corresponding trap level. The data processor comprises means to form checkpoints, means to back up to the checkpoints, means to take a trap, means to return from a trap, registers, and a trap stack unit. The registers have contents that define the state of the data processor each time a trap is taken. The trap stack unit includes a trap stack data storage structure that has a greater number of trap stack storage entries than there are trap levels. It also includes a freelist unit that maintains a current availability list of the trap stack storage entries that are currently available for mapping to one of the trap levels. The freelist unit identifies, each time a trap is taken, a next one of the currently available trap stack storage entries for mapping to the corresponding one of the trap levels. The trap stack unit further includes read/write logic that writes, for each trap taken, the contents of the registers to the next one of the currently available trap stack storage entries. It still further includes rename mapping logic that maintains a current mapping of each trap level to one of the trap stack storage entries. The rename mapping logic replaces, each time a trap is taken, an old mapping of the corresponding trap level to one of the trap stack storage entries with a current mapping of the corresponding trap level to the next one of the currently available trap stack storage entries. The trap stack unit also includes a resource reclaim unit that maintains an unavailability list of each trap stack storage entry not currently mapped to one of the trap levels by the current mappings but unavailable for mapping to one of the trap levels.
The resource reclaim unit adds to the unavailability list, each time a trap is taken, the trap stack storage entry that was mapped to the corresponding trap level by the old mapping, and removes from the unavailability list, each time a taken trap can no longer be undone, the trap stack storage entry that was mapped to the corresponding trap level by the old mapping. The freelist unit adds each trap stack storage entry removed from the unavailability list to the current availability list. Finally, the trap stack unit includes a checkpoint storage unit that includes checkpoint storage entries. Each formed checkpoint has a corresponding checkpoint storage entry so that the checkpoint storage unit stores, for each formed checkpoint, the current mappings of the rename mapping logic and the current availability list of the freelist unit in the corresponding checkpoint storage entry. For each backup to a checkpoint, the rename mapping logic replaces the current mappings it maintains with the mappings stored in the corresponding checkpoint storage entry and the freelist unit replaces the current availability list it maintains with the availability list stored in the corresponding checkpoint storage entry.
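The interplay of the freelist, rename mapping, unavailability list, and checkpoints can be sketched roughly as follows. This is a simplified software model, not the hardware described: the `TrapStack` class and its method names are hypothetical, the initial mapping of level i to entry i is an assumption, and the sketch assumes no taken trap becomes non-undoable between forming a checkpoint and backing up to it.

```python
class TrapStack:
    """Toy model of a renamed trap stack (all names are hypothetical)."""

    def __init__(self, levels, entries):
        assert entries > levels          # more storage entries than trap levels
        self.storage = [None] * entries  # saved register contents per entry
        self.mapping = list(range(levels))         # assumed initial mapping
        self.free = list(range(levels, entries))   # current availability list
        self.unavailable = []            # unmapped but not yet reusable entries
        self.checkpoints = {}

    def take_trap(self, level, regs):
        new = self.free.pop(0)           # next currently available entry
        self.storage[new] = dict(regs)   # write the register contents
        old = self.mapping[level]
        self.mapping[level] = new        # replace the old mapping
        self.unavailable.append(old)     # old entry is needed if trap is undone
        return new

    def trap_committed(self):
        """Oldest taken trap can no longer be undone: reclaim its old entry."""
        self.free.append(self.unavailable.pop(0))

    def form_checkpoint(self, cid):
        self.checkpoints[cid] = (list(self.mapping), list(self.free))

    def backup(self, cid):
        mapping, free = self.checkpoints[cid]
        self.mapping, self.free = list(mapping), list(free)
        # Assumption of this sketch: entries that are mapped or free again
        # after the restore no longer belong on the unavailability list.
        self.unavailable = [e for e in self.unavailable
                            if e not in self.mapping and e not in self.free]


ts = TrapStack(levels=2, entries=4)
ts.form_checkpoint(0)
entry = ts.take_trap(1, {"pc": 0x100})   # speculative trap at level 1
state_after_trap = (list(ts.mapping), list(ts.free), list(ts.unavailable))
ts.backup(0)                             # trap turned out to be mis-speculated
state_after_backup = (list(ts.mapping), list(ts.free), list(ts.unavailable))
```

Because the old entry is only parked on the unavailability list rather than overwritten, undoing the trap requires nothing more than restoring the mapping and the availability list.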
Abstract:
A method and apparatus for generating a check sum and a syndrome for detecting errors in a series of bytes comprising a plurality of stages, each stage comprising a plurality of networks of exclusive OR gates, a memory and an exclusive OR gate for exclusively ORing the outputs of the networks resulting from a byte transmitted therethrough with the results stored in a memory in a previous stage due to a previous byte. Each of the stages and the networks therein corresponds to a term in a Reed-Solomon polynomial. Except for differences in the number and construction of the networks in each stage, each of the stages is substantially identical and can be selectively used for detecting single and double burst errors.
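As a rough software analogue of the staged XOR structure, the sketch below computes Reed-Solomon check bytes and syndromes over GF(2^8). The field polynomial 0x11D, the choice of roots α^0, α^1, ..., and all function names are assumptions made for illustration; the abstract does not specify them. Multiplication by a fixed field constant is a linear map over GF(2), which is what a network of exclusive OR gates implements in hardware.

```python
PRIM = 0x11D  # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8) (shift-and-XOR)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return r


def gf_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r


def syndromes(data, nsyn):
    """One register per root: new = (old * root) XOR incoming byte (Horner)."""
    roots = [gf_pow(2, j) for j in range(nsyn)]
    regs = [0] * nsyn
    for byte in data:
        regs = [gf_mul(s, root) ^ byte for s, root in zip(regs, roots)]
    return regs


def generator_poly(nsyn):
    """Product of (x + alpha^j), coefficients highest degree first."""
    g = [1]
    for j in range(nsyn):
        root = gf_pow(2, j)
        nxt = g + [0]
        for i, c in enumerate(g):
            nxt[i + 1] ^= gf_mul(c, root)
        g = nxt
    return g


def check_bytes(msg, nsyn):
    """Remainder of msg(x) * x^nsyn divided by the generator polynomial."""
    g = generator_poly(nsyn)
    rem = list(msg) + [0] * nsyn
    for i in range(len(msg)):
        coef = rem[i]
        if coef:
            for k, gk in enumerate(g):
                rem[i + k] ^= gf_mul(gk, coef)
    return rem[len(msg):]


message = [0x12, 0x34, 0x56, 0x78, 0x9A]
codeword = message + check_bytes(message, 4)
clean = syndromes(codeword, 4)           # all zero for an error-free codeword
corrupted = list(codeword)
corrupted[2] ^= 0x5A                     # inject a single-byte error
dirty = syndromes(corrupted, 4)          # nonzero syndromes flag the error
```

With the root α^0 = 1 included, the first syndrome is simply the XOR of all received bytes, so for a single-byte error it equals the error value itself.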
Abstract:
A data structure that includes pointers to vertex attributes and primitive descriptions is generated and then processed within a general processing cluster. The general processing cluster includes a vertex attribute fetch unit that fetches from memory vertex attributes corresponding to the vertices defined by the primitive descriptions.
Abstract:
One embodiment of the present invention sets forth a technique for collecting operands specified by an instruction. As a sequence of instructions is received the operands specified by the instructions are assigned to ports, so that each one of the operands specified by a single instruction is assigned to a different port. Reading of the operands from a multi-bank register file is scheduled by selecting an operand from each one of the different ports to produce an operand read request and ensuring that two or more of the selected operands are not stored in the same bank of the multi-bank register file. The operands specified by the operand read request are read from the multi-bank register file in a single clock cycle. Each instruction is then executed as the operands specified by the instruction are read from the multi-bank register file and collected over one or more clock cycles.
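The scheduling idea can be sketched as a simple greedy loop. This is a sketch under stated assumptions: the bank of a register is taken to be `register number % num_banks`, conflicts are resolved by fixed port priority, and the function names are hypothetical; the abstract does not prescribe any of these choices.

```python
from collections import deque


def bank_of(reg, num_banks):
    # Assumed bank mapping: low-order bits of the register number.
    return reg % num_banks


def schedule_reads(instructions, num_ports, num_banks):
    """Return, per clock cycle, the list of (instruction index, register) read.

    Each operand of an instruction goes to a different port. Every cycle at
    most one operand is taken from the head of each port, and no two selected
    operands may live in the same bank.
    """
    ports = [deque() for _ in range(num_ports)]
    for idx, regs in enumerate(instructions):
        assert len(regs) <= num_ports
        for port, reg in zip(ports, regs):
            port.append((idx, reg))

    cycles = []
    while any(ports):
        used_banks = set()
        request = []
        for port in ports:               # fixed priority: lower ports first
            if port:
                _, reg = port[0]
                bank = bank_of(reg, num_banks)
                if bank not in used_banks:
                    used_banks.add(bank)
                    request.append(port.popleft())
        cycles.append(request)
    return cycles


# Two three-operand instructions, four banks, three ports.
cycles = schedule_reads([(0, 4, 1), (2, 3, 6)], num_ports=3, num_banks=4)
```

In this example registers 0 and 4 share a bank, so the first instruction's operands are collected over two cycles, while operands of the second instruction begin to be read in the meantime.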
Abstract:
One embodiment of the present invention sets forth a technique for computing virtual addresses for accessing thread data. Components of the complete virtual address for a thread group are used to determine whether or not a cache line corresponding to the complete virtual address is allocated in the cache. Actual computation of the complete virtual address is deferred until after determining that a cache line corresponding to the complete virtual address is not allocated in the cache.
Abstract:
In a microprocessor, an apparatus is included for coordinating the use of physical registers in the microprocessor. Upon receiving an instruction, the coordination apparatus extracts source and destination logical registers from the instruction. For the destination logical register, the apparatus assigns a physical address to correspond to the logical register. In so doing, the apparatus stores the former relationship between the logical register and another physical register. Storing this former relationship allows the apparatus to backstep to a particular instruction when an execution exception is encountered. Also, the apparatus checks the instruction to determine whether it is a speculative branch instruction. If so, then the apparatus creates a checkpoint by storing selected state information. This checkpoint provides a reference point to which the processor may later back up if it is determined that a speculated branch was incorrectly predicted. Overall, the apparatus coordinates the use of physical registers in the processor in such a way that: (1) logical/physical register relationships are easily changeable; and (2) backup and backstep procedures are accommodated.
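A small model may help show how storing the former logical-to-physical relationship supports both backstepping and backup. The `RenameUnit` class, its method names, and the choice of what a checkpoint stores (the map, the free list, and the history length) are assumptions for illustration; the abstract says only that "selected state information" is stored.

```python
class RenameUnit:
    """Toy register renamer with backstep and backup (hypothetical names)."""

    def __init__(self, num_logical, num_physical):
        self.map = list(range(num_logical))                 # logical -> physical
        self.free = list(range(num_logical, num_physical))  # unused physical regs
        self.history = []       # per instruction: (dest, old_phys, new_phys)
        self.checkpoints = []   # (map copy, free copy, history length)

    def rename(self, dest, srcs, speculative_branch=False):
        src_phys = [self.map[s] for s in srcs]   # sources use current mapping
        if speculative_branch:
            self.checkpoints.append(
                (list(self.map), list(self.free), len(self.history)))
        new = old = None
        if dest is not None:
            new = self.free.pop(0)
            old = self.map[dest]                 # former relationship, kept
            self.map[dest] = new
        self.history.append((dest, old, new))
        return new, src_phys

    def backstep(self, count=1):
        """Undo the most recent instructions one at a time."""
        for _ in range(count):
            dest, old, new = self.history.pop()
            if dest is not None:
                self.map[dest] = old
                self.free.insert(0, new)
        # Drop checkpoints belonging to branches that were just undone.
        while self.checkpoints and self.checkpoints[-1][2] >= len(self.history):
            self.checkpoints.pop()

    def backup(self):
        """Return to the state saved at the most recent speculative branch."""
        saved_map, saved_free, hist_len = self.checkpoints.pop()
        self.map, self.free = saved_map, saved_free
        del self.history[hist_len:]


ru = RenameUnit(num_logical=4, num_physical=8)
first = ru.rename(dest=1, srcs=[2, 3])
second = ru.rename(dest=2, srcs=[1])
ru.backstep()                                    # e.g. second instruction faulted
map_after_backstep = list(ru.map)
ru.rename(dest=None, srcs=[1], speculative_branch=True)
ru.rename(dest=0, srcs=[])
ru.backup()                                      # branch was mispredicted
map_after_backup = list(ru.map)
```

Backstepping walks the stored history one instruction at a time, whereas backup restores a whole saved map in one step; in this sketch, backup also discards the history entry of the branch that formed the checkpoint.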
Abstract:
Time-out checkpoints are formed based on a predetermined time-out condition or interval since the last checkpoint was formed rather than forming a checkpoint to store current processor state based merely on decoded instruction attributes. Such time-out conditions may include the number of instructions issued or the number of clock cycles elapsed, for example. Time-out checkpointing limits the maximum number of instructions within a checkpoint boundary and bounds the time period for recovery from an exception condition. The processor can restore time-out based checkpointed state faster than an instruction decode based checkpoint technique in the event of an exception so long as the instruction window size is greater than the maximum number of instructions within a checkpoint boundary, and such a method eliminates processor state restoration dependency on instruction window size. Time-out checkpoints may be implemented with conventional checkpoints, or in a novel logical and physical register rename map checkpointing technique. Time-out checkpoint formation may be used with conventional processor backup techniques as well as with a novel backtracking technique including processor backup and backstepping.
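The bounding effect of a time-out condition can be seen in a minimal sketch that counts issued instructions. It assumes the time-out condition is "a fixed number of instructions issued since the last checkpoint" and that a checkpoint exists before the first instruction; both function names are hypothetical.

```python
def timeout_checkpoints(num_instructions, max_interval):
    """Indices of instructions immediately preceded by a checkpoint.

    A checkpoint is assumed to exist before instruction 0; a new one is
    formed whenever max_interval instructions have issued since the last.
    """
    checkpoints = [0]
    issued_since_last = 0
    for index in range(num_instructions):
        if issued_since_last == max_interval:
            checkpoints.append(index)
            issued_since_last = 0
        issued_since_last += 1
    return checkpoints


def recovery_distance(checkpoints, faulting_index):
    """Instructions between the nearest earlier checkpoint and the fault."""
    nearest = max(c for c in checkpoints if c <= faulting_index)
    return faulting_index - nearest


points = timeout_checkpoints(num_instructions=10, max_interval=4)
worst_case = max(recovery_distance(points, i) for i in range(10))
```

Whatever the instruction mix, the distance from a faulting instruction back to the nearest checkpoint stays below `max_interval` in this model, which is the sense in which recovery time is bounded.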
Abstract:
A memory device (28) executes memory access operations of two or more storage locations concurrently. The memory device (28) is comprised of a plurality of memory bank decode logic circuits (30, 32, 56) and a plurality of memory banks (34, 52). Each of the decode logic circuits decodes a first information and control signal set to enable a first memory bank to begin and complete a memory access operation. Each memory bank is comprised of a plurality of latch circuits (39, 40, 42, 50) to store a predetermined information and control signal set necessary to perform the memory access operation. A second control signal and information set may, therefore, enable a second memory bank within the memory device (28) to perform a second memory access operation concurrently in time with the first memory access operation.