摘要:
A hierarchical encoding format for coding repairs to devices within a computing system. A device, such as a cache memory, is logically partitioned into a plurality of sub-portions. Various portions of the sub-portions are identifiable as different levels of hierarchy of the device. A first sub-portion may corresponds to a particular cache, a second sub-portion may correspond to a particular way of the cache, and so on. The encoding format comprises a series of bits with a first portion corresponding to a first level of the hierarchy, and a second portion of the bits corresponds to a second level of the hierarchy. Each of the first and second portions of bits are preceded by a different valued bit which serves to identify the hierarchy to which the following bits correspond. A sequence of repairs are encoded as string of bits. The bit which follows a complete repair encoding indicates whether a repair to the currently identified cache is indicated or whether a new cache is targeted by the following repair. Therefore, certain repairs may be encoded without respecifying the entire hierarchy.
摘要:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
摘要:
A pipelined or superscalar processor (10) that executes operations utilizing operand data of variable bit widths improves parallel performance by partitioning a fixed bit width operand (200) into several partial operand fields (215, 216 and 217), and checking for data dependencies, tagging and forwarding data in these fields independently of one another. An instruction decoder (18) concurrently dispatches multiple ROPs to various functional units (20, 21, 22 and 80). Conflicts which arise with respect to register resources are resolved through register renaming. However, implementation of register renaming is difficult when register structures are overlapping. The present invention supports independent dependency checking, tagging and forwarding of partial bit fields of a register operand which, in combination, allow renaming of registers. Therefore, the variable width register operand structure greatly assists the processor to resolve data dependencies. Operands are tagged by a reorder buffer (26) and supplied with data when it becomes available without regard for the type of data. This method of dependency resolution supports parallel performance of operations and provides a substantial improvement in overall speed of processing. Thus, the processor promotes parallel processing of operations that act upon overlapping data structures which otherwise resist parallel handling.
摘要:
A processor is presented including a cache unit coupled to a bus interface unit (BIU). Address signal selection and masking functions are performed by circuitry within the BIU rather than within the cache unit, and physical addresses produced by the BIU are stored within the TLB. As a result, address signal selection and masking circuitry (e.g., a multiplexer and gating logic) are eliminated from a critical speed path within the cache unit, allowing the operational speed of the cache unit to be increased. The cache unit stores data items, and produces a data item corresponding to a received linear address. A translation lookaside buffer (TLB) within the cache unit stores multiple linear addresses and corresponding physical addresses. When a physical address corresponding to the received linear address is not found within the TLB, the cache unit passes the linear address to the BIU. The BIU includes address translation circuitry, a multiplexer, and gating logic, and returns the physical address corresponding to the linear address to the cache unit. The cache unit stores the physical address and the linear address within the TLB. The processor may also include a programmable control register and a microexecution unit. Upon detecting a change in state of an external masking signal, the microexecution unit may flush the contents of the TLB and modify a masking bit within the control register to reflect a new state of the masking signal.
摘要:
A method of operating a computer system. A first processor sends a first unit of binary information to an input/output (I/O) unit. The I/O unit then conveys the first unit of binary information to a functional unit in the computer system. A system response from the functional unit is then received by the I/O unit, which forwards the system response to the first processor. The system response is also stored in a first buffer. After a predetermined delay time has elapsed, the system response is then forwarded to the second processor.
摘要:
A superscalar processor may issue multiple instructions per clock cycle. Included in a superscalar processor may be a reorder buffer which stores information corresponding to concurrently dispatched instructions. Dependencies may exist among the instructions which are concurrently dispatched. To resolve this dependency, when a dependency is detected amongst a group of concurrently dispatched instructions, an indication of the dependency, along with an indication of the position of the dependency, is conveyed to the corresponding reservation station. When the reservation station receives the indication of the dependency, the operand tag associated with the dependency may be replaced with the correct tag. Advantageously, the circuitry needed to resolve the dependency may be moved out of the critical path of the processor; thus, improving the performance of the processor by allowing it to operate at an increased frequency.
摘要:
In a processor (110) that performs multiple instructions in a single cycle, predicts outcomes of branch conditions and speculatively executes instructions based on the branch predictions, a method and apparatus for operating a data stack utilize a remap array (674) to support a stack exchange capability. The remap array is used to correlate a stack pointer (672) to data elements (700) within the stack. A lookahead stack pointer (502) and remap array (504) are updated to preserve the processor's state of operation while speculative instructions are executed.
摘要:
In a processor a reorder buffer maintains a load/store (LS) fault address register (LSFAR). When the processor's load/store unit reports most LS exceptions, the reorder buffer redirects the microcode unit of the processor to execute a fault handler indicated by an address stored in the LSFAR. The LSFAR may be mapped into the register space of the processor. It may be written by a microcode routine with the address of a specific fault handler at the beginning of a microcode routine or at any time during a microcode routine. As the reorder buffer retires instructions it checks for writes to the LSFAR. If one exists, the reorder buffer loads the result data of that write into the LSFAR. In a preferred embodiment the reorder buffer retires instructions in program order and the LSFAR is not updated speculatively. Also, in a preferred embodiment, when a microcode routine exits, the LSFAR is automatically returned to a default value which indicates a generic fault handling routine.
摘要:
A superscalar microprocessor is provided with a reorder buffer for storing the speculative state of the microprocessor and a register file for storing the real state of the microprocessor. A flags register stores the real state of flags that are updated by flag modifying instructions which are executed by the functional units of the microprocessor. To enhance the performance of the microprocessor with respect to conditional branching instructions, the reorder buffer includes a flag storage area for storing flags that are updated by flag modifying instructions. The flags are renamed to make possible the earlier execution of branch instructions which depend on flag modifying instructions. If a flag is not yet determined, then a flag tag is associated with the flag storage area in place of that flag until the actual flag value is determined. A flag operand bus and a flag tag bus are provided between the flag storage area and the branching functional unit so that the requested flag or flag tags are provided to instructions which are executed in the branching functional unit.
摘要:
A enable circuit (700), employing a "circular carry lookahead" technique to increase its speed performance, is provided for applying two pointers to a circular buffer--an enabling pointer (tail (218)) and a disabling pointer (head (216))--and for generating a multiple-bit enable, ENA (722) in accordance with the pointer values. The pointers designate enable bit boundaries for isolating enable bits of one logic level from enable bits of an opposite logic level. The enable circuit includes several lookahead cells (702, 704, 706 and 708) arranged in an hierarchical array, each of the cells including bits that continue the hierarchical significance. Each cell receives an hierarchical portion of the enabling pointer 218 and the disabling pointer head and a carry. From these pointers, the cell derives a generate, a propagate and the enable bits with a corresponding hierarchical significance. The propagates, generates and carries for all of the lookahead cells are interconnected using a circular propagate carry circuit (710) that provides for asserting a carry to a lookahead cell unless an intervening cell having a nonasserted propagate is interposed in the order of hierarchical significance between the cell and a cell in which enablement is generated.