Abstract:
A computer system includes one or more microprocessors. The microprocessors assign a priority level to each memory operation as it is initiated. In one embodiment, the priority levels include a fetch priority level and a prefetch priority level. The fetch priority level is higher than the prefetch priority level and is assigned to memory operations that are the direct result of executing an instruction. The prefetch priority level is assigned to memory operations generated according to a prefetch algorithm implemented by the microprocessor. As memory operations are routed through the computer system to main memory and the corresponding data is transmitted, the elements involved in performing the memory operations are configured to interrupt the data transfer for a lower priority memory operation in order to perform the data transfer for a higher priority memory operation. While one embodiment of the computer system employs at least a fetch priority and a prefetch priority, the concept of applying priority levels to memory operations and interrupting data transfers of lower priority memory operations in favor of higher priority memory operations may be extended to other types of memory operations, even if prefetching is not employed within the computer system. For example, speculative memory operations may be prioritized lower than non-speculative memory operations throughout the computer system.
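The interruption behavior described above can be illustrated with a small cycle-by-cycle model. This is only a sketch under stated assumptions: the `MemOp` fields, the notion of a data "beat", and the `simulate` function are hypothetical names introduced for illustration and are not taken from the abstract.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    PREFETCH = 0  # generated by a prefetch algorithm
    FETCH = 1     # direct result of executing an instruction


@dataclass
class MemOp:
    name: str
    priority: Priority
    arrival: int     # cycle at which the operation reaches the bus (assumed)
    beats_left: int  # data beats still to transfer (assumed unit)


def simulate(ops):
    """Return, cycle by cycle, the name of the operation whose data moves."""
    timeline = []
    cycle = 0
    while any(op.beats_left > 0 for op in ops):
        ready = [op for op in ops if op.arrival <= cycle and op.beats_left > 0]
        if ready:
            # Highest priority wins; ties go to the earliest arrival.
            current = max(ready, key=lambda op: (op.priority, -op.arrival))
            current.beats_left -= 1
            timeline.append(current.name)
        else:
            timeline.append(None)  # bus idle this cycle
        cycle += 1
    return timeline
```

In this model a prefetch that is mid-transfer simply loses the bus for as long as a fetch has beats outstanding, then resumes where it stopped.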
Abstract:
A microprocessor includes an expanded set of registers in addition to the architected set of registers specified by the microprocessor architecture employed by the microprocessor. The expanded set of registers is memory-mapped within the context of the program being executed. Upon a context switch, the microprocessor saves the state of the expanded registers to the corresponding memory locations. An application program may make use of the expanded registers by assigning the most-often used operands in the program to the set of memory locations corresponding to the expanded registers. The application programmer may then code instructions which access these operands with register identifiers corresponding to the expanded registers. In one embodiment, the microprocessor implements a portion of the expanded registers instead of the entire set of expanded registers. Accesses to the implemented portion of the expanded registers are performed as register accesses, while accesses to the unimplemented portion are converted to memory accesses. The decode unit within the microprocessor may be configured to convert instructions which are coded to access the unimplemented expanded registers into memory operations to access the corresponding memory location.
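A minimal sketch of the decode-time conversion and the context-switch save may help. The register counts, the word size, and the names `decode_operand` and `save_expanded` are all assumptions made for illustration; the abstract does not specify them.

```python
ARCHITECTED = 8   # hypothetical number of architected registers
EXPANDED = 16     # hypothetical size of the full expanded set
IMPLEMENTED = 4   # hypothetical number of expanded registers built in hardware
WORD = 4          # bytes per register (assumed)


def decode_operand(reg_id, expanded_base):
    """Map a register identifier to ("reg", index) or ("mem", address)."""
    if not 0 <= reg_id < ARCHITECTED + EXPANDED:
        raise ValueError(f"invalid register identifier: {reg_id}")
    if reg_id < ARCHITECTED + IMPLEMENTED:
        # Architected or implemented expanded register: plain register access.
        return ("reg", reg_id)
    # Unimplemented expanded register: convert to its memory-mapped location.
    slot = reg_id - ARCHITECTED
    return ("mem", expanded_base + slot * WORD)


def save_expanded(regfile, memory, expanded_base):
    """On a context switch, copy implemented expanded registers to memory."""
    for slot in range(IMPLEMENTED):
        memory[expanded_base + slot * WORD] = regfile[ARCHITECTED + slot]
```

Because every expanded register has a fixed memory location in the program's context, the same instruction encoding works whether or not a given implementation provides the register in hardware.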
Abstract:
A microprocessor assigns a data transaction type to each instruction. The data transaction type is based upon the encoding of the instruction, and indicates an access mode for memory operations corresponding to the instruction. The access mode may, for example, specify caching and prefetching characteristics for the memory operation. The access mode for each data transaction type is selected to enhance the speed of access by the microprocessor to the data, or to enhance the overall cache and prefetching efficiency of the microprocessor by inhibiting caching and/or prefetching for those memory operations. Instead of relying on data memory access patterns and overall program behavior to determine caching and prefetching operations, these operations are determined on an instruction-by-instruction basis. Additionally, the data transaction types assigned to different instruction encodings may be revealed to program developers. Program developers may use the instruction encodings (and instruction encodings which are assigned to a nil data transaction type causing a default access mode) to optimize use of processor resources during program execution.
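The per-instruction selection of an access mode can be sketched as a pair of lookup tables. The encodings, the transaction type names, and the chosen default mode below are hypothetical placeholders, since the abstract does not enumerate them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessMode:
    cacheable: bool
    prefetch: bool


# Access mode used for the nil data transaction type (assumed default).
DEFAULT_MODE = AccessMode(cacheable=True, prefetch=False)

# Hypothetical data transaction types and their access modes.
MODES = {
    "streaming": AccessMode(cacheable=False, prefetch=True),
    "reuse": AccessMode(cacheable=True, prefetch=True),
    "one_shot": AccessMode(cacheable=False, prefetch=False),
}

# Hypothetical instruction encodings mapped to data transaction types.
ENCODING_TO_TYPE = {0x10: "streaming", 0x11: "reuse", 0x12: "one_shot"}


def access_mode_for(encoding):
    """Pick the access mode from the instruction encoding alone."""
    transaction_type = ENCODING_TO_TYPE.get(encoding)  # None models the nil type
    return MODES.get(transaction_type, DEFAULT_MODE)
```

The point of the sketch is that no access history is consulted: the encoding alone determines caching and prefetching behavior.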
Abstract:
A system may include an instruction cache, a trace cache including a plurality of trace cache entries, and a trace generator coupled to the instruction cache and the trace cache. The trace generator may be configured to receive a group of instructions output by the instruction cache for storage in one of the plurality of trace cache entries. The trace generator may be configured to detect an exceptional instruction within the group of instructions and to prevent the exceptional instruction from being stored in a same one of the plurality of trace cache entries as any non-exceptional instruction.
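One way to picture the trace generator's rule is a packing routine that never mixes exceptional and non-exceptional instructions in one entry. In this sketch the exceptional instruction is given an entry of its own, which is only one possible policy; the function name, the `is_exceptional` predicate, and the entry size are assumptions.

```python
def build_trace_entries(group, is_exceptional, max_len=4):
    """Pack a group of instructions into trace entries (lists).

    An exceptional instruction is isolated in its own entry, so it never
    shares an entry with a non-exceptional instruction.
    """
    entries = []
    current = []
    for instr in group:
        if is_exceptional(instr):
            if current:
                entries.append(current)  # close the entry being built
                current = []
            entries.append([instr])      # exceptional instruction stands alone
        else:
            current.append(instr)
            if len(current) == max_len:
                entries.append(current)
                current = []
    if current:
        entries.append(current)
    return entries
```

For example, with a hypothetical predicate that flags `"div0"`, the group `["add", "mul", "div0", "sub"]` is split into three entries, with `"div0"` alone in the middle one.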
Abstract:
A method and apparatus for retaining flag values when an associated data value dies. A first storage circuit includes a free list for storing physical register names (PRNs) together with indications of whether the physical register associated with each PRN was assigned to store both the logical register result and the flag results of a first instruction, where a subsequent instruction has overwritten the logical register result but not the flags. A second storage circuit stores PRNs separate from the free list. The first and second storage circuits output first and second PRNs, respectively, to a selection circuit. If the first indication (associated with the first PRN) is in a first state, the selection circuit may provide the first PRN to a mapper for assignment to a logical register. If the first indication is in a second state, the second PRN may be provided to the mapper.
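The selection step can be sketched as follows. This is a loose illustration only: it assumes the "second state" means the flags held in the register are still live, it models both storage circuits as queues, and it leaves a held-back PRN at the head of the free list. None of those details are stated in the abstract, and `select_prn` is a hypothetical name.

```python
from collections import deque


def select_prn(free_list, reserve):
    """Choose the PRN handed to the mapper.

    free_list: deque of (prn, flags_live) pairs (first storage circuit).
    reserve:   deque of PRNs kept apart from the free list (second circuit).
    """
    prn, flags_live = free_list[0]
    if not flags_live:
        # First state: the register holds no live flags, so it can be reused.
        free_list.popleft()
        return prn
    # Second state: reusing this register would destroy live flag values,
    # so supply a PRN from the separate storage instead.
    return reserve.popleft()
```

How and when a held-back PRN is eventually released is outside the scope of this sketch.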
Abstract:
Integrated circuits having multiple independently accessible microcode ROMs. An integrated circuit may include a microcode unit and a plurality of microcode ROMs fabricated within the same integrated circuit. The microcode unit may be configured to receive a microcoded instruction and to identify a microcode routine that corresponds to the microcoded instruction. The microcode ROMs may collectively store the microcode routines that implement the microcoded instructions of a complex instruction set, and different microcode ROMs may have different access times. At least one of the microcode ROMs may output operations included in the microcode routine in response to the microcode unit identifying the microcode routine. Microcode routines having more performance criticality may be stored in a microcode ROM having a smaller access latency than the access latency of a microcode ROM in which microcode routines having less performance criticality are stored.
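The placement of routines by performance criticality can be sketched with two ROM objects of different latency. The class, the criticality scores, the capacity parameter, and the function names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class MicrocodeRom:
    name: str
    latency: int  # access latency in cycles (assumed unit)
    routines: dict = field(default_factory=dict)


def assign_routines(routines, fast, slow, fast_capacity):
    """Place the most performance-critical routines in the faster ROM.

    routines: dict mapping instruction name -> (criticality, list of ops).
    """
    ranked = sorted(routines, key=lambda name: routines[name][0], reverse=True)
    for position, name in enumerate(ranked):
        rom = fast if position < fast_capacity else slow
        rom.routines[name] = routines[name][1]


def dispatch(instruction, roms):
    """Return (latency, ops) from whichever ROM stores the routine."""
    for rom in roms:
        if instruction in rom.routines:
            return rom.latency, rom.routines[instruction]
    raise KeyError(instruction)
```

A microcode unit built this way pays the longer latency only for routines that were judged less critical when the ROM contents were laid out.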
Abstract:
A technique for constraint management and validation for template-based device designs is disclosed. The technique includes generating a template-level representation of an electronic device design based on a transistor-level representation of the electronic device design. The template-level representation includes one or more hierarchies of templates. Each template represents a corresponding portion of the electronic device design. The technique further includes determining constraint declarations associated with the electronic device design and verifying whether there is a functional equivalence between the template-level representation and a register-transfer-level (RTL) representation of the electronic device design. The technique additionally includes verifying whether the constraint declarations are valid and verifying the electronic device design responsive to verifying the functional equivalence and verifying the constraint declarations.
Abstract:
An instruction decode unit is described including circuitry coupled to receive an instruction. The instruction identifies multiple operands, one of which is a destination operand. The circuitry responds to the instruction by producing: (i) operand codes specifying the operands, wherein the operand codes are produced in the order in which the operands are identified within the instruction, and (ii) a destination operand signal identifying the destination operand. In one embodiment, the decode unit responds to the instruction by producing the operand codes, operand address information, control signals, and the destination operand signal. A processor including the instruction decode unit is also described, as is a computer system including the processor. The instruction may include operand information which identifies the operands. The instruction may also include destination operand information which indicates which of the operands is the destination operand. The circuitry may produce the destination operand signal dependent upon the destination operand information. The instruction may be a member of an instruction set including instructions having a variable number of bytes. In one particular example, the instruction may be an x86 instruction including operand information which identifies two operands. The instruction may include a direction bit, and the value of the direction bit may indicate which of the two operands is the destination operand. In this case, the circuitry may produce the destination operand signal dependent upon the value of the direction bit.
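For the x86 example, the following sketch decodes a register-to-register two-operand instruction. It relies on the standard x86 encoding in which bit 1 of the opcode is the direction bit and the ModR/M byte carries the `reg` and `r/m` fields. Emitting the operand codes in (`reg`, `r/m`) order and reporting the destination as an index into that pair are assumptions made for illustration, as is the function name.

```python
def decode_two_operand(opcode, modrm):
    """Return (operand_codes, dest_index) for a register-register form.

    operand_codes are always emitted in the same order (reg field, r/m field);
    dest_index plays the role of the destination operand signal.
    """
    mod = (modrm >> 6) & 0b11
    if mod != 0b11:
        raise NotImplementedError("sketch covers register-to-register forms only")
    reg = (modrm >> 3) & 0b111
    rm = modrm & 0b111
    direction = (opcode >> 1) & 1
    operand_codes = (reg, rm)
    # direction = 1: the reg field is the destination; 0: the r/m field is.
    dest_index = 0 if direction else 1
    return operand_codes, dest_index
```

For instance, the byte pairs `89 D8` and `8B D8` yield the same operand codes and differ only in which operand is flagged as the destination.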
Abstract:
A test configuration is provided which allows a plurality of variable delay units within a delay chain of a microprocessor clock generator to be compared with one another. During normal operation, a set of multiplexers interposed within the delay chain is configured such that the variable delay units are electrically coupled in series. An external command signal may be provided to the microprocessor to initiate a test operation in which the variable delay units are tested for possible defects. During the test operation, a control unit sets the multiplexers such that the variable delay units are electrically separated from one another. A common test signal is then driven through two or more of the variable delay units simultaneously, and a compare circuit coupled to the output of each variable delay unit determines whether a transition in the common test signal propagated through each variable delay unit at essentially the same time. If no manufacturing defects are present, the outputs of the variable delay units should be virtually indistinguishable from one another. The results of the compare operation may be driven on external pins of the microprocessor or may be processed internally within the microprocessor. Similar tests may be conducted throughout the entire operating range of the variable delay units.
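The compare operation and the sweep across the operating range can be modeled in a few lines. This is a behavioral sketch, not a circuit description: delay units are represented as functions from a delay setting to a propagation delay in picoseconds, and the tolerance that defines "essentially the same time" is an assumed parameter.

```python
def compare_delay_units(delays_ps, tolerance_ps):
    """True if a common edge leaves every unit at essentially the same time."""
    return max(delays_ps) - min(delays_ps) <= tolerance_ps


def sweep(units, settings, tolerance_ps):
    """Test the units at each delay setting and return the failing settings.

    units: callables mapping a delay setting to a delay in picoseconds.
    """
    return [
        setting
        for setting in settings
        if not compare_delay_units([unit(setting) for unit in units], tolerance_ps)
    ]
```

A unit with a defect that appears only at one setting is caught by the sweep even though it matches its neighbors everywhere else.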