摘要:
A microprocessor with a floating point unit configured to rapidly execute floating point load control word (FLDCW) type instructions in an out of program order context is disclosed. The floating point unit is configured to schedule instructions older than the FLDCW-type instruction before the FLDCW-type instruction is scheduled. The FLDCW-type instruction acts as a barrier to prevent instructions occurring after the FLDCW-type instruction in program order from executing before the FLDCW-type instruction. Indicator bits may be used to simplify instruction scheduling, and copies of the floating point control word may be stored for instruction that have long execution cycles. A method and computer configured to rapidly execute FLDCW-type instructions in an out of program order context are also disclosed.
摘要:
A microprocessor configured to dynamically switch its floating point load pipeline length from one stage in length to more than one stage in length is disclosed. The microprocessor may perform normal loads and detect denormal loads in a single clock cycle. The microprocessor temporarily stores each scheduled floating point instruction in a reissue buffer for at least one clock cycle. When a denormal load instruction is detected, the microprocessor is configured to add one or more stages to the floating point load pipeline to allow the denormal value to complete the conversion to an internal format. The longer pipeline is then used for all loads that follow the denormal load until there is an idle clock cycle or an abort occurs. At that point, the pipeline reverts back to its original shorter state. In addition, the microprocessor may be configured to cancel instructions scheduled assuming the denormal load would take only one clock cycle to complete. The canceled instruction is then “replayed” during a later clock cycle from the reissue buffer. A method for performing denormal loads and a computer system are also disclosed.
摘要:
A processing system includes sequential entries for storing operations of different types and a scan chain which can identify an operation of a first type which follows after an operation of a second type. The first and second types can be identical so that the scan chain identifies the second operation of a particular type in the sequence. The scan chain includes single-entry "generate", "propagate", "kill", and "only" terms which control a scan bit. Conceptually, if the "only" term is not asserted, an entry of the second type generates the scan bit and asserts the "only" term. After the "only" term is asserted, further generation of the scan bit is inhibited. Each entry either propagates the scan bit to the next entry or if the entry is of the first type, kills the scan bit and identifies itself as the selected entry. Look-ahead logic determines group terms from single-entry terms to indicate whether a scan bit would be generated, propagated, or killed by a group of entries. Accordingly, the scan bit is not required to propagate through every entry, and scans can be performed quickly.
摘要:
A microprocessor having an instruction queue capable of out-of-order instruction dispatch and rapidly selecting one or more oldest eligible entries is disclosed. The microprocessor may comprise a plurality of instruction execution pipelines, an instruction cache, and an instruction queue coupled to the instruction cache and execution pipelines. The instruction queue may comprise a plurality of instruction storage locations and may be configured to output up to a predetermined number of non-sequential out of order instructions per clock cycle. The microprocessor may be further configured with high speed control logic coupled to the instruction queue. The control logic may comprise a number of pluralities of multiplexers, wherein the first plurality of multiplexers are configured to select a first subset of the instructions stored in the queue. The second plurality of multiplexers then select a second subset of instructions from the first subset. This process is repeated in each successive plurality of multiplexers until the oldest eligible entry is selected. A data queue and method for managing a queue are also contemplated, as is a computer system utilizing the above-mentioned microprocessor.
摘要:
A method and apparatus called a cache insertion selector for selecting a slot of a memory cache in which to insert data. The access history of a slot is monitored with a single boolean variable called "used recently". A slot is marked as "used recently" when it is accessed. When a new entry is to be inserted, the cache insertion selector of the present invention attempts to select a slot which is not marked as "used recently". If all slots are marked as used recently, the cache insertion selector marks all slots as not used recently and selects one slot. A slot can be specified for unconditional selection. Also, a slot can be precluded from being selected.
摘要:
A microprocessor having an instruction queue capable of out-of-order instruction dispatch and efficiently detect full conditions is disclosed. The microprocessor may comprise a plurality of instruction execution pipelines, an instruction cache, and an instruction queue coupled to the instruction cache and execution pipelines. The instruction queue may comprise a plurality of instruction storage locations and may be configured to output up to a predetermined number of non-sequential out of order instructions per clock cycle. The microprocessor may be further configured with high speed control logic coupled to the instruction queue. Instead of determining exactly how many empty storage locations are present in the queue, the control logic may be configured to determine whether the number of non-overlapping strings of empty storage locations is greater than or equal to the number of estimated instructions currently on their way to being stored in the instruction queue. A data queue and method for managing a queue are also disclosed, as is a computer system utilizing the above-mentioned microprocessor.
摘要:
A circuit includes a sequential entries for storing objects of different types and a scan chain which can identify an object of a first type which follows after an object of a second type. The first and second types can be identical so that the scan chain identifies the second object of a particular type in the sequence. The scan chain includes single-entry "generate", "propagate", "kill", and "only" terms which control a scan bit. Conceptually, if the "only" term is not asserted, an entry of the second type generates the scan bit and asserts the "only" term. After the "only" term is asserted, further generation of the scan bit is inhibited. Each entry either propagates the scan bit to the next entry or if the entry is of the first type, kills the scan bit and identifies itself as the selected entry. Look-ahead logic determines group terms from single-entry terms to indicate whether a scan bit would be generated, propagated, or killed by a group of entries. Accordingly, the scan bit is not required to propagate through every entry, and scans can be performed quickly.
摘要:
In one embodiment, a method for selecting transistor threshold voltages on an integrated circuit may include assigning a first threshold voltage to selected groups of transistors such as cell instances, for example, and determining which of the selected groups of transistors to assign a second threshold voltage, that is lower than the first threshold voltage, by iteratively performing a cost/benefit analysis. The method may further include determining which of the selected groups of transistors having a third threshold voltage to assign the first threshold voltage by iteratively performing a cost/benefit analysis. The cost/benefit analysis may include calculating a cost/benefit ratio for each group of the selected groups of transistors. In addition, the cost/benefit analysis may include calculating an upcone benefit and a downcone benefit for groups of transistors coupled to one or more inputs and outputs, respectively.
摘要:
Scheduler logic which tracks the relative age of stores with respect to a particular load (and of loads with respect to a particular store) allows a load-store execution controller constructed in accordance with the present invention to hold younger stores until the completion of older loads (and to hold younger loads until completion of older stores). Address matching logic allows a load-store execution controller constructed in accordance with the present invention to avoid load-store (and store-load) dependencies. Hierarchical scan logic supplies the relative age indications of loads with respect to stores (and of stores with respect to loads).
摘要:
A microprocessor having an instruction queue capable of out-of-order instruction dispatch and compaction of unaligned strings of empty storage locations is disclosed. The microprocessor may comprise a plurality of instruction execution pipelines, an instruction cache, and an instruction queue coupled to the instruction cache and execution pipelines. The instruction queue may comprise a plurality of instruction storage locations, each coupled to a single destination storage location. The instruction queue may be configured to output up to a predetermined number of non-sequential out of order instructions per clock cycle. As the instructions are output, gaps of empty storage locations may be formed in the queue. The microprocessor may be configured to compact out strings of empty storage locations greater than a predetermined number. This compaction may be performed by selectively shifting the instructions remaining in the queue either zero or N storage locations, wherein N is a predetermined positive integer. This configuration may simplify control logic associated with the queue while still compacting out many of the empty storage locations. A data queue and method for managing a queue are also contemplated, as is a computer system utilizing the above-mentioned microprocessor.