摘要:
Methods and apparatus for instruction restarts and inclusion in processor micro-op caches are disclosed. Embodiments of micro-op caches have way storage fields to record the instruction-cache ways storing corresponding macroinstructions. Instruction-cache in-use indications associated with the instruction-cache lines storing the instructions are updated upon micro-op cache hits. In-use indications can be located using the recorded instruction-cache ways in micro-op cache lines. Victim-cache deallocation micro-ops are enqueued in a micro-op queue after micro-op cache miss synchronizations, responsive to evictions from the instruction-cache into a victim-cache. Inclusion logic also locates and evicts micro-op cache lines corresponding to the recorded instruction-cache ways, responsive to evictions from the instruction-cache.
摘要:
Methods and apparatus for inclusion of TLB (translation look-aside buffer) in processor micro-op caches are disclosed. Some embodiments for inclusion of TLB entries have micro-op cache inclusion fields, which are set responsive to accessing the TLB entry. Inclusion logic may the flush the micro-op cache or portions of the micro-op cache and clear corresponding inclusion fields responsive to a replacement or invalidation of a TLB entry whenever its associated inclusion field had been set. Front-end processor state may also be cleared and instructions refetched when replacement resulted from a TLB miss.
摘要:
Methods and apparatus for inclusion of TLB (translation look-aside buffer) in processor micro-op caches are disclosed. Some embodiments for inclusion of TLB entries have micro-op cache inclusion fields, which are set responsive to accessing the TLB entry. Inclusion logic may the flush the micro-op cache or portions of the micro-op cache and clear corresponding inclusion fields responsive to a replacement or invalidation of a TLB entry whenever its associated inclusion field had been set. Front-end processor state may also be cleared and instructions refetched when replacement resulted from a TLB miss.
摘要:
Methods and apparatus for instruction restarts and inclusion in processor micro-op caches are disclosed. Embodiments of micro-op caches have way storage fields to record the instruction-cache ways storing corresponding macroinstructions. Instruction-cache in-use indications associated with the instruction-cache lines storing the instructions are updated upon micro-op cache hits. In-use indications can be located using the recorded instruction-cache ways in micro-op cache lines. Victim-cache deallocation micro-ops are enqueued in a micro-op queue after micro-op cache miss synchronizations, responsive to evictions from the instruction-cache into a victim-cache. Inclusion logic also locates and evicts micro-op cache lines corresponding to the recorded instruction-cache ways, responsive to evictions from the instruction-cache.
摘要:
Methods and apparatus for instruction restarts and inclusion in processor micro-op caches are disclosed. Embodiments of micro-op caches have way storage fields to record the instruction-cache ways storing corresponding macroinstructions. Instruction-cache in-use indications associated with the instruction-cache lines storing the instructions are updated upon micro-op cache hits. In-use indications can be located using the recorded instruction-cache ways in micro-op cache lines. Victim-cache deallocation micro-ops are enqueued in a micro-op queue after micro-op cache miss synchronizations, responsive to evictions from the instruction-cache into a victim-cache. Inclusion logic also locates and evicts micro-op cache lines corresponding to the recorded instruction-cache ways, responsive to evictions from the instruction-cache.
摘要:
Methods and apparatus for using micro-op caches in processors are disclosed. A tag match for an instruction pointer retrieves a set of micro-op cache line access tuples having matching tags. The set is stored in a match queue. Line access tuples from the match queue are used to access cache lines in a micro-op cache data array to supply a micro-op queue. On a micro-op cache miss, a macroinstruction translation engine (MITE) decodes macroinstructions to supply the micro-op queue. Instruction pointers are stored in a miss queue for fetching macroinstructions from the MITE. The MITE may be disabled to conserve power when the miss queue is empty-likewise for the micro-op cache data array when the match queue is empty. Synchronization flags in the last micro-op from the micro-op cache on a subsequent micro-op cache miss indicate where micro-ops from the MITE merge with micro-ops from the micro-op cache.
摘要:
Methods and apparatus for using micro-op caches in processors are disclosed. A tag match for an instruction pointer retrieves a set of micro-op cache line access tuples having matching tags. The set is stored in a match queue. Line access tuples from the match queue are used to access cache lines in a micro-op cache data array to supply a micro-op queue. On a micro-op cache miss, a macroinstruction translation engine (MITE) decodes macroinstructions to supply the micro-op queue. Instruction pointers are stored in a miss queue for fetching macroinstructions from the MITE. The MITE may be disabled to conserve power when the miss queue is empty-likewise for the micro-op cache data array when the match queue is empty. Synchronization flags in the last micro-op from the micro-op cache on a subsequent micro-op cache miss indicate where micro-ops from the MITE merge with micro-ops from the micro-op cache.
摘要:
A technique to enable efficient instruction fusion within a computer system. In one embodiment, a processor logic delays the processing of a second instruction for a threshold amount of time if a first instruction within an instruction queue is fusible with the second instruction.
摘要:
A technique to enable efficient instruction fusion within a computer system. In one embodiment, a processor logic delays the processing of a second instruction for a threshold amount of time if a first instruction within an instruction queue is fusible with the second instruction.
摘要:
A technique to enable efficient instruction fusion within a computer system. In one embodiment, a processor logic delays the processing of a second instruction for a threshold amount of time if a first instruction within an instruction queue is fusible with the second instruction.