摘要:
A system and method for optimizing a series of traces to be executed by a processing core is disclosed. The lines of a trace are sent to an optimizer each time they are sent to a processing core to be executed. Runtime information may be collected on a line of a trace each time that trace is executed by a processing core. The runtime information may be used by the optimizer to better optimize the micro-operations of the lines of the trace. The optimizer optimizes a trace each time the trace is executed to improve the efficiency of future iterations of the trace. Most of the optimizations result in a reduction of the number of μops within the trace. The optimizer may optimize two or more lines at a time in order to find more opportunities to remove μops and shorten the trace. The two lines may be alternately offset so that each line has the maximum allowed number of micro-operations.
摘要:
A method and apparatus for enabling an adaptive replay loop in a processor. More particularly, the present invention relates to allowing instructions in the replay loop to change its relative position, thereby decreasing the latency for execution of instructions, resolving dynamic resource conflicts, and also increasing the overall efficiency of the processor.
摘要:
Embodiments of the present invention relate to a memory management scheme and apparatus that enables efficient memory renaming. The method includes computing a store address, writing the store address in a first storage, writing data associated with the store address to a memory, and de-allocating the store address from the first storage, allocating the store address in a second storage, predicting a load instruction to be memory renamed, computing a load store source index, computing a load address, disambiguating the memory renamed load instruction, and retiring the memory renamed load instruction, if the store instruction is still allocated in at least one of the first storage and the second storage and should have effectively provided to the load the full data. The method may also include re-executing the load instruction without memory renaming, if the store instruction is not in at the first storage or in the second storage.
摘要:
Embodiments of the present invention relate to a method and system for providing virtual identifiers corresponding to physical registers in a computer processor. According to the embodiments, the virtual identifiers may be used to represent the physical registers during operations in a pipeline of the processor.
摘要:
Embodiments of the present invention relate to a system and method for associating a register file cache with a register file in a computer processor.
摘要:
A processor is disclosed and includes at least one core including a first core, and interrupt delay logic. The interrupt delay logic is to receive a first interrupt at a first time and delay the first interrupt from being processed by a first time delay that begins at the first time, unless the first interrupt is pending at a second time when a second interrupt is processed by the first core. If the first interrupt is pending at the second time, the interrupt delay logic is to indicate to the first core to begin to process the first interrupt prior to completion of the first time delay. Other embodiments are disclosed and claimed.
摘要:
Method, apparatus, and program means for a programmable event driven yield mechanism that may activate other threads. In one embodiment, an apparatus includes execution resources to execute a plurality of instructions and a monitor to detect a condition indicating a low level of progress. The monitor can disrupt processing of a program by transferring to a handler in response to detecting the condition indicating a low level of progress. In another embodiment, thread switch logic may be coupled to a plurality of event monitors which monitor events within the multithreading execution logic. The thread switch logic switches threads based at least partially on a programmable condition of one or more of the performance monitors.
摘要:
Method, apparatus, and system for a programmable event-driven yield mechanism. The mechanism may disrupt processing of a program to deliver a yield event. The event may be treated as a fault-like yield event or a trap-like event. For a fault-like yield event, the faulting instruction is canceled before retirement and processor state is not updated before the yield event is delivered. For a trap-like yield event the instruction causing the trap is retired and the yield event is delivered on an interrupt boundary. Multiple pending yield events may be handled according to priority. Other embodiments are also described and claimed.
摘要:
The latencies associated with retrieving instruction information for a main thread are decreased through the use of a simultaneous helper thread. The helper thread is a speculative prefetch thread to perform instruction prefetch and/or trace pre-build for the main thread.
摘要:
A technique for using memory attributes to relay information to a program or other agent. More particularly, embodiments of the invention relate to using memory attribute bits to check various memory properties in an efficient manner.