Abstract:
One embodiment of the present invention provides a system for measuring processor performance during speculative execution. The system starts by executing instructions in a normal-execution mode. The system then enters a speculative-execution episode wherein instructions are speculatively executed without being committed to the architectural state of the processor. While entering the speculative-execution episode, the system enables a speculative execution monitor. The system then uses the speculative execution monitor to monitor instructions during the speculative-execution episode and to record data values relating to that episode. Upon returning to normal-execution mode, the system disables the speculative execution monitor. The data values recorded by the speculative execution monitor facilitate measuring processor performance during speculative execution.
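The enable/record/disable sequence described above can be sketched in a few lines of Python. This is only an illustrative software model, not the patented hardware: the class `SpeculationMonitor`, the function `run`, and the trace format of `(opcode, mode)` pairs are all assumptions made for the example, and the only "data value" recorded is an instruction count per episode.

```python
from dataclasses import dataclass, field


@dataclass
class SpeculationMonitor:
    """Hypothetical monitor that records data only while it is enabled."""
    enabled: bool = False
    episodes: list = field(default_factory=list)

    def enable(self):
        # Called on entry to a speculative-execution episode.
        self.enabled = True
        self.episodes.append({"instructions": 0})

    def record(self, opcode):
        # Instructions seen in normal-execution mode are ignored.
        if self.enabled:
            self.episodes[-1]["instructions"] += 1

    def disable(self):
        # Called on return to normal-execution mode.
        self.enabled = False


def run(trace, monitor):
    """Replay a trace of (opcode, mode) pairs; mode is 'normal' or 'spec'."""
    mode = "normal"
    for opcode, next_mode in trace:
        if mode == "normal" and next_mode == "spec":
            monitor.enable()
        elif mode == "spec" and next_mode == "normal":
            monitor.disable()
        mode = next_mode
        monitor.record(opcode)
    if mode == "spec":
        monitor.disable()
    return monitor.episodes
```

Each entry in the returned list corresponds to one speculative-execution episode, so the per-episode values can be inspected after the run.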
Abstract:
A technique maintains return address stack (RAS) content and alignment of a RAS top-of-stack (TOS) pointer upon detection of a tail-call elimination of a return-type instruction. In at least one embodiment of the invention, an apparatus includes a processor pipeline and at least a first return address stack for maintaining a stack of return addresses associated with instruction flow at a first stage of the processor pipeline. The processor pipeline is configured to maintain the first return address stack unchanged in response to detection of a tail-call elimination sequence of one or more instructions associated with a first call-type instruction encountered by the first stage. The processor pipeline is configured to push a return address associated with the first call-type instruction onto the first return address stack otherwise.
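As a rough illustration of the push-or-leave-unchanged rule, the following Python sketch models a single return address stack. The class name, the fixed depth, and the `is_tail_call_elimination` flag are assumptions for the example. How the pipeline actually detects a tail-call elimination sequence is not described in the abstract and is simply passed in as a boolean here.

```python
class ReturnAddressStack:
    """Hypothetical model of one return address stack (RAS)."""

    def __init__(self, depth=16):
        self.depth = depth      # assumed capacity; not given in the abstract
        self.entries = []       # last element is the top of stack (TOS)

    def on_call(self, return_address, is_tail_call_elimination):
        if is_tail_call_elimination:
            # Leave both the content and the TOS pointer unchanged.
            return
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed overflow policy: drop the oldest
        self.entries.append(return_address)

    def on_return(self):
        # Predict the return target from the TOS, if any.
        return self.entries.pop() if self.entries else None
```

With this model, a call flagged as part of a tail-call elimination sequence pushes nothing, so the next return still pops the address pushed by the original caller.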
Abstract:
A circuit for accessing an associative cache is provided. The circuit includes data selection circuitry and an outcome parallel processing circuit both in communication with the associative cache. The outcome parallel processing circuit is configured to determine whether an accessing of data from the associative cache is one of a cache hit, a cache miss, or a cache mispredict. The circuit further includes a memory in communication with the data selection circuitry and the outcome parallel processing circuit. The memory is configured to store a bank select table, whereby the bank select table is configured to include entries that define a selection of one of a plurality of banks of the associative cache from which to output data. Methods for accessing the associative cache are also described.
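One possible reading of the hit, miss, and mispredict outcomes is sketched below in Python. The abstract does not define these outcomes precisely, so the definitions used here are assumptions: a "hit" means the bank chosen by the bank select table holds the requested tag, a "mispredict" means a different bank holds it, and a "miss" means no bank holds it. The function name `access`, the data layout, and the retraining of the table on a mispredict are likewise hypothetical.

```python
def access(banks, bank_select_table, index, tag):
    """Classify one lookup as 'hit', 'mispredict', or 'miss'.

    banks: list of dicts mapping set index -> (tag, data)
    bank_select_table: list mapping set index -> predicted bank number
    """
    predicted = bank_select_table[index]
    # Hardware would compare the tags of all banks in parallel;
    # a loop over the banks stands in for that here.
    matching = [b for b, bank in enumerate(banks)
                if index in bank and bank[index][0] == tag]
    if predicted in matching:
        return "hit", banks[predicted][index][1]
    if matching:
        actual = matching[0]
        bank_select_table[index] = actual  # assumed: retrain on mispredict
        return "mispredict", banks[actual][index][1]
    return "miss", None
```

In this sketch the data selection step is the final indexing into the chosen bank, while the outcome classification is computed from the tag comparison across all banks.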
Abstract:
Embodiments of the present invention provide a system that replaces an entry in a least-recently-used way in a skewed-associative cache. The system starts by receiving a cache line address. The system then generates two or more indices using the cache line address. Next, the system generates two or more intermediate indices using the two or more indices. The system then uses at least one of the two or more indices or the two or more intermediate indices to perform a lookup in one or more lookup tables, wherein the lookup returns a value which identifies a least-recently-used way. Next, the system replaces the entry in the least-recently-used way.
Abstract:
One embodiment of the present invention provides a system that samples instructions on a processor that supports speculative execution. The system starts by selecting an instruction received from an instruction fetch unit or from a deferred queue, wherein the deferred queue holds instructions that are deferred because of an unresolved data dependency. The system then records information about the selected instruction during execution of the selected instruction, whereby the recorded information can be used to determine the performance of the processor.
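A minimal sketch of this kind of sampling, assuming a fixed sampling interval and a merged stream of instructions tagged with their source, might look as follows. The function `sample`, the interval-based selection policy, and the recorded fields (`source`, `opcode`, `cycles`) are hypothetical and are not taken from the abstract, which does not say how an instruction is chosen or what is recorded.

```python
def sample(stream, interval):
    """Record every `interval`-th instruction from a merged stream.

    stream: iterable of (source, opcode, cycles) tuples, where source is
            'fetch' (instruction fetch unit) or 'deferred' (deferred queue).
    """
    records = []
    for n, (source, opcode, cycles) in enumerate(stream, start=1):
        if n % interval == 0:
            # Keep the source so that fetched and deferred instructions
            # can be analysed separately later.
            records.append({"source": source, "opcode": opcode, "cycles": cycles})
    return records
```

Because deferred instructions pass through the same selection point as newly fetched ones, both kinds can end up in the recorded samples.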
Abstract:
A technique for coordinating execution of instructions in a processor that allows instructions to execute out-of-order includes decoding a particular instruction that is defined in accordance with an instruction set of the processor. A helper sequence of instructions that corresponds to the particular instruction is then introduced into a stream of executable operations. The corresponding helper sequence includes a first artificial dependency instruction that codes a dependency on a register that is not actually employed as a register source or target for an operation performed by the particular instruction.
Abstract:
One embodiment of the present invention provides a system that counts speculatively-executed instructions for performance analysis purposes. During operation, the system counts instructions that are executed in a normal-execution mode. Next, the system enters a speculative-execution mode wherein instructions are speculatively executed without being committed to the architectural state of the processor. During the speculative-execution mode, the system counts the speculatively-executed instructions in a manner that enables the count of speculatively-executed instructions to be reset if the speculative execution fails.
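One way to keep a speculative count resettable is to hold it in a counter separate from the normal-execution count, as in the Python sketch below. The class and method names are hypothetical. Folding the speculative count into the committed count on success is an assumption as well, since the abstract only states that the count can be reset when speculation fails.

```python
class InstructionCounters:
    """Hypothetical pair of counters for normal and speculative execution."""

    def __init__(self):
        self.committed = 0    # instructions counted in normal-execution mode
        self.speculative = 0  # instructions counted in the current speculation

    def count_normal(self):
        self.committed += 1

    def count_speculative(self):
        self.speculative += 1

    def speculation_failed(self):
        # Discard the speculative count; the committed count is untouched.
        self.speculative = 0

    def speculation_succeeded(self):
        # Assumed policy: successful speculative work joins the committed count.
        self.committed += self.speculative
        self.speculative = 0
```

Keeping the two counts apart means a failed speculation never has to be subtracted back out of the committed total.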
Abstract:
One embodiment of the present invention provides a system that avoids register read-after-write (RAW) hazards upon returning from a speculative-execution mode. This system operates within a processor with an in-order architecture, wherein the processor includes a short-latency scoreboard that delays issuance of instructions that depend upon uncompleted short-latency instructions. During operation, the system issues instructions for execution in program order during execution of a program in a normal-execution mode. Upon encountering a condition (a launch condition) during an instruction (a launch-point instruction), which causes the processor to enter the speculative-execution mode, the system generates a checkpoint that can subsequently be used to return execution of the program to the launch-point instruction, and commences execution in the speculative-execution mode. Upon encountering a condition that causes the processor to leave the speculative-execution mode and return to the launch-point instruction, the system uses the checkpoint to resume execution in the normal-execution mode from the launch-point instruction. In doing so, the system ensures that entries that were in the short-latency scoreboard prior to entering the speculative-execution mode, and which are not yet resolved, are accounted for in order to prevent register RAW hazards when resuming execution from the launch-point instruction.
Abstract:
One embodiment of the present invention provides a system that avoids a live-lock state in a processor that supports speculative execution. The system starts by issuing instructions for execution in program order during execution of a program in a normal-execution mode. Upon encountering a launch condition during the execution of an instruction (a “launch instruction”) which causes the processor to enter a speculative-execution mode, the system checks status indicators associated with a forward progress buffer. If the status indicators indicate that the forward progress buffer contains data for the launch instruction, the system resumes normal-execution mode. Upon resumption of normal-execution mode, the system retrieves the data from a data field contained in the forward progress buffer and executes the launch instruction using the retrieved data as input data for the launch instruction. The system next deasserts the status indicators. The system then continues to issue instructions for execution in program order in normal-execution mode. Using the forward progress buffer in this way prevents the processor from entering a potential live-lock state.
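The check-retrieve-deassert sequence can be modelled in Python as follows. This is a sketch under stated assumptions: the buffer is modelled with a single valid bit and a program-counter tag as its status indicators, and the `fill` method (how data gets into the buffer) is hypothetical, since the abstract does not describe when the buffer is written.

```python
class ForwardProgressBuffer:
    """Hypothetical single-entry forward progress buffer."""

    def __init__(self):
        self.valid = False  # status indicator
        self.pc = None      # which launch instruction the data belongs to
        self.data = None    # data field

    def fill(self, pc, data):
        # Assumed: written when the data the launch instruction waited on arrives.
        self.valid, self.pc, self.data = True, pc, data

    def holds(self, pc):
        return self.valid and self.pc == pc


def on_launch_condition(launch_pc, fpb):
    """Decide which mode to execute the launch instruction in."""
    if fpb.holds(launch_pc):
        data = fpb.data
        fpb.valid = False            # deassert the status indicators
        return ("normal", data)      # execute with the buffered input data
    return ("speculative", None)     # no buffered data: speculate as usual
```

Because the buffered data guarantees that the launch instruction completes on its next attempt, the processor cannot keep re-entering speculative-execution mode on the same instruction indefinitely.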