摘要:
A processor includes a device providing a throttling power output signal. The throttling power output signal is used to determine when to logically throttle the power consumed by the processor. At least one core in the processor includes a pipeline having a decode pipe; and a logical power throttling unit coupled to the device to receive the output signal, and coupled to the decode pipe. Following the logical power throttling unit receiving the power throttling output signal satisfying a predetermined criterion, the logical power throttling unit causes the decode pipe to reduce an average number of instructions decoded per processor cycle without physically changing the processor cycle or any processor supply voltages.
摘要:
One embodiment of the present invention supports execution of a start transactional execution (STE) instruction, which marks the beginning of a block of instructions to be executed transactionally. Upon encountering the STE instruction during execution of a program, the system commences transactional execution of the block of instructions following the STE instruction. Changes made during this transactional execution are not committed to the architectural state of the processor until the transactional execution successfully completes.
摘要:
One embodiment of the present invention provides a system that supports executing a fail instruction, which terminates transactional execution of a block of instructions. During operation, the system facilitates transactional execution of a block of instructions within a program, wherein changes made during the transactional execution are not committed to the architectural state of the processor until the transactional execution successfully completes. If a fail instruction is encountered during this transactional execution, the system terminates the transactional execution without committing results of the transactional execution to the architectural state of the processor.
摘要:
A technique for operating a computing apparatus includes allocating a working register file entry corresponding to a register in a working register file when an instruction referencing the register proceeds through a particular stage of the computing apparatus. The technique maintains the working register file entry until at least a predetermined number of subsequent instructions have similarly proceeded through the particular stage.
摘要:
A processor includes a device providing a throttling power output signal. The throttling power output signal is used to determine when to logically throttle the power consumed by the processor. At least one core in the processor includes a pipeline having a decode pipe; and a logical power throttling unit coupled to the device to receive the output signal, and coupled to the decode pipe. Following the logical power throttling unit receiving the power throttling output signal satisfying a predetermined criterion, the logical power throttling unit causes the decode pipe to reduce an average number of instructions decoded per processor cycle without physically changing the processor cycle or any processor supply voltages.
摘要:
One embodiment of the present invention provides a system that selectively monitors store instructions to support transactional execution of a process, wherein changes made during the transactional execution are not committed to the architectural state of a processor until the transactional execution successfully completes. Upon encountering a store instruction during transactional execution of a block of instructions, the system determines whether the store instruction is a monitored store instruction or an unmonitored store instruction. If the store instruction is a monitored store instruction, the system performs the store operation, and store-marks a cache line associated with the store instruction to facilitate subsequent detection of an interfering data access to the cache line from another process. If the store instruction is an unmonitored store instruction, the system performs the store operation without store-marking the cache line.
摘要:
A user is provided with means to sample memory hierarchy via software. This allows a user to enhance memory-level parallelism via software. A status of information needed for execution of a second computer program instruction is read in response to execution of a first computer program instruction. Execution continues with execution of the second computer program instruction upon the status being a first status. Alternatively, a third computer program instruction is executed upon the status being a second status different from the first status. Thus, execution of the first computer program instruction allows control of the memory hierarchy, which in turn give the user control of the memory hierarchy.
摘要:
One embodiment of the present invention provides a system that facilitates a fast execution restart following speculative execution. During normal operation of the system, a processor executes code on a non-speculative mode. Upon encountering a stall condition, the system checkpoints the state of the processor and executes the code in a speculative mode from the point of the stall. As the processor commences execution in speculative mode, it stores copies of instructions as they are issued into a recovery queue. When the stall condition is ultimately resolved, execution in non-speculative mode is recommenced and the execution units are initially loaded with instructions from the recovery queue, thereby avoiding the delay involved in waiting for instructions to propagate through the fetch and the decode stages of the pipeline. At the same time, the processor begins fetching subsequent instructions following the last instruction in the recovery queue. When all the instructions have been loaded from the recovery queue, the execution units begin receiving the subsequent instructions that have propagated through the fetch and decode stages of the pipeline.
摘要:
One embodiment of the present invention provides a system that facilitates selectively unmarking load-marked cache lines during transactional program execution, wherein load-marked cache lines are monitored during transactional execution to detect interfering accesses from other threads. During operation, the system encounters a release instruction during transactional execution of a block of instructions. In response to the release instruction, the system modifies the state of cache lines, which are specially load-marked to indicate they can be released from monitoring, to account for the release instruction being encountered. In doing so, the system can potentially cause the specially load-marked cache lines to become unmarked. In a variation on this embodiment, upon encountering a commit-and-start-new-transaction instruction, the system modifies load-marked cache lines to account for the commit-and-start-new-transaction instruction being encountered. In doing so, the system causes normally load-marked cache lines to become unmarked, while other specially load-marked cache lines may remain load-marked past the commit-and-start-new-transaction instruction.
摘要:
Mechanisms have been developed for providing great flexibility in processor instruction handling, sequencing and execution. In particular, it has been discovered that a configurable predecode mechanism can be employed to select, for respective instruction patterns, between fixed decode and programmable decode paths provided by a processor. In this way, a patchable and/or programmable decode mechanism can be efficiently provided. In some realizations, either (or both) predecode or (and) decode may be configured or reconfigured post-manufacture. In some realizations, either (or both) predecode or (and) decode may be configured at (or about) initialization. In some realizations, either (or both) predecode or (and) decode may be configured at run-time.