摘要:
The invention controls maximum average power dissipation by stalling high power instructions through the pipeline of a pipelined processor. A power dissipation controller stalls the high power instructions in order to control the processor's maximum average power dissipation. Preferably, the controller is modeled after a capacitive system with a constant output rate and a throttled input rate: the output rate represents the steady state maximum average power dissipation; while the input rate is stalled based upon current capacity, representing thermal response time. At start-up, the capacity is initialized. Yet for each high power instruction, the capacity increases by a weighted value. Each clock capacity is also decreased by a variable output rate. In particular, a low power operation is inserted to the stage execution circuit where the stall is desired, creating a low power state for that circuit. This stall effectively creates a “hole” at that pipeline stage, thus temporarily reducing power dissipation. The invention takes advantage of the fact that the presence of an instruction at any stage execution circuit dissipates power and that the absence (i.e., a “hole”) of an instruction at any stage dissipates less power. By controlling where and when a hole occurs within the pipeline, the maximum average power dissipation of the processor is controlled.
摘要:
An apparatus and method provide an apparatus and method for reducing noise production and power consumption in a logic device that uses monotonic logic encoded signals. In particular, the apparatus is accomplished by a recode circuitry that receives and recodes a monotonic logic encoded signal received from a first logic circuit in the logic device, into a reduced switching signal. The recode circuitry sends the reduced switching signal to a second logic circuit. A decode circuitry receives and decodes the reduced switching signal back into a monotonic logic encoded signal. The decode circuitry then sends the monotonic logic encoded signal to a second logic circuit in the logic device. The method is accomplished by receiving a monotonic logic encoded signal from a first logic circuit. The monotonic logic encoded signal is converted into a reduced switching signal and transmitted. The reduced switching signal is received and converted back into the monotonic logic encoded signal.
摘要:
A floating-point unit of a computer includes a floating-point computation unit, floating-point registers and a floating-point status register. The floating-point status register may include a main status field and one or more alternate status fields. Each of the status fields contains flag and control information. Different floating-point operations may be associated with different status fields. Subfields of the floating-point status register may be updated dynamically during operation. The control bits of the alternate status fields may include a trap disable bit for deferring interruptions during speculative execution. A widest range exponent control bit in the status fields may be used to prevent interruptions when the exponent of an intermediate result is within the range of the register format but exceeds the range of the memory format. The floating-point data may be stored in big endian or little endian format.
摘要:
A floating-point unit of a computer includes a floating-point computation unit, floating-point registers and a floating-point status register. The floating-point status register may include a main status field and one or more alternate status fields. Each of the status fields contains flag and control information. Different floating-point operations may be associated with different status fields. Subfields of the floating-point status register may be updated dynamically during operation. The control bits of the alternate status fields may include a trap disable bit for deferring interruptions during speculative execution. A widest range exponent control bit in the status fields may be used to prevent interruptions when the exponent of an intermediate result is within the range of the register format but exceeds the range of the memory format. The floating-point data may be stored in big endian or little endian format.
摘要:
A floating-point unit of a computer includes a floating-point computation it, floating-point registers and a floating-point status register. The floating-point status register may include a main status field and one or more alternate status fields. Each of the status fields contains flag and control information. Different floating-point operations may be associated with different status fields. Subfields of the floating-point status register may be updated dynamically during operation. The control bits of the alternate status fields may include a trap disable bit for deferring interruptions during speculative execution. A widest range exponent control bit in the status fields may be used to prevent interruptions when the exponent of an intermediate result is within the range of the register format but exceeds the range of the memory format. The floating-point data may be stored in big endian or little endian format.
摘要:
Method and apparatus for storing and executing an instruction to load two independent registers with two values is disclosed. In one embodiment, a computer-readable medium is encoded with an instruction including an opcode field specifying that the instruction is an instruction to load two independent registers with a first value and a second value, a source field specifying the first value and the second value, a first target register field specifying a first target register to load with the first value; a second target register field specifying a second target register to load with the second value. A system to execute the instruction is also disclosed.
摘要:
The operation of a pipeline is observed by launching two or more sets of data into the pipeline on consecutive clock cycles. The clock free-runs for as many cycles as it takes the data to propagate through the stages of the pipeline. The output latches of each stage of the pipeline are only sampled when the data of interest is held in each output latch, respectively. Observation may be completely controlled through a standard test access port (TAP). Observation may be accomplished by halting the clock to scan new data in and results out, or with the clock free-running. The inputs to the pipeline may come from test registers or from circuitry which feeds the pipeline during normal operation.
摘要:
A dual transparent latch circuit is disclosed comprising two latches cross coupled together by two control lines to enable the latches collectively to input and output data at twice the frequency of the master clock frequency which controls the timing of each latch individually. The control lines are controlled by a clock generator such that one latch is enabled to receive and store data while the other latch is enabled to output data stored therein. At the same time, the latch receiving and storing the data is disabled from providing an output of the stored data and the latch providing the output is disabled from receiving and storing the data. The clock generator switches the states of the control lines such that they enable or disable the input of data to and output of data from the latches on each phase of the master clock signal. A dual transparent latch with triple edge timing is also disclosed. A method for generating a master signal having a master frequency and selectively enabling inputs at an input data rate greater than the master frequency to input data into memory and selectively enabling outputs at the input data rate to output data from memory.
摘要:
An apparatus for performing floating-point division and square root computations according to an IEEE rounding standard includes input data alignment circuitry, core iteration circuitry, remainder compare circuitry, and round and select circuitry. The core iteration circuitry includes digit selector circuitry; remainder registers; quotient logic circuitry; remainder formation circuitry; and quotient registers for storing the quotient Q, incremented quotient Q+1, and decremented quotient Q-1. The remainder formation circuitry produces sum and carry bits of the P.sub.j+1 term, which are in turn fed back to the partial remainder registers and used in subsequent iterations. The quotient logic circuitry builds the quotient Q and maintains the respective quotient Q, Q+1, Q-1 registers. The outputs of these registers are fed back to the quotient logic circuitry for use in subsequent iterations. The remainder compare circuitry comprises a remainder comparator and a logic circuit. The remainder comparator receives the sum and carry bits for the P.sub.j+1 terms and outputs the "Sign" and "Zero" bits. These bits are received by the logic circuit along with a rounding mode signal, which is indicative of the selected rounding mode, e.g., shifted or normalized round to nearest, round to zero, or round to infinity. The logic circuit outputs a round select signal that selects a quotient select signal for selecting, as the final rounded quotient, the output of one of the quotient registers Q, Q+1, or Q-1. The round and select circuitry includes a round block for positive remainders and a round block for negative remainders.
摘要:
The invention controls maximum average power dissipation by stalling high power instructions through the pipeline of a pipelined processor. A power dissipation controller stalls the high power instructions in order to control the processor's maximum average power dissipation. Preferably, the controller is modeled after a capacitive system with a constant output rate and a throttled input rate: the output rate represents the steady state maximum average power dissipation; while the input rate is stalled based upon current capacity, representing thermal response time. At start-up, the capacity is initialized. Yet for each high power instruction, the capacity increases by a weighted value. Each clock capacity is also decreased by a variable output rate. In particular, a low power operation is inserted to the stage execution circuit where the stall is desired, creating a low power state for that circuit. This stall effectively creates a “hole” at that pipeline stage, thus temporarily reducing power dissipation. The invention takes advantage of the fact that the presence of an instruction at any stage execution circuit dissipates power and that the absence (i.e., a “hole”) of an instruction at any stage dissipates less power. By controlling where and when a hole occurs within the pipeline, the maximum average power dissipation of the processor is controlled.