摘要:
A method for operand width reduction is described, wherein two N-bit input operands (A, B) of a bit width of N are processed and two M-bit output operands (A′, B′) of a reduced bit width of M are generated in a way, that a post-processing comprising an M-bit adder function followed by saturation to M bits performed on said two M-bit output operands (A′, B′) provides an M-bit result equal to an M-bit result of an N-bit modulo adder function of the two N-bit input operands (A, B), followed by a saturation to M bits. Further an electronic computing circuit (1, 5) is described performing said method. Additionally a computer system comprising such an electronic computing circuit is described.
摘要:
A method and system for reducing power consumption when processing mathematical operations. Power may be reduced in processor hardware devices that receive one or more operands from an execution unit that executes instructions. A circuit detects when at least one operand of multiple operands is a zero operand, prior to the operand being forwarded to an execution component for completing a mathematical operation. When at least one operand is a zero operand or at least one operand is “unordered”, a flag is set that triggers a gating of a clock signal. The gating of the clock signal disables one or more processing stages and/or devices, which perform the mathematical operation. Disabling the stages and/or devices enables computing the correct result of the mathematical operation on a reduced data path. When a device(s) is disabled, the device may be powered off until the device is again required by subsequent operations.
摘要:
A binary logic unit to apply any Boolean operation on two input signals (va, vb) is described, wherein any Boolean operation to be applied on the input signals (va, vb) is defined by a particular combination of well defined control signals (ctl0, ctl1, ctl2, ctl3), wherein the input signals (va, vb) are used to select a control signal (ctl0, ctl1, ctl2, ctl3) as an output signal (vo) of the binary logic unit representing the result of a particular Boolean operation applied on the two input signals (va, vb). Furthermore a method to operate such a binary logic unit is described.
摘要:
In a binary floating point processor, the exponents of each of the various types of operands are recoded into an internal format, by biasing the exponents with the minimum exponent value of the result precision (“Emin”), i.e., the recoded value of the exponent is the represented value of the exponent minus Emin. Emin depends only on the result precision of the instruction that is currently being executed in the binary floating point processor. The exponent computations are then performed in this new format. The underflow check for all result precisions is a check against zero and overflow checks are performed against a positive number that depends on the result precision. The exponent values are in a 2's complement representation, so the underflow check simply becomes a check of the sign bit.
摘要:
A method, system and computer program product for reducing power consumption when processing mathematical operations. Power may be reduced in processor hardware devices that receive one or more operands from an execution unit that executes instructions. A circuit detects when at least one operand of multiple operands is a zero operand, prior to the operand being forwarded to an execution component for completing a mathematical operation. When at least one operand is a zero operand or at least one operand is “unordered”, a flag is set that triggers a gating of a clock signal. The gating of the clock signal disables one or more processing stages and/or devices, which perform the mathematical operation. Disabling the stages and/or devices enables computing the correct result of the mathematical operation on a reduced data path. When a device(s) is disabled, the device may be powered off until the device is again required by subsequent operations.
摘要:
A 3-stack floorplan for a floating point unit includes: an aligner located in the center of the floating point unit; a frontend located directly above the aligner; a multiplier located directly below the frontend and next to the aligner; an adder located directly next to the multiplier and directly below the aligner; a normalizer located directly above the adder; and a rounder located directly above the normalizer.
摘要:
The invention relates to a method for performing floating-point instructions within a processor of a data processing system is described, wherein an input of said floating-point instruction comprises a normal or a denormal floating-point number. Said method comprises the steps of storing said floating-point number, normalization of said floating-point number by counting the leading zeros of the mantissa, shifting the fraction part to the left by the number of leading zeros and simultaneously decrementing the exponent by one for every position that the fraction part is shifted to the left, wherein it the input is a normal floating point number the normalization is done after counting no leading zero of the mantissa, execution of a floating point instruction, wherein said normalized floating-point number is utilized as input for the floating point instruction, and storing of a floating-point result. Furthermore a processor to be used to perform said method is described.
摘要:
A floating point processor unit executes a floating point compare instruction with two operands of the same or different precision by comparing the two operands in integer format, which speeds up the execution of the floating point compare instruction significantly. The floating point processor now executes the floating point compare instruction at least twice as fast or faster (e.g., two clock cycles instead of five clock cycles in the prior art) for nearly most operand cases (e.g., 99% of all cases). Only the rare corner cases require additional operations on one of the operands and thus require additional cycles of execution time because the integer compare operation will not work for these corner cases. This is due to the fact that one operand is a single precision subnormal number in an unnormalized representation (i.e., has two representations) and the other operand is in the SP subnormal range such that the integer compare operation will fail.
摘要:
A semiconductor chip subdivided into power domains, at least one of the power domains is separately activated or deactivated and at least a part of the scannable storage elements are interconnected to one or more scan chains. At least one scan chain is serially subdivided into scan chain portions and the scan chain portion is arranged within one of the power domains. For at least one scan chain portion a bypass line is provided for passing by scan data and at least one select unit is provided for selecting between the bypass line and the corresponding scan chain portion in dependence of the activated or deactivated state of the corresponding power domains.
摘要:
A design structure to reduce power consumption within a clock gated synchronous circuit, said synchronous circuit comprising at least two successive stages, wherein each stage if activated propagates a data signal cycle by cycle to a succeeding stage the two successive stages comprising at least a control register, a data register and a local clock buffer (LCB) each, wherein each stage if activated propagates a data signal stored within the data register cycle by cycle to a data register of a succeeding stage.