Abstract:
A programmable latency (a programmable number of clock cycles) is provided for the completion of an operation. The required latency for a pipe is determined from a formula that includes the system clock cycle time under which the unit is specified to operate. The latency is preprogrammed by setting the count of a timer to provide at least the minimum number of clock cycles needed to cover the time required for the computation. Separate timers are set independently for arithmetic logic unit (ALU) operations, multiply operations, logical operations, and divide and square root operations.
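The abstract does not give the formula itself. As a minimal sketch, assuming the formula amounts to rounding the computation time up to a whole number of system clock cycles, the per-unit timer counts could be derived as follows. The function names and the delay figures are hypothetical and chosen only for illustration.

```python
# Hypothetical worst-case computation times per unit, in picoseconds.
# These values are illustrative assumptions, not figures from the source.
HYPOTHETICAL_DELAYS_PS = {
    "alu": 45_000,
    "multiply": 70_000,
    "logical": 20_000,
    "divide_sqrt": 300_000,
}


def latency_cycles(compute_time_ps: int, clock_period_ps: int) -> int:
    """Smallest whole number of clock cycles that covers the computation time."""
    if compute_time_ps <= 0 or clock_period_ps <= 0:
        raise ValueError("times must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-compute_time_ps // clock_period_ps)


def program_timers(clock_period_ps: int) -> dict:
    """Return an independent timer count for each unit at the given clock period."""
    return {
        unit: latency_cycles(delay, clock_period_ps)
        for unit, delay in HYPOTHETICAL_DELAYS_PS.items()
    }


if __name__ == "__main__":
    for period in (25_000, 40_000):
        print(period, program_timers(period))
```

With a longer clock period each unit needs the same number of cycles or fewer, which is why the counts are programmed per unit for the clock the part is specified to run at.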
Abstract:
The present invention optimizes the number and ratio of cycles required among the divide/square root unit, multiplier unit and ALU. An intermediate latch with its own clock is provided at the output of the multiplier half-array in the intermediate stage to feed back data for a second pass for double-precision numbers. The multiplier can then be adjusted for either two-cycle latency mode (for optimizing double-precision multiplies) or three-cycle latency mode (for optimizing single-precision multiplies). A separate divide clock is used for the divide/square root unit, and is synchronized with the multiplier cycle clock on input and output. This allows the divide time to be optimized so that it requires fewer clock cycles when a longer multiplier clock cycle time is used.
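The second pass through the multiplier half-array can be pictured with a small arithmetic model. This is only a sketch of one plausible reading of the abstract: it assumes that each pass handles half of the multiplier bits and that the intermediate latch holds the partial result between passes. The function names and the 27-bit half width are assumptions, not details from the source.

```python
def half_array_pass(multiplicand: int, multiplier_bits: int, width: int, partial: int = 0) -> int:
    """Model one pass through the half-array: add the partial products for
    `width` multiplier bits onto an incoming partial result."""
    acc = partial
    for i in range(width):
        if (multiplier_bits >> i) & 1:
            acc += multiplicand << i
    return acc


def two_pass_multiply(a: int, b: int, half_width: int = 27) -> int:
    """Multiply two non-negative fractions in two passes through the same half-array."""
    low = b & ((1 << half_width) - 1)
    high = b >> half_width
    if a < 0 or b < 0 or high >> half_width:
        raise ValueError("operands must be non-negative and b must fit in two passes")
    # First pass: low half of the multiplier; the result is held in the latch.
    latch = half_array_pass(a, low, half_width)
    # Second pass: high half of the multiplier, weighted by 2**half_width,
    # accumulated onto the value fed back from the latch.
    return half_array_pass(a << half_width, high, half_width, partial=latch)


if __name__ == "__main__":
    x = (1 << 52) + 12345  # 53-bit fractions, as in double precision
    y = (1 << 52) + 67890
    print(two_pass_multiply(x, y) == x * y)
```

A single-precision fraction (24 bits) fits within one pass in this model, so the feedback path matters only for double-precision operands.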
Abstract:
A method and apparatus for combining the multiply and ALU functions for floating point numbers so that a multiply-accumulate operation completes in a shorter time. The multiplied fraction is left in sum and carry form and is provided in this form to the ALU, eliminating the carry-propagate (CP) adder from the multiplier. The normalization of the fraction and the corresponding changes to the exponent in the multiplier are also eliminated. The ALU can combine the sum and carry of the product fraction with the new fraction simultaneously if the exponents are sufficiently similar. Otherwise, the sum and carry of the fraction product are combined first and the result is compared with the new fraction, with the smaller of the two fractions being right shifted prior to their combination.
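The case where the exponents are sufficiently similar can be illustrated with a small integer model. This is a sketch under stated assumptions: fractions are modeled as non-negative integers that are already aligned, exponent handling and shifting are omitted, and the function names are hypothetical. It shows the general carry-save technique, not necessarily the circuit in the patent.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple:
    """3:2 compression: reduce three operands to a sum word and a carry word
    without propagating carries. Invariant: s + c == x + y + z."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def product_sum_carry(a: int, b: int) -> tuple:
    """Reduce the partial products of a * b to sum and carry form.
    No carry-propagate addition is performed here."""
    s, c = 0, 0
    i = 0
    while b >> i:
        if (b >> i) & 1:
            s, c = carry_save_add(s, c, a << i)
        i += 1
    return s, c


def multiply_accumulate(a: int, b: int, addend: int) -> int:
    """Fold the addend into the product while it is still in sum and carry
    form, then perform a single carry-propagate add at the end."""
    s, c = product_sum_carry(a, b)
    s, c = carry_save_add(s, c, addend)
    return s + c  # the only carry-propagate addition in the whole operation


if __name__ == "__main__":
    print(multiply_accumulate(13, 11, 7))
```

In this model the multiplier never resolves its own carries, so the one carry-propagate addition is shared between the multiply and the accumulate.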