Abstract:
Determining a table output of a table representing a hierarchical tree for an integer valued function includes determining an address from a table input. A subset of a memory is selected according to the address, where the memory represents the hierarchical tree and the subset represents a subtree of the hierarchical tree. Bit fields are selected from the subset, and bits are extracted from the bit fields. A table output is determined from the extracted bits.
Abstract:
An apparatus and method are disclosed for minimizing accumulated rounding errors in coefficient values in a lookup table for interpolating polynomials. Unlike prior art methods that individually round each polynomial coefficient of a function, the method of the present invention uses a “ripple carry” rounding method to round each coefficient using information from the previously rounded coefficient. The “ripple carry” method generates rounded coefficients that significantly reduce the total rounding error for the function.
Abstract:
The division and square root systems include a multiplier. The systems also include a multipartite table system, a folding inverter, and a complement inverter, each coupled to the multiplier. The division and square root functions can be performed using three scaling iterations. The system first determines both a first and a second scaling value. The first scaling value is a semi-complement term computed using the folding inverter to invert selected bits of the input. The second scaling value is a table lookup value obtained from the multipartite table system. In the first iteration, the system scales the input by the semi-complement term. In the second iteration, the resulting approximation is scaled by a function of the table lookup value. In the third iteration, the approximation is scaled by a value obtained from a function of the semi-complement term and the table lookup value. After the third iteration, the approximation is available for rounding.
Abstract:
A method and apparatus for performing the square root function which first comprises approximating the short reciprocal of the square root of the operand. A reciprocal bias adjustment factor is added to the approximation and the result truncated to form a correctly biased short reciprocal. The short reciprocal is then multiplied by a predetermined number of the most significant bits of the operand and the product is appropriately truncated to generate a first root digit value. The multiplication takes place in a multiplier array having a rectangular aspect ratio with the long side having a number of bits essentially as large as the number of bits required for the desired full precision root. The short side of the multiplier array has a number of bits slightly greater by several guard bits than the number of bits required for a single root digit value, which is also determined to be the number of bits in the short reciprocal. The root digit value is squared and the exact square is subtracted from the operand to yield an exact remainder. Succeeding new root digit values are determined by multiplying the short reciprocal by the appropriately shifted current remainder, selectively adding a digit bias adjustment factor and truncating the product. The root digit values are appropriately shifted and accumulated to form a partial root. The described steps are repeated to serially generate root digit values and partial roots with corresponding new exact remainders.
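The digit-serial flow described above (short reciprocal, first root digit, exact remainder, further digits from the remainder) can be sketched in software. This is a minimal illustration under stated assumptions, not the patented circuit: the function name `short_reciprocal_sqrt`, the parameters `k`, `guard`, and `digits`, and the use of exact `Fraction` arithmetic are all hypothetical choices, and the bias adjustment factors mentioned in the abstract are omitted.

```python
from fractions import Fraction
import math

def short_reciprocal_sqrt(x, k=4, guard=3, digits=6):
    """Approximate sqrt(x) for x in [1, 4), adding about k bits per root digit."""
    bits = k + guard
    # Short reciprocal of sqrt(x), truncated to k + guard fractional bits
    # (assumption: seeded from math.sqrt instead of a lookup table).
    r = Fraction(math.floor(2**bits / math.sqrt(x)), 2**bits)
    # First root digit: short reciprocal times the operand, truncated to k bits.
    weight = 2**k
    root = Fraction(math.floor(x * r * weight), weight)
    for _ in range(digits - 1):
        rem = x - root * root      # exact remainder of the partial root
        weight *= 2**k             # next digit sits k bits further right
        # sqrt(x) - root is roughly rem / (2 * sqrt(x)), i.e. rem * r / 2.
        digit = Fraction(math.floor(rem * r / 2 * weight), weight)
        root += digit              # accumulate into the partial root
    return root
```

Because the short reciprocal is truncated (never larger than the true value), each partial root in this sketch stays at or below the true square root, so the remainder never goes negative.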
Abstract:
Methods are disclosed for determining the square root, reciprocal square root, or reciprocal of a number, performed by a processor of a computer system. The methods produce high precision estimates without using iterative steps. In addition, the methods taught herein utilize compressed tables for the coefficient terms A, B, and C from the quadratic expression Ax^2 + Bx + C, thus minimizing hardware requirements.
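As a rough illustration of a non-iterative, table-driven quadratic estimate, the sketch below approximates the reciprocal on [1, 2). It is an assumption-laden example rather than the method of the abstract: the names `build_tables` and `reciprocal` are hypothetical, the coefficients come from a plain Taylor expansion at each interval midpoint, and the tables are stored uncompressed.

```python
def build_tables(index_bits=6):
    """Coefficients (A, B, C) of A*d**2 + B*d + C for 1/x on each interval of [1, 2)."""
    n = 2**index_bits
    tables = []
    for i in range(n):
        m = 1.0 + (i + 0.5) / n                 # interval midpoint
        # Taylor series of 1/x about m: 1/m - d/m**2 + d**2/m**3
        tables.append((1.0 / m**3, -1.0 / m**2, 1.0 / m))
    return tables

def reciprocal(x, tables):
    """Estimate 1/x for x in [1, 2) with one table lookup and no iteration."""
    n = len(tables)
    i = int((x - 1.0) * n)                      # leading fraction bits pick the entry
    d = x - (1.0 + (i + 0.5) / n)               # offset from the interval midpoint
    a, b, c = tables[i]
    return (a * d + b) * d + c                  # Horner form of A*d**2 + B*d + C
```

With 64 intervals the truncated cubic term bounds the absolute error at roughly (1/128)^3, which is why a single evaluation can stand in for an iterative refinement at moderate precision.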
Abstract:
An early no-overflow signaling system and method is used in conjunction with performing nonrestoring division using two's complement 2n bit dividends N and two's complement n bit divisors D. When a no-overflow condition is signaled, a subsequent plurality of iterative partial remainder computations are performed to obtain the quotient Q and remainder R with no possibility of overflow. Dividends N are characterized by a 2-bit sign field N(s1s2) formed by a first sign bit N(s1) and a second sign bit N(s2), high order n-1 dividend magnitude bits N(himag), and low order n-1 dividend magnitude bits N(lomag), such that N(s1) and N(himag) form a 2's complement number N(hi), while divisors D are characterized by a leading sign bit D(s) and n-1 divisor magnitude bits D(mag). Early no-overflow signaling logic uses the input dividend N and divisor D, and a 2n-1 bit first partial remainder (which has a value of [N - 2^(n-1) D]) obtained by computing an n-bit first partial remainder PR1 corresponding to the first n bits of the first partial remainder of value [N - 2^(n-1) D] (including a leading sign bit PR1(s)), such that the first partial remainder of value [N - 2^(n-1) D] corresponds to PR1 and N(lomag). No-overflow signaling (illustrated in FIGS. 2a/2b and 4) uses (i) the divisor sign and magnitude D(s) and D(mag), (ii) the two bit sign field of the dividend N(s1s2), and (iii) the first partial remainder of value [N - 2^(n-1) D]. A no-overflow condition is signaled if (i) the divisor magnitude D(mag) is not equal to zero (FIG. 2a, 102, and FIG. 4, 151), and (ii) the dividend sign bits N(s1) and N(s2) are equal (FIG. 2a, 112, and FIG. 4, 152), and (iii) the sign of the first partial remainder PR1(s) is not equal to the dividend sign bit N(s2) (FIG. 2b, 131, and FIG. 4, 153), and (iv) the divisor and dividend are not both negative (FIG. 2b, 141, and FIG. 4, 154, 156), or if they are, (v) the first partial remainder corresponding to PR1 and N(lomag) is not equal to zero (FIG. 2b, 141, 142, 143, and FIG. 2b, 155, 156).
Abstract:
An arithmetic circuit 10 for performing prescaled division uses a rectangular multiplier 16 and accumulator 30 operable to calculate a short reciprocal and scaled dividend and divisor to enable the sequential iterative calculation of large radix quotient digits. Each quotient digit can be calculated using a single pass through the rectangular multiplier 16 and accumulator 30 and can be accumulated to form a full precision quotient in a quotient register 36.
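The idea of prescaling by a short reciprocal so that large radix quotient digits fall out one per pass can be sketched as follows. This is a hedged software analogue, not the circuit of the abstract: `prescaled_divide` and its parameters `k`, `guard`, and `digits` are hypothetical names, operands are assumed to lie in [1, 2), and exact `Fraction` arithmetic stands in for the rectangular multiplier and accumulator.

```python
from fractions import Fraction
import math

def prescaled_divide(n, d, k=4, guard=3, digits=8):
    """Approximate n / d for n, d in [1, 2), producing one radix-2**k digit per pass."""
    bits = k + guard
    # Short reciprocal: 1/d truncated to k + guard fractional bits.
    r = Fraction(math.floor(2**bits / d), 2**bits)
    scaled_d = d * r      # scaled divisor, close to 1
    rem = n * r           # scaled dividend becomes the first remainder
    q = 0
    for _ in range(digits):
        # With the divisor near 1, the integer part of the remainder is the digit.
        digit = math.floor(rem)
        # Integer accumulation absorbs a digit that slightly exceeds 2**k - 1.
        q = q * 2**k + digit
        rem = (rem - digit * scaled_d) * 2**k
    # The first digit is the integer part; the rest are fractional digits.
    return Fraction(q, 2**(k * (digits - 1)))
```

Scaling both operands by the same short reciprocal leaves the quotient unchanged, which is what lets each pass select a digit by truncation instead of a trial comparison against the divisor.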