Abstract:
The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.
Abstract:
A floating point adder includes leading zero anticipation circuitry to determine a number of leading zeros within a result significand value of a sum of a first floating point operand and a second floating point operand. This number of leading zeros is used to generate a mask which in turn selects input bits from a non-normalized significand produced by adding the first significand value and the second significand value. The non-normalized significand is then normalized at the same time as the output rounding bits used to round the normalized significand value are generated by rounding bit generation circuitry.
Abstract:
An apparatus comprises processing circuitry to perform a conversion operation to convert a vector comprising a plurality of data elements representing respective bit significance portions of a binary value to a scalar value comprising an alternative representation of said binary value.
Abstract:
An arithmetic processing device includes: a first memory configured to store values of a first coefficient of a logarithmic function, where the logarithmic function is decomposed into a series operation term and the coefficient term, depending on respective values of a first bit group included in operand data of a first instruction to calculate the value of the first coefficient; a second memory configured to store values of a second coefficient included in the series operation term depending on the respective values of the first bit group included in operand data of a second instruction to calculate the value of the second coefficient; and a selector configured to select the value of the first coefficient read from the first memory based on the execution of the first instruction and select the value of the second coefficient read from the second memory based on the execution of the second instruction.
Abstract:
An apparatus comprises processing circuitry to perform a conversion operation to convert a vector comprising a plurality of data elements representing respective bit significance portions of a binary value to a scalar value comprising an alternative representation of said binary value.
Abstract:
A data processing system uses alignment circuitry to align input operands in accordance with a programmable significance parameter to form aligned input operands. The aligned input operands are supplied to arithmetic circuitry, such as an integer adder or an integer multiplier, where a result value is formed. The result value is stored in an output operand storage element, such as a result register. The programmable significance parameter is independent of the result value.
Abstract:
An apparatus may have processing circuitry to perform one or more arithmetic operations for generating a result value based on at least one operand. For at least one arithmetic operation, the processing circuitry is responsive to programmable significance data indicative of a target significance for the result value, to generate the result value having the target significance. For example, this allows programmers to set a significance boundary for the arithmetic operation so that it is not necessary for the processing circuitry to calculate bit values having a significance outside the specified boundary, enabling a performance improvement.
Abstract:
An integrated circuit is provided that performs floating-point addition or subtraction operations involving at least three floating-point numbers. The floating-point numbers are pre-processed by dynamically extending the number of mantissa bits, determining the floating-point number with the biggest exponent, and shifting the mantissa of the other floating-point numbers to the right. Each extended mantissa has at least twice the number of bits of the mantissa entering the floating-point operation. The exact bit extension is dependent on the number of floating-point numbers to be added. The mantissas of all floating-point numbers with an exponent smaller than the biggest exponent are shifted to the right. The number of right shift bits is dependent on the difference between the biggest exponent and the respective floating-point exponent.
Abstract:
A processor includes: an exponent generating unit that generates an exponent part of a coefficient represented by a floating point number format based on a first part of received input data, the coefficient being obtained when an exponential function is decomposed into a series operation and the coefficient for the series operation; a storage unit that stores a mantissa part of the coefficient; a constant generating unit that reads constant data corresponding to a second part of the input data from the storage unit; and a selecting unit that selects and outputs the constant data from the constant generating unit when an instruction to be executed is a coefficient calculation instruction for calculation of the coefficient of the exponential function.
Abstract:
A logic circuit computes various modal interval arithmetic values using a plurality of arithmetic function units. A multiplexer gates the desired arithmetic values to a storage register.