Abstract:
A data processing apparatus and method of operating such a data processing apparatus are provided, for responding to a division instruction to perform a division operation to generate a result value by dividing an input numerator specified by the division instruction by an input denominator specified by the division instruction. The input numerator and input denominator are binary values. The apparatus comprises division circuitry configured to generate the result value by carrying out the division operation, power-of-two detection circuitry configured to signal a bypass condition if the input denominator has a value given by ±2N, where N is a positive integer, and bypass circuitry configured, in response to signalling of the bypass condition, to cause the division circuitry to be bypassed and to cause the result value to be generated as the input numerator shifted by N bits.
Abstract:
A data processing apparatus and method are provided for multiplying first and second normalized floating point operands in order to generate a result, each normalized floating point operand comprising a significand and an exponent. Exponent determination circuitry is used to compute a result exponent for a normalized version of the result, and rounding value generation circuitry then generates a rounding value by shifting a rounding constant in a first direction by a shift amount that is dependent on the result exponent. Partial product generation circuitry multiplies the significands of the first and second normalized floating point operands to generate the first and second partial products, and the first and second partial products are then added together, along with the rounding value, in order to generate a normalized result significand. Thereafter, the normalized result significand is shifted in a second direction opposite to the first direction, by the shift amount, in order to generate a rounded result significand. This provides a particularly efficient mechanism for multiplying floating point numbers, while correctly rounding the result in situations where the result is subnormal.
Abstract:
A data processing apparatus has permutation circuitry for performing a permutation operation for changing a data element size or data element positioning of at least one source operand to generate first and second SIMD operands, and SIMD processing circuitry for performing a SIMD operation on the first and second SIMD operands. In response to a first SIMD instruction requiring a permutation operation, the instruction decoder controls the permutation circuitry to perform the permutation operation to generate the first and second SIMD operands and then controls the SIMD processing circuitry to perform the SIMD operation using these operands. In response to a second SIMD instruction not requiring a permutation operation, the instruction decoder controls the SIMD processing circuitry to perform the SIMD operation using the first and second SIMD operands identified by the instruction, without passing them via the permutation circuitry.
Abstract:
An apparatus includes a processing circuit and a storage device. The processing circuit is configured to perform one or more processing operations in response to one or more instructions to generate an anchored-data element. The storage device is configured to store the anchored-data element. A format of the anchored-data element includes an identification item, an overlap item, and a data item. The data item is configured to hold a data value of the anchored-data element. The identification item indicates an anchor value for the data value or one or more special values.
Abstract:
An apparatus and method are provided for performing an index operation. The apparatus has vector processing circuitry to perform an index operation in each of a plurality of lanes of parallel processing. The index operation requires an index value opm to be multiplied by a multiplier value e to produce a multiplication result. The number of lanes of parallel processing is dependent on a specified element size, and the multiplier value is different, but known, for each lane of parallel processing. The vector processing circuitry comprises mapping circuitry to perform, within each lane, mapping operations on the index value opm in order to generate a plurality of intermediate input values. The plurality of intermediate input values are such that the addition of the plurality of intermediate input values produces the multiplication result. Within each lane the mapping operations are determined by the multiplier value used for that lane. The vector processing circuitry also has vector adder circuitry to perform, within each lane, an addition of at least the plurality of intermediate input values, in order to produce a result vector providing a result value for the index operation performed in each lane. This provides a high performance, low latency, technique for vectorising index operations.
Abstract:
An apparatus is provided, that includes an instruction decoder responsive to an anchored-data processing instruction, to generate one or more control signals. Conversion circuitry is responsive to the one or more control signals to perform a conversion from a data value to an anchored-data select value. The conversion is based on anchor metadata indicative of a given range of significance for the anchored-data select value. Output circuitry is responsive to the one or more control signals, to write the anchored-data select value to a register.
Abstract:
Apparatus and a corresponding method are disclosed relating to circuitry to perform an arithmetic operation on one or more input operands, where the circuitry is responsive to an equivalence of a result value of the arithmetic operation with at least one of the one or more input operands, when the one or more input operands are not an identity element for the arithmetic operation, to generate a signal indicative of the equivalence. Idempotency (between at least one input operand and the result value) is thus identified.
Abstract:
An apparatus and method for floating-point multiplication are provided. Two partial products are generated from two operand significands. An unbiased result exponent is determined from operand exponent values and leading zero counts, and a shift amount and direction for a product significand as needed for a predetermined minimum exponent value of a predetermined canonical format. First and second rounding values for injection into addition of the partial products are generated by shifting a predetermined rounding pattern by the shift amount in an opposite shift direction for the first rounding value and left shifting by one bit the first rounding value to give the second. The first and second partial products are added together with the first rounding value to give a first product significand, and are added together with the second rounding value to give a second product significand. These product significands are shifted by the shift amount in the shift direction and one is then selected in order to generate a formatted significand in the predetermined canonical format. The early injection rounding provides a faster floating-point multiplier.
Abstract:
An apparatus comprises processing circuitry to perform a conversion operation to convert a floating-point value to a vector comprising a plurality of data elements representing respective bit significance portions of a binary value corresponding to the floating-point value.
Abstract:
A data processing system supports vector operands with components representing different bit significance portions of an integer number. Processing circuitry performs a processing operation specified by a program instruction in dependence upon a number of components comprising the vector as specified by metadata for the vector.