Vector generating instruction for generating a vector comprising a sequence of elements that wraps as required

    Publication No.: US11714641B2

    Publication date: 2023-08-01

    Application No.: US16471185

    Filing date: 2017-11-08

    Applicant: ARM LIMITED

    Abstract: An apparatus and method are provided for performing vector processing operations. In particular the apparatus has processing circuitry to perform the vector processing operations and an instruction decoder to decode vector instructions to control the processing circuitry to perform the vector processing operations specified by the vector instructions. The instruction decoder is responsive to a vector generating instruction identifying a scalar start value and wrapping control information, to control the processing circuitry to generate a vector comprising a plurality of elements. In particular, the processing circuitry is arranged to generate the vector such that the first element in the plurality is dependent on the scalar start value, and the values of the plurality of elements follow a regularly progressing sequence that is constrained to wrap as required to ensure that each value is within bounds determined from the wrapping control information. The vector generating instruction can be useful in a variety of situations, a particular use case being to implement a circular addressing mode within memory, where the vector generating instruction can be coupled with an associated vector memory access instruction. Such an approach can remove the need to provide additional logic within the memory access path to support such circular addressing.
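
    The wrapping sequence described in this abstract can be illustrated with a minimal software sketch. It is not the patented hardware or any real instruction: the function name, the parameters, and the choice of a half-open range [lower_bound, upper_bound) as the "bounds determined from the wrapping control information" are all assumptions made for illustration.

```python
def generate_wrapping_vector(start, step, num_elements, upper_bound, lower_bound=0):
    """Return num_elements values that begin at start, advance by step,
    and wrap so every value stays within [lower_bound, upper_bound).

    All names here are hypothetical; the bounds stand in for the
    abstract's "wrapping control information".
    """
    if upper_bound <= lower_bound:
        raise ValueError("upper_bound must be greater than lower_bound")
    span = upper_bound - lower_bound
    values = []
    current = start
    for _ in range(num_elements):
        # Python's % yields a non-negative result for a positive modulus,
        # so this wraps both incrementing and decrementing sequences.
        current = lower_bound + (current - lower_bound) % span
        values.append(current)
        current += step
    return values


# Offsets into an 8-entry circular buffer, starting near the end.
offsets = generate_wrapping_vector(start=6, step=1, num_elements=4, upper_bound=8)

# A gather-style access using the generated offsets (plain list indexing here).
buffer = [100, 101, 102, 103, 104, 105, 106, 107]
gathered = [buffer[i] for i in offsets]
```

    Here `offsets` is `[6, 7, 0, 1]` and `gathered` is `[106, 107, 100, 101]`. Because the wrapping happens when the offsets are generated, the memory access itself needs no circular-addressing logic, which is the benefit the abstract points to.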

    Overlapped-immediate/register-field-specifying instruction

    Publication No.: US11099848B1

    Publication date: 2021-08-24

    Application No.: US16776730

    Filing date: 2020-01-30

    Applicant: Arm Limited

    Inventor: Neil Burgess

    Abstract: An apparatus comprises: processing circuitry, an instruction decoder, and registers. In response to an overlapped-immediate/register-field-specifying (OIRFS) instruction comprising an opcode field specifying an OIRFS-indicating opcode value, and an overlapped immediate/register field specifying an immediate value and a register specifier, the instruction decoder controls the processing circuitry to use a selected register of the plurality of registers corresponding to the register specifier as a source register or destination register when performing a processing operation depending on the immediate value. The overlapped immediate/register field includes at least one shared bit decoded as part of the immediate value for at least one encoding of the OIRFS instruction and decoded as part of the register specifier for at least one encoding of the OIRFS instruction.
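
    The idea of a shared bit that belongs to the immediate in one encoding and to the register specifier in another can be sketched as follows. The 6-bit field width, the selector bit, and both layouts are invented for illustration; the abstract does not give the actual encoding.

```python
def decode_overlapped_field(field):
    """Decode a hypothetical 6-bit overlapped immediate/register field.

    Assumed layout (not taken from the patent):
      bit 5 = 0: register specifier in bits 4..1, immediate in bit 0
      bit 5 = 1: register specifier in bits 4..2, immediate in bits 1..0
    Bit 1 is therefore the shared bit: part of the register specifier
    in the first layout and part of the immediate in the second.
    """
    if not 0 <= field < 64:
        raise ValueError("field must fit in 6 bits")
    selector = (field >> 5) & 1
    if selector == 0:
        register = (field >> 1) & 0xF
        immediate = field & 0x1
    else:
        register = (field >> 2) & 0x7
        immediate = field & 0x3
    return register, immediate


# The same low five bits decode differently depending on the selector.
wide_register = decode_overlapped_field(0b010110)
wide_immediate = decode_overlapped_field(0b110110)
```

    With these assumed layouts, `wide_register` is `(11, 0)` and `wide_immediate` is `(5, 2)`. The trade-off shown is that an encoding can address more registers with a narrower immediate, or fewer registers with a wider one, without enlarging the field.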

    Overflow or underflow handling for anchored-data value

    Publication No.: US10936285B2

    Publication date: 2021-03-02

    Application No.: US16268692

    Filing date: 2019-02-06

    Applicant: Arm Limited

    Abstract: Processing circuitry may support processing of anchored-data values comprising one or more anchored-data elements which represent portions of bits of a two's complement number. The anchored-data processing may depend on anchor information indicating at least one property indicative of a numeric range representable by the result anchored-data element or the anchored-data value. When the operation causes an overflow or an underflow, usage information may be stored indicating a cause of the overflow or underflow and/or an indication of how to update the anchor information and/or number of elements in the anchored-data value to prevent the overflow or underflow. This can support dynamic range adjustment in software algorithms which involve anchored-data processing.

    Apparatus and method for processing input operand values

    Publication No.: US10579338B2

    Publication date: 2020-03-03

    Application No.: US15833372

    Filing date: 2017-12-06

    Applicant: ARM Limited

    Abstract: An apparatus and method are provided for processing input operand values. The apparatus has a set of vector data storage elements, each vector data storage element providing a plurality of sections for storing data values. A plurality of lanes are considered to be provided within the set of storage elements, where each lane comprises a corresponding section from each vector data storage element. Processing circuitry is arranged to perform an arithmetic operation on an input operand value comprising a plurality of portions, by performing an independent arithmetic operation on each of the plurality of portions, in order to produce a result value comprising a plurality of result portions. Storage circuitry is arranged to store the result value within a selected lane of the plurality of lanes, such that each result portion is stored in a different vector data storage element within the corresponding section for the selected lane. Such an approach allows efficient processing of input operand values in a manner that is not constrained by the size of the vector data storage elements, and in particular in a way that is vector length agnostic.
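
    The lane layout in this abstract can be modelled in a few lines. This is a sketch under stated assumptions, not the patented design: registers are plain lists, the portion width is assumed to be 64 bits, and addition with no carry between portions is picked arbitrarily as the "independent arithmetic operation".

```python
PORTION_BITS = 64                      # assumed width of one section
PORTION_MASK = (1 << PORTION_BITS) - 1


def add_portions_independently(a_portions, b_portions):
    """Add two multi-portion operands portion by portion.

    Each portion is handled on its own; no carry passes between
    portions (an assumption chosen to mirror "independent" operations).
    """
    return [(a + b) & PORTION_MASK for a, b in zip(a_portions, b_portions)]


def store_result_in_lane(registers, lane, result_portions):
    """Write result portion i into section `lane` of register i."""
    if len(result_portions) > len(registers):
        raise ValueError("not enough vector registers for the result")
    for register, portion in zip(registers, result_portions):
        register[lane] = portion


# Three vector registers, each with four sections (so four lanes).
registers = [[0] * 4 for _ in range(3)]
result = add_portions_independently([1, 2, 3], [10, 20, 30])
store_result_in_lane(registers, lane=2, result_portions=result)
```

    After this runs, lane 2 holds 11, 22 and 33 in registers 0, 1 and 2 respectively. The result's size depends on how many registers are used, not on how many sections each register has, which is one way to read the abstract's "vector length agnostic" remark.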

    Multiply-and-accumulate-products instructions

    Publication No.: US10409592B2

    Publication date: 2019-09-10

    Application No.: US15494946

    Filing date: 2017-04-24

    Applicant: ARM LIMITED

    Abstract: An apparatus has processing circuitry comprising an L×M multiplier array. An instruction decoder associated with the processing circuitry supports a multiply-and-accumulate-product (MAP) instruction for generating at least one result element corresponding to a sum of respective E×F products of E-bit and F-bit portions of J-bit and K-bit operands respectively, where 1

    Bit processing
    Granted invention patent

    Publication No.: US10366741B2

    Publication date: 2019-07-30

    Application No.: US15711116

    Filing date: 2017-09-21

    Applicant: ARM LIMITED

    Abstract: Circuitry comprises: a set of bit processing circuitries to apply two or more successive instances of bitwise processing to an ordered bit array; each bit processing circuitry for a given bit position within the ordered bit array comprising: bit shifting circuitry to selectively apply a bit shift of a respective input bit to a next bit processing circuitry in a first direction relative to the ordered bit array, in response to an active state of a bit shift control signal, the bit shifting circuitry not applying the bit shift in response to an inactive state of the bit shift control signal; and bit shift control circuitry to selectively allow or inhibit a bit shifting operation in response to one or more inhibit control signals; in which: the bit shift control circuitry is configured to selectively propagate an output inhibit control signal, indicating that a bit shifting operation should be inhibited, as an inhibit control signal to bit processing circuitry applying a next instance of the bitwise processing at the given bit position, in dependence upon the bit shift control signal and the one or more inhibit control signals.

    Apparatus and method for supporting a conversion instruction

    Publication No.: US10310809B2

    Publication date: 2019-06-04

    Application No.: US15093947

    Filing date: 2016-04-08

    Applicant: ARM LIMITED

    Abstract: A data processing system includes instruction decoder circuitry responsive to a conversion instruction FCVTJS to convert a double precision floating point number into a 32-bit integer number. Right shifting circuitry performs a right shift upon at least part of the input number and left shifting circuitry performs a left shift of at least part of the input number. Selection circuitry serves to select one of the right shifted number and the left shifted number as a selected shifted number which forms at least part of the output number which is generated.
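
    The left-shift/right-shift selection can be sketched in software by decoding the double's bit pattern by hand. The abstract does not spell out the instruction's rounding or out-of-range behaviour, so the semantics below are assumptions: truncation toward zero, wrap-around modulo 2^32, and zero for NaN and infinities. The function name is hypothetical.

```python
import struct


def double_to_int32_wrapping(x):
    """Convert a double to a signed 32-bit integer by shifting its significand.

    Assumed semantics: truncate toward zero, keep the low 32 bits,
    and return 0 for NaN and infinities.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)

    if exponent == 0x7FF:   # NaN or infinity
        return 0
    if exponent == 0:       # zero or subnormal: magnitude is below 1
        return 0

    significand = fraction | (1 << 52)   # restore the implicit leading 1
    shift = exponent - 1023 - 52         # unbiased exponent minus fraction width

    if shift < 0:
        magnitude = significand >> -shift   # right shift drops fractional bits
    else:
        magnitude = significand << shift    # left shift for large values

    low32 = magnitude & 0xFFFFFFFF
    if sign:
        low32 = (-low32) & 0xFFFFFFFF       # two's complement negation
    # Reinterpret the 32-bit pattern as a signed value.
    return low32 - (1 << 32) if low32 & 0x80000000 else low32
```

    Under these assumptions, `double_to_int32_wrapping(3.7)` gives `3`, `double_to_int32_wrapping(-3.7)` gives `-3`, and `double_to_int32_wrapping(4294967301.0)` (2^32 + 5) gives `5`. The sign of `shift` plays the role of the abstract's selection between the right-shifted and left-shifted number.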

    Multiply adder
    Granted invention patent

    Publication No.: US09696964B2

    Publication date: 2017-07-04

    Application No.: US14566981

    Filing date: 2014-12-11

    Applicant: ARM LIMITED

    CPC classification number: G06F7/5443 G06F7/483

    Abstract: A floating point multiply add circuit 24 includes a multiplier 26 and an adder 28. The input operands A, B and C together with the result value all have a normal exponent value range, such as a range consistent with the IEEE Standard 754. The product value which is passed from the multiplier 26 to the adder 28 has an extended exponent value range that extends lower than the normal exponent value range. Shifters 48, 50 within the adder can take account of the extended exponent value range of the product as necessary in order to bring the result value back into the normal exponent value range.
