-
Publication No.: US11036503B2
Publication date: 2021-06-15
Application No.: US15236769
Filing date: 2016-08-15
Applicant: ARM LIMITED
Inventor: Gary Alan Gorman, Lee Evan Eisen, Neil Burgess, Daniel Arulraj
IPC: G06F9/30
Abstract: Processing circuitry selectively applies vector processing operations to one or more data items of one or more data vectors, each data vector comprising a plurality of data items at respective vector positions in the data vector, according to the state of respective predicate indicators associated with the vector positions. Predicate generation circuitry applies a processing operation to generate a set of predicate indicators, each associated with a respective one of the vector positions, generates a count value indicative of the number of predicate indicators in the set having a given state, and stores the generated set of predicate indicators and the count value in a predicate store.
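The predicate-plus-count idea in this abstract can be illustrated in software. The following is a minimal sketch, not the patented circuitry: the comparison used to generate predicates, the function names, and the tuple standing in for the predicate store are all assumptions made for illustration.

```python
def generate_predicates(data, threshold):
    """Generate one predicate per vector position plus a count of set predicates.

    The "greater than threshold" test is a hypothetical processing operation;
    the abstract does not say which operation produces the predicates.
    """
    predicates = [item > threshold for item in data]
    count = sum(predicates)  # number of predicates in the "true" state
    return predicates, count  # stands in for writing both to a predicate store


def predicated_add(vector, addend, predicates):
    """Apply an add only at positions whose predicate is set; others pass through."""
    return [item + addend if p else item for item, p in zip(vector, predicates)]


preds, count = generate_predicates([1, 5, 3, 7], threshold=2)
result = predicated_add([1, 5, 3, 7], 10, preds)
```

Keeping the count next to the predicates means a consumer that needs the number of active lanes can read it directly instead of recounting the set.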
-
Publication No.: US10409604B2
Publication date: 2019-09-10
Application No.: US15859931
Filing date: 2018-01-02
Applicant: Arm Limited
Inventor: Michael Alexander Kennedy, Neil Burgess
Abstract: An apparatus and method are provided for performing multiply-and-accumulate-products (MAP) operations. The apparatus has processing circuitry for performing data processing, the processing circuitry including an adder array having a plurality of adders for accumulating partial products produced from input operands. An instruction decoder is provided that is responsive to a MAP instruction specifying a first J-bit operand and a second K-bit operand, to control the processing circuitry to enable performance of a number of MAP operations, where the number is dependent on a parameter. For each performed MAP operation, the processing circuitry is arranged to generate a corresponding result element representing a sum of respective E×F products of E-bit portions within an X-bit segment of the first operand with F-bit portions within a Y-bit segment of the second operand, where E
-
Publication No.: US10216479B2
Publication date: 2019-02-26
Application No.: US15370660
Filing date: 2016-12-06
Applicant: ARM Limited
Abstract: An apparatus and method are provided for performing arithmetic operations to accumulate floating-point numbers. The apparatus comprises execution circuitry to perform arithmetic operations, and decoder circuitry to decode a sequence of instructions. A convert and accumulate instruction is provided, and the decoder circuitry is responsive to decoding the convert and accumulate instruction to generate one or more control signals to control the execution circuitry to convert at least one floating-point operand identified by the convert and accumulate instruction into a corresponding N-bit fixed-point operand having M fraction bits, where M is less than N and M is dependent on a format of the floating-point operand. The execution circuitry accumulates each corresponding N bit fixed-point operand and a P bit fixed-point operand identified by the convert and accumulate instruction in order to generate a P bit fixed-point result value, where P is greater than N and also has M fraction bits.
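The convert-and-accumulate behaviour described in this abstract can be sketched in software. This is an illustrative model only: the widths N = 32, M = 16 and P = 64, the truncating conversion, the overflow handling, and all function names are assumptions, since the abstract says only that M depends on the floating-point format.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def to_fixed(x, n=32, m=16):
    """Convert a float to an n-bit fixed-point integer with m fraction bits.

    Truncation toward zero and raising on overflow are assumptions; the
    abstract does not specify rounding or saturation behaviour.
    """
    scaled = int(x * (1 << m))
    if not -(1 << (n - 1)) <= scaled < (1 << (n - 1)):
        raise OverflowError("operand does not fit in the N-bit fixed-point format")
    return scaled


def convert_and_accumulate(acc, operands, n=32, m=16, p=64):
    """Add each converted operand into a p-bit accumulator with m fraction bits."""
    for x in operands:
        acc = wrap_signed(acc + to_fixed(x, n, m), p)
    return acc


total = convert_and_accumulate(0, [1.5, 2.25, -0.75])
as_float = total / (1 << 16)  # interpret the accumulator with M = 16 fraction bits
```

Because every operand is converted to the same fixed-point scale before it is added, the integer sum does not depend on the order of accumulation, which is not true of ordinary floating-point addition.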
-
Publication No.: US09785407B2
Publication date: 2017-10-10
Application No.: US14549639
Filing date: 2014-11-21
Applicant: ARM LIMITED
Inventor: Neil Burgess, David Raymond Lutz
CPC classification number: G06F7/535, G06F7/5375, G06F7/5525, G06F9/3001, G06F2207/5351, G06F2207/5528
Abstract: A processing apparatus has combined divide-square root circuitry for performing a radix-N SRT divide algorithm and a radix-N SRT square root algorithm, where N is an integer power-of-2. The combined circuitry has shared remainder updating circuitry which performs remainder updates for a greater number of iterations per cycle for the SRT divide algorithm than for the SRT square root algorithm. This allows reduced circuit area while avoiding the SRT square root algorithm compromising the performance of the SRT divide algorithm.
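For readers unfamiliar with SRT division, the remainder update that such circuitry iterates can be sketched as follows. This is a textbook radix-2 SRT recurrence with digit set {-1, 0, 1}, modelled with exact fractions. It is not the combined divide and square root design of the patent, and the function name and operand ranges are assumptions.

```python
from fractions import Fraction


def srt_divide_radix2(x, d, iterations):
    """Radix-2 SRT division sketch using exact rational arithmetic.

    Assumes 1/2 <= d < 1 and |x| <= d. Returns (q, w) such that
    x == q * d + w / 2**iterations, with |w| <= d.
    """
    half = Fraction(1, 2)
    w = Fraction(x)  # partial remainder
    q = Fraction(0)  # quotient accumulated from signed digits
    for j in range(1, iterations + 1):
        shifted = 2 * w
        # Digit selection needs only a coarse comparison of the shifted remainder.
        if shifted >= half:
            digit = 1
        elif shifted < -half:
            digit = -1
        else:
            digit = 0
        w = shifted - digit * d  # remainder update
        q += Fraction(digit, 2 ** j)
    return q, w


q, w = srt_divide_radix2(Fraction(3, 8), Fraction(3, 4), 10)
```

Each loop pass corresponds to one remainder update. The abstract's point is that shared hardware can perform more of these updates per cycle for division than for square root.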
-
Publication No.: US09733899B2
Publication date: 2017-08-15
Application No.: US14939371
Filing date: 2015-11-12
Applicant: ARM LIMITED
Inventor: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
CPC classification number: G06F7/483, G06F7/02, G06F7/499, G06F9/30014, G06F9/30036, G06F9/30192, G06F9/3887, G06F2207/483
Abstract: Processing circuitry performs a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry identifies lane position information for each lane of processing, the lane position information for a given lane identifying a relative position of the corresponding result data element to be generated by the given lane within a corresponding result data value spanning one or more result data elements of the result vector. The processing circuitry is configured to perform each lane of processing in dependence on the lane position information identified for that lane. This enables generation of results which are wider or narrower than the vector size supported in hardware.
-
Publication No.: US09710229B2
Publication date: 2017-07-18
Application No.: US14728085
Filing date: 2015-06-02
Applicant: ARM LIMITED
Inventor: Neil Burgess, David Raymond Lutz
CPC classification number: G06F7/5525, G06F7/483
Abstract: A data processing apparatus has processing circuitry for performing a floating-point square root operation on a radicand value R to generate a result value. The processing circuitry has first square root processing circuitry for processing radicand values R which are not an exact power of two and second square root processing circuitry for processing radicand values which are an exact power of two. Power-of-two detection circuitry detects whether the radicand value is an exact power of two and selects the output of the first or second square root processing circuitry as appropriate. This allows the result to be generated in fewer processing cycles when the radicand is a power of two.
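The two-path selection described in this abstract can be mimicked in software. The sketch below is an assumption-laden illustration: the abstract does not say how the second circuitry computes its result, so the exponent-halving shortcut, the use of `math.frexp` for detection, and the function names are all hypothetical.

```python
import math


def is_exact_power_of_two(r):
    """A positive float is an exact power of two when its frexp mantissa is 0.5."""
    mantissa, _ = math.frexp(r)
    return mantissa == 0.5


def sqrt_with_fast_path(r):
    """Select between a general square root and a power-of-two shortcut."""
    if r > 0 and is_exact_power_of_two(r):
        k = math.frexp(r)[1] - 1  # r == 2**k
        if k % 2 == 0:
            return math.ldexp(1.0, k // 2)  # sqrt(2**k) == 2**(k/2)
        # Odd exponent: sqrt(2**k) == sqrt(2) * 2**((k - 1) / 2)
        return math.ldexp(math.sqrt(2.0), (k - 1) // 2)
    return math.sqrt(r)  # general path for all other radicands


fast = sqrt_with_fast_path(16.0)
general = sqrt_with_fast_path(9.0)
```

On the shortcut path the result needs only an exponent adjustment and, for odd exponents, a constant, which is why no iterative computation is required there.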
-
Publication No.: US09703531B2
Publication date: 2017-07-11
Application No.: US14939469
Filing date: 2015-11-12
Applicant: ARM LIMITED
Inventor: David Raymond Lutz, Neil Burgess, Christopher Neal Hinds
CPC classification number: G06F7/5443, G06F7/50, G06F7/5324, G06F7/5336
Abstract: A method is provided for multiplying a first operand comprising at least two X-bit portions and a second operand comprising at least one Y-bit portion. At least two partial products are generated, each partial product comprising a product of a selected X-bit portion of the first operand and a selected Y-bit portion of the second operand. Each partial product is converted to a redundant representation in dependence on significance indicating information indicative of a significance of the partial product. In the redundant representation, the partial product is represented using a number of N-bit portions, and in a group of at least two adjacent N-bit portions, a number of overlap bits of a lower N-bit portion of the group have a same significance as some least significant bits of at least one upper N-bit portion of the group. The partial products are added while represented in the redundant representation.
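The method in this abstract can be sketched for unsigned operands. The sketch is illustrative only: the lane width N = 16, the 4 overlap bits per lane, the 8-bit operand portions, the restriction to non-negative values, and all names are assumptions rather than details taken from the patent.

```python
N = 16      # lane width in bits (assumed)
V = 4       # overlap bits per lane (assumed)
W = N - V   # non-overlap bits per lane


def to_redundant(value, lanes):
    """Split a non-negative integer into lanes holding W bits each.

    Lane i has significance 2**(i * W); its top V (overlap) bits start at
    zero and share significance with the low bits of lane i + 1.
    """
    assert value >> (lanes * W) == 0, "value does not fit in the given lanes"
    return [(value >> (i * W)) & ((1 << W) - 1) for i in range(lanes)]


def add_redundant(a, b):
    """Add lane by lane; carries collect in the overlap bits, not the next lane."""
    return [x + y for x, y in zip(a, b)]


def from_redundant(r):
    """Resolve the overlaps to recover an ordinary integer."""
    return sum(lane << (i * W) for i, lane in enumerate(r))


def multiply_redundant(first, second, x_bits=8, y_bits=8, x_parts=2, y_parts=1, lanes=4):
    acc = [0] * lanes
    for i in range(x_parts):
        xp = (first >> (i * x_bits)) & ((1 << x_bits) - 1)
        for j in range(y_parts):
            yp = (second >> (j * y_bits)) & ((1 << y_bits) - 1)
            significance = i * x_bits + j * y_bits  # weight of this partial product
            acc = add_redundant(acc, to_redundant((xp * yp) << significance, lanes))
    assert all(lane < (1 << N) for lane in acc), "overlap bits overflowed"
    return from_redundant(acc)


product = multiply_redundant(0xABCD, 0xEF)
```

Since no carry crosses a lane boundary during accumulation, the lane additions are independent of one another. The overlap bits only need resolving once, at the end.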
-
Publication No.: US09208839B2
Publication date: 2015-12-08
Application No.: US14220490
Filing date: 2014-03-20
Applicant: ARM LIMITED
Inventor: David Raymond Lutz, Neil Burgess
CPC classification number: G11C7/1078, G06F5/01, G06F7/49921, G06F9/30018, G06F9/30032, G11C7/1084
Abstract: Apparatus for data processing and a method of data processing are provided. Shift circuitry performs a shift operation in response to a shift instruction, shifting bits of an input data value in a direction specified by the shift instruction. Bit location indicator generation circuitry and comparison circuitry operate in parallel with the shift circuitry. The bit location indicator indicates at least one bit location in the input data value which must not have a bit set if the shifted data value is not to saturate. Comparison circuitry compares the bit location indicator with the input data value and indicates a saturation condition if any bits are indicated by the bit location indicator for bit locations which hold set bits in the input data value. A faster indication of the saturation condition thus results.
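The saturation check described in this abstract can be sketched for one simple case. The example assumes an unsigned 32-bit left shift only. The width, the saturation value, and the function names are assumptions, and the real design also covers other shift directions.

```python
WIDTH = 32  # assumed register width


def bit_location_indicator(shift):
    """Mask of the top `shift` bits, which must all be clear to avoid saturation."""
    return ((1 << shift) - 1) << (WIDTH - shift)


def saturating_shift_left(value, shift):
    """Unsigned saturating left shift; assumes 0 <= shift < WIDTH.

    The saturation test depends only on the input value and the shift
    amount, so it does not need to wait for the shifted result.
    """
    saturates = (value & bit_location_indicator(shift)) != 0
    shifted = (value << shift) & ((1 << WIDTH) - 1)
    return ((1 << WIDTH) - 1 if saturates else shifted), saturates


ok = saturating_shift_left(1, 4)
clipped = saturating_shift_left(0x10000000, 4)
```

The mask comparison and the shift are independent computations, which is what allows them to run in parallel in hardware.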
-
Publication No.: US11704092B2
Publication date: 2023-07-18
Application No.: US17081068
Filing date: 2020-10-27
Applicant: Arm Limited
Inventor: Neil Burgess, Christopher Neal Hinds, David Raymond Lutz, Pedro Olsen Ferreira
CPC classification number: G06F7/49947, G06F7/487, G06F7/509, G06F2207/3808, G06F2207/3832
Abstract: An apparatus includes a processing circuit and a storage device. The processing circuit is configured to perform one or more processing operations in response to one or more instructions to generate an anchored-data element. The storage device is configured to store the anchored-data element. A format of the anchored-data element includes an identification item, an overlap item, and a data item. The data item is configured to hold a data value of the anchored-data element. The identification item indicates an anchor value for the data value or one or more special values.
-
Publication No.: US11080054B2
Publication date: 2021-08-03
Application No.: US15236728
Filing date: 2016-08-15
Applicant: ARM LIMITED
Inventor: Neil Burgess, Lee Evan Eisen, Gary Alan Gorman, Daniel Arulraj
IPC: G06F9/30
Abstract: Data processing apparatus comprises processing circuitry to selectively apply vector processing operations to one or more data items of one or more data vectors each comprising an ordered plurality of data items at respective vector positions in the data vector, according to the state of respective predicate indicators associated with the vector positions; predicate generation circuitry to apply a processing operation to generate an ordered set of predicate indicators, each associated with a respective one of the vector positions, the ordered set of predicate indicators being associated with an ordered set of active indicators each having an active or an inactive state; and a detector to detect a status flag indicative of whether a predicate indicator at a position, in the ordered set of predicate indicators, corresponding to the position of an outermost active indicator having the active state, has a given state; in which the detector comprises: first and second circuitry to combine the ordered set of predicate indicators and the ordered set of active indicators using first and second respective logical bit-wise combinations to generate first and second ordered sets of intermediate data; and arithmetic circuitry to combine the first and second ordered sets of intermediate data using an arithmetic combination generating a carry bit, the detector generating the status flag in dependence upon the carry bit.
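One way a carry bit can answer the question posed in this abstract is sketched below. This is a hypothetical construction, not necessarily the one claimed. It assumes the outermost active indicator is the most significant set bit, and it picks `pred AND active` and `NOT pred AND active` as the two bit-wise combinations and `first + NOT second` as the arithmetic combination.

```python
def last_active_flag(pred, active, n=8):
    """Return True if the predicate bit at the highest active position is set.

    The highest active bit lands in exactly one of `first` or `second`, so
    first > second exactly when the predicate at that position is set. The
    carry out of first + ~second (n bits wide) is 1 exactly when first > second.
    """
    mask = (1 << n) - 1
    first = pred & active & mask          # active positions with predicate set
    second = ~pred & active & mask        # active positions with predicate clear
    total = first + (~second & mask)      # arithmetic combination
    return bool(total >> n)               # carry bit becomes the status flag


def reference_flag(pred, active):
    """Straightforward scan used only to check the carry-based version."""
    if active == 0:
        return False
    top = active.bit_length() - 1
    return bool((pred >> top) & 1)


flag = last_active_flag(0b1010, 0b0110, n=4)
```

The appeal of the carry-based form is that it reuses an adder's carry chain in place of a separate priority scan over the active indicators.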