COMPARISON OF WIDE DATA TYPES
    1.
    Invention application

    Publication number: US20190087155A1

    Publication date: 2019-03-21

    Application number: US15743008

    Filing date: 2016-05-25

    Applicant: ARM Limited

    Inventor: Jørn NYSTAD

    Abstract: There is provided an apparatus and method for comparing wide data types. The apparatus comprises processing circuitry to perform a plurality of comparison operations in order to compare a first value and a second value, each of the first value and the second value having a length greater than N bits, and each comparison operation operating on a corresponding N bits of the first and second values. The plurality of comparison operations are chained to form a sequence such that each comparison operation is arranged to output an accumulated comparison result incorporating the comparison results of any previous comparison operations in the sequence, and such that for each comparison operation other than a final comparison operation in the sequence the accumulated comparison result is provided for use as an input by a next comparison operation in the sequence.
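
The chaining described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not say whether the values are signed, in which order the N-bit portions are processed, or how the accumulated result is encoded. Here the values are taken as unsigned, the portions are processed from least to most significant, and the accumulated result is -1, 0, or 1. The names `compare_step` and `compare_wide` are hypothetical.

```python
N = 32                      # assumed width of one comparison operation
MASK = (1 << N) - 1


def compare_step(a_chunk, b_chunk, acc):
    """One comparison in the chain.

    Takes the accumulated result of the previous steps and returns the
    new accumulated result: -1 (a < b), 0 (equal so far), or 1 (a > b).
    A difference in this (more significant) chunk overrides earlier results.
    """
    if a_chunk != b_chunk:
        return -1 if a_chunk < b_chunk else 1
    return acc


def compare_wide(a, b, width):
    """Compare two unsigned values of `width` bits, N bits at a time."""
    acc = 0
    for shift in range(0, width, N):      # least significant chunk first
        acc = compare_step((a >> shift) & MASK, (b >> shift) & MASK, acc)
    return acc
```

Each call to `compare_step` depends only on N bits of each operand plus the previous accumulated result, which is the property the abstract describes.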

    ACCUMULATION OF FLOATING-POINT VALUES
    2.
    Invention application
    Legal status: Granted

    Publication number: US20160306608A1

    Publication date: 2016-10-20

    Application number: US15060778

    Filing date: 2016-03-04

    Applicant: ARM LIMITED

    Inventor: Jørn NYSTAD

    CPC classification number: G06F7/485 G06F7/49968 G06F7/49973

    Abstract: An apparatus and method for generating a sum of floating-point input values are provided. To sum the values multiple partial sum floating-point values are maintained and the partial sum to which an input value may be added is selected by a least significant portion of the exponent of the input value. If the exponent of the input value is equal to the exponent of the value stored in the selected partial sum a mantissa sum of the input value and stored partial sum value replaces the mantissa value of the selected partial sum value. If the exponent of the input value is larger than the exponent of the value stored in the selected partial sum the selected partial sum value is replaced with the input value. An associative and deterministic summation is thus provided.
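
The selection and update rules in this abstract can be modelled in a few lines. The sketch below makes several assumptions that the abstract does not state: values are represented as (exponent, integer mantissa) pairs, the mantissa can grow without overflowing, there are four partial sums, and an input whose exponent is smaller than the stored one leaves the partial sum unchanged. The names `K` and `accumulate` are hypothetical.

```python
K = 4  # assumed number of partial sums (a power of two)


def accumulate(values):
    """Accumulate (exponent, mantissa) pairs into K partial sums."""
    slots = [None] * K
    for exp, man in values:
        i = exp & (K - 1)              # least significant bits of the exponent
        if slots[i] is None or exp > slots[i][0]:
            slots[i] = (exp, man)      # larger exponent: replace the partial sum
        elif exp == slots[i][0]:
            slots[i] = (exp, slots[i][1] + man)   # equal exponent: add mantissas
        # smaller exponent: partial sum left unchanged (assumption)
    return slots
```

Under these assumptions each slot ends up holding the largest exponent routed to it together with the sum of the mantissas that share that exponent, so the result does not depend on the order of the inputs.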

    ENCODING INSTRUCTIONS IDENTIFYING FIRST AND SECOND ARCHITECTURAL REGISTER NUMBERS

    Publication number: US20170212758A1

    Publication date: 2017-07-27

    Application number: US15003828

    Filing date: 2016-01-22

    Applicant: ARM LIMITED

    Abstract: Various encoding schemes are discussed for more efficiently encoding instructions which identify first and second architectural register numbers. In the first example, by constraining the first architectural register number to be greater than the second architectural register number, this frees up encodings for use in encoding other operations. In a second example, the first and second architectural register numbers may take any value but one of a first type of processing operation and a second type of processing operation is selected depending on a comparison of the first and second architectural register numbers.
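
The saving in the first example can be made concrete by counting bit patterns. This is an illustrative sketch only: the 5-bit register field width and the names `classify` and `count_encodings` are assumptions, not details from the abstract.

```python
REG_BITS = 5  # assumed width of each register-number field


def classify(first, second):
    """First scheme: only patterns with first > second encode the operation."""
    return "constrained_op" if first > second else "free_for_other_ops"


def count_encodings(reg_bits=REG_BITS):
    """Return (patterns used by the constrained operation, patterns freed)."""
    n = 1 << reg_bits
    used = sum(1 for a in range(n) for b in range(n)
               if classify(a, b) == "constrained_op")
    return used, n * n - used
```

With two 5-bit fields there are 1024 bit patterns; 496 of them satisfy the constraint, leaving 528 available for other operations.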

    CLEANING A WRITE-BACK CACHE
    4.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20160179676A1

    Publication date: 2016-06-23

    Application number: US14957117

    Filing date: 2015-12-02

    Applicant: ARM LIMITED

    Abstract: A data processing system incorporates a write-back cache and supports load-and-clean program instructions. The action of a load-and-clean program instruction is to load a data value and to mark as clean at least a target portion within a cache line of the write-back cache which is storing the data value loaded. The data values to be subject to such load-and-clean instructions may be identified by the programmer as the last use of those data values, or may be identified by a compiler as the last use of those data values. The data values may be from a stack memory region in which their pattern of access is predictable and it is known when they are no longer required. Another example of regular memory accesses where the last access can be identified is when processing streaming media data.
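
The effect of a load-and-clean instruction can be shown with a toy model of a write-back cache. This is a sketch under simplifying assumptions: each cache line holds a single value, the whole line is marked clean (the abstract allows a portion of the line), and the class and method names are hypothetical.

```python
class WriteBackCache:
    """Toy write-back cache: one value per line, with a dirty flag."""

    def __init__(self):
        self.lines = {}        # address -> [value, dirty]
        self.memory = {}       # backing store
        self.writebacks = 0    # number of lines written back on eviction

    def store(self, addr, value):
        self.lines[addr] = [value, True]      # a store makes the line dirty

    def load(self, addr):
        return self.lines[addr][0]

    def load_and_clean(self, addr):
        """Load the value and mark the line clean (last use of the data)."""
        line = self.lines[addr]
        line[1] = False
        return line[0]

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:                              # only dirty lines cost a write-back
            self.memory[addr] = value
            self.writebacks += 1
```

When the final read of a value uses `load_and_clean`, the later eviction of that line no longer triggers a write to memory.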

    FORWARD KILLING OF THREADS CORRESPONDING TO GRAPHICS FRAGMENTS OBSCURED BY LATER GRAPHICS FRAGMENTS

    Publication number: US20190088009A1

    Publication date: 2019-03-21

    Application number: US16128807

    Filing date: 2018-09-12

    Applicant: ARM Limited

    Abstract: A graphics processing apparatus comprises fragment generating circuitry to generate graphics fragments corresponding to graphics primitives, thread processing circuitry to perform threads of processing corresponding to the fragments, and forward kill circuitry to trigger a forward kill operation to prevent further processing of a target thread of processing corresponding to an earlier graphics fragment when the forward kill operation is enabled for the target thread and the earlier graphics fragment is determined to be obscured by one or more later graphics fragments. The thread processing circuitry supports enabling of the forward kill operation for a thread including at least one forward kill blocking instruction having a property indicative that the forward kill operation should be disabled for the given thread, when the thread processing circuitry has not yet reached a portion of the thread including the at least one forward kill blocking instruction.
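
The enabling condition in this abstract can be sketched as a simple check on a thread's progress. The model below is an assumption-heavy illustration: a thread is reduced to a program counter and the index of its first forward kill blocking instruction, and the names `FragmentThread`, `forward_kill_enabled`, and `on_later_fragment` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FragmentThread:
    pc: int                              # index of the next instruction to run
    first_blocking_pc: Optional[int]     # index of first blocking instruction
    killed: bool = False


def forward_kill_enabled(thread):
    """Forward kill stays enabled until the blocking instruction is reached."""
    return (thread.first_blocking_pc is None
            or thread.pc < thread.first_blocking_pc)


def on_later_fragment(thread, obscured):
    """Kill the earlier fragment's thread if it is obscured and still killable."""
    if obscured and forward_kill_enabled(thread):
        thread.killed = True
    return thread.killed
```

A thread that contains a blocking instruction can therefore still be killed early in its execution, which is the case the abstract singles out.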

    APPARATUS AND METHOD FOR INHIBITING ROUNDOFF ERROR IN A FLOATING POINT ARGUMENT REDUCTION OPERATION
    6.
    Invention application
    Legal status: Granted

    Publication number: US20160364209A1

    Publication date: 2016-12-15

    Application number: US15140739

    Filing date: 2016-04-28

    Applicant: ARM LIMITED

    Inventor: Jørn NYSTAD

    CPC classification number: G06F7/49915 G06F7/483

    Abstract: An apparatus and method are provided for inhibiting roundoff error in a floating point argument reduction operation. The apparatus has reciprocal estimation circuitry that is responsive to a first floating point value to determine a second floating point value that is an estimated reciprocal of the first floating point value. During this determination, the second floating point value has both its magnitude and its error bound constrained in dependence on a specified value N. Argument reduction circuitry then performs an argument reduction operation using the first and second floating point values as inputs, in order to generate a third floating point value. The use of the specified value N to constrain both the magnitude and the error bound of the second floating point value causes roundoff error to be inhibited in the third floating point value that is generated by the argument reduction operation. This enables such an argument reduction operation to be used as part of a more complex computation, such as a logarithm computation, with the inhibiting of roundoff error in the argument reduction result allowing the overall result to exhibit small relative error across the whole representable input range.
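
One plausible reading of how such an argument reduction feeds a logarithm is sketched below. Everything specific here is an assumption: the reciprocal estimate is modelled as the reciprocal of the significand rounded to N fractional bits, the reduction is taken to be x*r - 1, and plain multiply-subtract stands in for whatever fused operation the hardware uses. The function names are hypothetical.

```python
import math

N = 5  # assumed number of fractional bits kept in the reciprocal estimate


def reciprocal_estimate(x, n=N):
    """Estimate 1/x with a significand limited to n fractional bits."""
    m, e = math.frexp(x)                        # x = m * 2**e, 0.5 <= m < 1
    r_m = round((1.0 / m) * 2**n) / 2**n        # coarse significand in [1, 2]
    return math.ldexp(r_m, -e)


def log_via_reduction(x):
    """Compute log(x) as log1p(x*r - 1) - log(r), for x > 0."""
    r = reciprocal_estimate(x)
    y = x * r - 1.0         # reduced argument, small in magnitude
    return math.log1p(y) - math.log(r)
```

Because the estimate is exactly 1 for inputs close to 1, the reduced argument is then computed without rounding, which illustrates why the result can keep a small relative error even where log(x) is itself near zero.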

    ADDITION CIRCUITRY
    7.
    Invention publication
    Legal status: Under examination (published)

    Publication number: US20240296011A1

    Publication date: 2024-09-05

    Application number: US18117210

    Filing date: 2023-03-03

    Applicant: Arm Limited

    Inventor: Jørn NYSTAD

    CPC classification number: G06F7/508

    Abstract: Addition circuitry performs a saturating addition of a first number and a second number to generate a result value indicating an addition result corresponding to addition of the first number and the second number when the addition result is within a predetermined range and indicating a saturation value when the addition result is outside the predetermined range. The addition circuitry comprises: saturation lookahead circuitry to determine, for each lane of the result value, a respective set of one or more saturation lookahead status indications indicative of whether that lane should be set to represent part of the saturation value; and addition result generating circuitry to generate result bits for each lane, with a given lane of the result value having a value determined as a function of corresponding bits of the first and second numbers and a corresponding set of one or more saturation lookahead status indications determined for that lane by the saturation lookahead circuitry.
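
The functional behaviour of a saturating addition, as opposed to the lane-based circuit structure, can be stated in a few lines. This sketch assumes the predetermined range is that of a signed two's-complement integer of a given width; the abstract does not fix the range, and the lookahead circuitry is not modelled. The name `saturating_add` is hypothetical.

```python
def saturating_add(a, b, bits=8):
    """Add two signed integers, clamping the result to the `bits`-wide range."""
    lo = -(1 << (bits - 1))          # most negative representable value
    hi = (1 << (bits - 1)) - 1       # most positive representable value
    return max(lo, min(hi, a + b))
```

For 8-bit operands, `saturating_add(100, 100)` yields 127 rather than wrapping around, and `saturating_add(-100, -100)` yields -128.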

    APPARATUS AND METHOD FOR PERFORMING DIVISION
    8.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20170010862A1

    Publication date: 2017-01-12

    Application number: US15168436

    Filing date: 2016-05-31

    Applicant: ARM LIMITED

    Inventor: Jørn NYSTAD

    CPC classification number: H03K19/20 G06F7/535

    Abstract: An apparatus and method are provided, the apparatus comprising: storage circuitry to store an input data value; divider circuitry to split the input data value into at least one sub-value in dependence on a number of lanes for a current iteration, each sub-value occupying a lane, and to operate on each sub-value to generate a quotient corresponding to the division of that sub-value by a divisor, wherein the divisor is an odd integer; remainder circuitry to operate on each sub-value to generate a remainder corresponding to the remainder of dividing that sub-value by the divisor; concatenation circuitry to concatenate each quotient to produce a concatenated division value, and to concatenate each remainder to produce a concatenated remainder value, in each subsequent iteration, the input data value being formed from the concatenated remainder value of a preceding iteration; and output circuitry to output, after a plurality of iterations, a result of adding the concatenated division values produced by said plurality of iterations.
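
The iteration described in this abstract can be modelled in software as follows. The sketch assumes a 32-bit input, four lanes in the first iteration, and a lane count that halves on each iteration until a single lane remains; none of these figures come from the abstract. The model also works for any positive divisor, whereas the abstract restricts the divisor to odd integers, presumably for reasons tied to the hardware. The name `lane_divide` is hypothetical.

```python
def lane_divide(value, divisor, width=32, lanes=4):
    """Divide `value` by `divisor` lane by lane; return (quotient, remainder)."""
    total = 0
    while lanes >= 1:
        lane_bits = width // lanes
        mask = (1 << lane_bits) - 1
        quotients = 0
        remainders = 0
        for i in range(lanes):
            sub = (value >> (i * lane_bits)) & mask      # sub-value in lane i
            q, r = divmod(sub, divisor)
            quotients |= q << (i * lane_bits)            # concatenate quotients
            remainders |= r << (i * lane_bits)           # concatenate remainders
        total += quotients            # sum of the concatenated division values
        value = remainders            # input to the next iteration
        lanes //= 2                   # assumed: fewer, wider lanes each time
    return total, value
```

Each iteration preserves the identity value = divisor * quotients + remainders, so once the last iteration has run with a single lane, the accumulated total equals the integer quotient of the original input.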
