MATH PROCESSING BY DETECTION OF ELEMENTARY VALUED OPERANDS
    1.
    Invention application
    Legal status: Granted

    Publication number: US20150095394A1

    Publication date: 2015-04-02

    Application number: US14040370

    Filing date: 2013-09-27

    CPC classification number: G06F7/50 G06F7/5443 G06F2207/3884

    Abstract: One embodiment of the present invention includes a method for simplifying arithmetic operations by detecting operands with elementary values such as zero or 1.0. Computer and graphics processing systems perform a great number of multiply-add operations. In a significant portion of these operations, the values of one or more of the operands are zero or 1.0. By detecting the occurrence of these elementary values, math operations can be greatly simplified, for example by eliminating multiply operations when one multiplicand is zero or 1.0 or eliminating add operations when one addend is zero. The simplified math operations resulting from detecting elementary valued operands provide significant savings in overhead power, dynamic processing power, and cycle time.

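The bypass idea in this abstract can be illustrated with a minimal software sketch. It is not the claimed hardware: the function name `multiply_add` and the `ops` bookkeeping list are hypothetical, operands are assumed to be finite, signed zeros are ignored, and Python rounds the product before the add, so the non-bypassed path is not a true fused operation.

```python
def multiply_add(a, b, c):
    """Compute a * b + c, skipping work when an operand is 0.0 or 1.0.

    Returns (result, ops), where ops lists the arithmetic steps that
    actually ran. Assumes finite inputs; signed zero is not modelled.
    """
    ops = []

    # Multiply stage: a zero or one multiplicand makes the product trivial.
    if a == 0.0 or b == 0.0:
        product = 0.0
    elif a == 1.0:
        product = b
    elif b == 1.0:
        product = a
    else:
        product = a * b
        ops.append("mul")

    # Add stage: a zero product or a zero addend makes the sum trivial.
    if product == 0.0:
        return c, ops
    if c == 0.0:
        return product, ops

    ops.append("add")
    return product + c, ops


if __name__ == "__main__":
    for args in [(0.0, 5.0, 2.5), (1.0, 3.0, 2.0), (2.0, 3.0, 0.0), (2.0, 3.0, 4.0)]:
        print(args, multiply_add(*args))
```

In this model the `ops` list stands in for the units that would stay active; an empty list means neither the multiplier nor the adder was needed.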

    TECHNIQUE FOR PERFORMING ARBITRARY WIDTH INTEGER ARITHMETIC OPERATIONS USING FIXED WIDTH ELEMENTS
    2.
    Invention application
    Legal status: Granted

    Publication number: US20150081753A1

    Publication date: 2015-03-19

    Application number: US14026829

    Filing date: 2013-09-13

    CPC classification number: G06F7/525 G06F2207/3824

    Abstract: One embodiment of the present invention includes a method for performing arithmetic operations on arbitrary width integers using fixed width elements. The method includes receiving a plurality of input operands, segmenting each input operand into multiple sectors, performing a plurality of multiply-add operations based on the multiple sectors to generate a plurality of multiply-add operation results, and combining the multiply-add operation results to generate a final result. One advantage of the disclosed embodiments is that, by using a common fused floating point multiply-add unit to perform arithmetic operations on integers of arbitrary width, the method avoids the area and power penalty of having additional dedicated integer units.

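The segment, multiply-add, and combine steps can be sketched in software as follows. This is an illustration under assumptions, not the patented datapath: Python floats (IEEE double precision) stand in for the floating-point multiply-add unit, the 16-bit sector size and the names `split_sectors` and `multiply_wide` are hypothetical, and operands are assumed to be non-negative and smaller than `2**width`.

```python
def split_sectors(x, sector_bits, count):
    """Split a non-negative integer into `count` sectors, least significant first."""
    mask = (1 << sector_bits) - 1
    return [(x >> (sector_bits * i)) & mask for i in range(count)]


def multiply_wide(x, y, width=64, sector_bits=16):
    """Multiply two integers below 2**width using only float multiply-adds.

    Each sector product is below 2**32 and each column sums only a few of
    them, so every intermediate value stays below 2**53 and is exact in a
    double.
    """
    count = -(-width // sector_bits)  # ceiling division
    xs = [float(s) for s in split_sectors(x, sector_bits, count)]
    ys = [float(s) for s in split_sectors(y, sector_bits, count)]

    # One accumulator per output column; each update is a multiply-add.
    columns = [0.0] * (2 * count - 1)
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            columns[i + j] = xi * yj + columns[i + j]

    # Combine the column sums at their weights to form the final result.
    result = 0
    for k, col in enumerate(columns):
        result += int(col) << (sector_bits * k)
    return result


if __name__ == "__main__":
    a, b = 0xFFFFFFFFFFFFFFFF, 123456789123456789
    print(multiply_wide(a, b) == a * b)
```

The sector width is the main design choice: it has to be small enough that products and their running sums fit in the mantissa without rounding.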

    EFFICIENCY IN A FUSED FLOATING-POINT MULTIPLY-ADD UNIT
    3.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20150193203A1

    Publication date: 2015-07-09

    Application number: US14149647

    Filing date: 2014-01-07

    CPC classification number: G06F7/5443 G06F7/483 G06F7/5336

    Abstract: A four cycle fused floating point multiply-add unit includes a radix 8 Booth encoder multiplier that is partitioned over two stages with the compression element allocated to the second stage. The unit further includes an improved shifter design. Processing logic analyzes the input operands, detects values of zero and one, and inhibits portions of the processing logic accordingly. When one of the multiplicand inputs has a value of zero or one, the required multiplication becomes trivial, and the unit inhibits the associated coding logic and data transfer to reduce power consumption. The unit then performs an add-only operation. When the addend input has a value of zero, the addition becomes trivial, and the unit inhibits the improved shifter and data transfer to further reduce power consumption. The unit then performs a multiply-only operation.

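For readers unfamiliar with radix-8 Booth encoding, the recoding step can be sketched as below. This shows only the general textbook technique, not the two-stage partitioning described in the abstract; the function names are hypothetical and the multiplier is assumed to be an unsigned value of `nbits` bits.

```python
def booth_radix8_digits(x, nbits):
    """Recode an unsigned nbits-wide integer into radix-8 Booth digits.

    Each digit lies in -4..4 and is derived from three bits plus the top
    bit of the previous group. Digits are returned least significant first.
    """
    groups = (nbits + 3) // 3  # enough groups that the top bit examined is 0
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(groups):
        b0 = (x >> (3 * i)) & 1
        b1 = (x >> (3 * i + 1)) & 1
        b2 = (x >> (3 * i + 2)) & 1
        digits.append(-4 * b2 + 2 * b1 + b0 + prev)
        prev = b2
    return digits


def booth_multiply(multiplicand, multiplier, nbits):
    """Multiply by summing one partial product per nonzero Booth digit."""
    total = 0
    for i, digit in enumerate(booth_radix8_digits(multiplier, nbits)):
        if digit == 0:
            continue  # a zero digit contributes no partial product
        total += (digit * multiplicand) << (3 * i)
    return total


if __name__ == "__main__":
    print(booth_radix8_digits(45, 6))
    print(booth_multiply(12345, 45, 6) == 12345 * 45)
```

Radix-8 recoding yields roughly one partial product per three multiplier bits, at the cost of needing the 3x multiple of the multiplicand, which cannot be formed by shifting alone.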

    APPROACH FOR EFFICIENT ARITHMETIC OPERATIONS
    4.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20140129807A1

    Publication date: 2014-05-08

    Application number: US13671485

    Filing date: 2012-11-07

    Abstract: A system and method are described for providing hints to a processing unit that subsequent operations are likely. Responsively, the processing unit takes steps to prepare for the likely subsequent operations. Where the hints are more likely than not to be correct, the processing unit operates more efficiently. For example, in an embodiment, the processing unit consumes less power. In another embodiment, subsequent operations are performed more quickly because the processing unit is prepared to efficiently handle the subsequent operations.


    EFFICIENCY THROUGH A DISTRIBUTED INSTRUCTION SET ARCHITECTURE
    5.
    Invention application
    Legal status: Under examination (published)

    Publication number: US20150113254A1

    Publication date: 2015-04-23

    Application number: US14061666

    Filing date: 2013-10-23

    CPC classification number: G06F9/3836

    Abstract: A subsystem is configured to support a distributed instruction set architecture with primary and secondary execution pipelines. The primary execution pipeline supports the execution of a subset of instructions in the distributed instruction set architecture that are issued frequently. The secondary execution pipeline supports the execution of another subset of instructions in the distributed instruction set architecture that are issued less frequently. Both execution pipelines also support the execution of FFMA instructions as well as a common subset of instructions in the distributed instruction set architecture. When dispatching a requested instruction, an instruction scheduling unit is configured to select between the two execution pipelines based on various criteria. Those criteria may include power efficiency with which the instruction can be executed and availability of execution units to support execution of the instruction.

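One way to picture the scheduling decision is the sketch below. Everything specific in it is an assumption made for illustration: the `Pipeline` class, the opcode names other than FFMA, the single `energy_per_op` figure per pipeline, and the rule of preferring an idle pipeline before comparing energy. The abstract names the criteria but does not say how they are weighed.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str
    supported: frozenset   # opcodes this pipeline can execute
    energy_per_op: float   # hypothetical relative energy cost
    busy: bool = False


def select_pipeline(opcode, pipelines):
    """Pick a pipeline for `opcode` using availability, then energy."""
    candidates = [p for p in pipelines if opcode in p.supported]
    if not candidates:
        raise ValueError(f"no pipeline supports {opcode}")

    # Prefer pipelines with a free execution unit; if none is free,
    # the instruction will have to wait on one of the candidates.
    free = [p for p in candidates if not p.busy]
    pool = free or candidates
    return min(pool, key=lambda p: p.energy_per_op)


if __name__ == "__main__":
    # Hypothetical opcode sets: FFMA and IADD are common to both pipelines.
    primary = Pipeline("primary", frozenset({"FFMA", "IADD", "FMUL"}), 1.0)
    secondary = Pipeline("secondary", frozenset({"FFMA", "IADD", "RSQRT"}), 1.5)
    print(select_pipeline("FFMA", [primary, secondary]).name)
    primary.busy = True
    print(select_pipeline("FFMA", [primary, secondary]).name)
```

Instructions supported by only one pipeline have no real choice, so the criteria matter only for FFMA and the shared subset.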

    FFMA OPERATIONS USING A MULTI-STEP APPROACH TO DATA SHIFTING
    6.
    Invention application
    Legal status: Granted

    Publication number: US20150039662A1

    Publication date: 2015-02-05

    Application number: US13959397

    Filing date: 2013-08-05

    CPC classification number: G06F5/012 G06F7/483 G06F7/5443

    Abstract: A fused floating-point multiply-add element includes a multiplier that generates a product, and a shifter that shifts an addend within a narrow range. Interpreting logic analyzes the magnitude of the addend relative to the product and then causes logic arrays to position the shifted addend within the left, center, or right portions of a composite register depending on the magnitude of the addend relative to the product. The interpreting logic also forces other portions of the composite register to zero. When the addend is zero, the interpreting logic forces all portions of the composite register to zero. Final combining logic then adds the contents of the composite register to the product.

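The two-step alignment can be pictured with a simplified integer model. The abstract gives no widths, so the 8-bit portion width, the mapping of shift amounts to the right, center, and left portions, and the function names are all assumptions. The sketch only shows that a narrow shift followed by a coarse placement matches a single wide shift.

```python
PORTION_BITS = 8  # hypothetical width of one portion of the composite register


def align_addend(addend, shift, w=PORTION_BITS):
    """Place `addend << shift` into a composite register in two steps.

    Step 1 shifts within a narrow range (0 to w-1 bits). Step 2 drops the
    shifted value at the right, center, or left offset. All other bits of
    the composite register are zero. Returns (composite, portion_name).
    """
    if addend == 0:
        return 0, None  # a zero addend forces the whole register to zero
    if not 0 <= shift < 3 * w:
        raise ValueError("shift outside the modelled range")

    fine = shift % w      # handled by the narrow shifter
    coarse = shift // w   # selects where the result is positioned
    narrow = addend << fine
    portion = ("right", "center", "left")[coarse]
    return narrow << (coarse * w), portion


def add_aligned(product, addend, shift):
    """Final combining step: add the composite register to the product."""
    composite, _ = align_addend(addend, shift)
    return product + composite


if __name__ == "__main__":
    print(align_addend(3, 10))
    print(add_aligned(1000, 3, 10) == 1000 + (3 << 10))
```

Splitting the shift this way keeps the shifter itself small, since the coarse placement is a selection among fixed offsets rather than a variable shift.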
