Efficient parallel floating point exception handling in a processor
    41.
    Patent application
    Legal status: in force

    Publication No.: US20090327665A1

    Publication date: 2009-12-31

    Application No.: US12217084

    Filing date: 2008-06-30

    IPC classification: G06F9/302

    Abstract: Methods and apparatus are disclosed for handling floating point exceptions in a processor that executes single-instruction multiple-data (SIMD) instructions. In one embodiment, a numerical exception is identified for a SIMD floating point operation, and SIMD micro-operations are initiated to generate two packed partial results of a packed result for the SIMD floating point operation. A SIMD denormalization micro-operation is initiated to combine the two packed partial results and to denormalize one or more elements of the combined packed partial results, generating a packed result for the SIMD floating point operation that has one or more denormal elements. Flags are set and stored with the packed partial results to identify denormal elements. In one embodiment, a SIMD normalization micro-operation is initiated to generate a normalized pseudo internal floating point representation prior to the SIMD floating point operation when that operation uses multiplication.
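
    The denormalization step described in the abstract can be illustrated with a minimal Python sketch. It assumes IEEE 754 binary32 elements and that each element arrives as a hypothetical (sign, exponent, 24-bit significand) triple whose exponent may lie below the normal range. The function names, the triple layout, and the flag list are illustrative assumptions, not the patent's micro-operations, and the sketch does not model the combination of the two packed partial results.

```python
import struct

MIN_NORMAL_EXP = -126   # smallest normal exponent in IEEE 754 binary32
EXP_BIAS = 127
FRAC_BITS = 23


def shift_right_rne(value, shift):
    """Shift right by `shift` bits, rounding to nearest, ties to even."""
    if shift <= 0:
        return value
    kept = value >> shift
    remainder = value & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    return kept


def encode_element(sign, exp, sig):
    """Encode (-1)**sign * (sig / 2**23) * 2**exp as binary32 bits.

    `sig` is a 24-bit significand with its leading 1 explicit.
    Exponent overflow (exp > 127) is out of scope for this sketch.
    Returns (bits, was_denormalized).
    """
    if exp >= MIN_NORMAL_EXP:
        frac = sig & ((1 << FRAC_BITS) - 1)
        return (sign << 31) | ((exp + EXP_BIAS) << FRAC_BITS) | frac, False
    # Below the normal range: shift the significand right so that the
    # exponent field can be zero, rounding the bits that fall off.
    frac = shift_right_rne(sig, MIN_NORMAL_EXP - exp)
    return (sign << 31) | frac, True


def denormalize_packed(elements):
    """Encode every element; the flags mark the denormalized ones."""
    encoded = [encode_element(*e) for e in elements]
    bits = [b for b, _ in encoded]
    flags = [f for _, f in encoded]
    return bits, flags


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

    For example, `denormalize_packed([(0, 0, 0x800000), (0, -127, 0x800000)])` leaves the first element (1.0) normal and encodes the second (2**-127) with a zero exponent field, flagging only the second.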

    Fused multiply add operations using bit masks
    43.
    Granted patent
    Legal status: in force

    Publication No.: US09542154B2

    Publication date: 2017-01-10

    Application No.: US13926175

    Filing date: 2013-06-25

    IPC classification: G06F7/483 G06F7/544 G06F7/76

    Abstract: Systems and methods of performing fused multiply-add (FMA) operations are provided. In one embodiment, the length of the adder used by the FMA operation is less than 3*N, where N is the number of bits in the mantissa term of a floating point number. A mask may be used to perform the addition portion of the FMA operation using the adder. A second mask may be used to denormalize the result of the addition portion of the FMA operation if an underflow occurs.
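
    As background for the abstract, the sketch below shows what distinguishes a fused multiply-add from a separate multiply and add: the fused form rounds once, after the exact product and sum. It is a reference model only, built on Python's exact `Fraction` arithmetic with double-precision inputs. The name `fma_reference` is an assumption for illustration, and the sketch does not model the adder width or the masks that the patent describes.

```python
from fractions import Fraction


def fma_reference(a, b, c):
    """Compute a * b + c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # single rounding step


def mul_then_add(a, b, c):
    """Unfused version: the product is rounded before the addition."""
    return a * b + c


# Inputs chosen so that the product 1 - 2**-60 rounds to 1.0 on its own.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

fused = fma_reference(a, b, c)
unfused = mul_then_add(a, b, c)
```

    With these inputs the unfused result loses the low-order term entirely, while the fused reference keeps it, which is the accuracy property an FMA unit has to preserve whatever adder width it uses.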

    GENERATING AND PERFORMING DEPENDENCY CONTROLLED FLOW COMPRISING MULTIPLE MICRO-OPERATIONS (uops)
    47.
    Patent application
    Legal status: under examination (published)

    Publication No.: US20090327657A1

    Publication date: 2009-12-31

    Application No.: US12146390

    Filing date: 2008-06-25

    IPC classification: G06F9/22

    Abstract: A processor performs out-of-order (OOO) processing in which a reservation station (RS) may generate and process a dependency-controlled flow comprising multiple micro-operations (uops) with a specific clock-based dispatch scheme. The RS may either combine two or more uops into a single RS entry or make a direct connection between two or more RS entries. The RS may allow more than two source values to be associated with a single RS by combining sources from the two or more uops. One or more execution units may be provisioned to perform the function defined by the uops. The execution units may receive more than two sources at a given point in time and produce two or more results on different ports.
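
    The idea of combining two dependent uops into one RS entry with more than two sources can be pictured with a toy Python model. Everything here is hypothetical: the `Uop` and `FusedEntry` classes, the register-dictionary representation, and the two supported operations are assumptions for illustration, and the clock-based dispatch scheme is not modeled.

```python
import operator
from dataclasses import dataclass

OPS = {"add": operator.add, "mul": operator.mul}


@dataclass
class Uop:
    op: str       # key into OPS
    dest: str     # destination register name
    srcs: tuple   # two source register names


@dataclass
class FusedEntry:
    """Toy RS entry holding two uops, the second consuming the first."""
    first: Uop
    second: Uop

    def sources(self):
        # External sources: everything the pair reads except the value
        # the first uop forwards to the second.
        ext = list(self.first.srcs)
        for s in self.second.srcs:
            if s != self.first.dest and s not in ext:
                ext.append(s)
        return ext

    def ready(self, regs):
        # The entry may dispatch once every external source is available.
        return all(s in regs for s in self.sources())

    def execute(self, regs):
        # Produce both results, as if written on two different ports.
        r1 = OPS[self.first.op](*(regs[s] for s in self.first.srcs))
        local = dict(regs)
        local[self.first.dest] = r1
        r2 = OPS[self.second.op](*(local[s] for s in self.second.srcs))
        return {self.first.dest: r1, self.second.dest: r2}


entry = FusedEntry(Uop("mul", "t", ("a", "b")), Uop("add", "d", ("t", "c")))
```

    Here the single entry tracks three sources (`a`, `b`, `c`) and yields two results (`t` and `d`), which mirrors the abstract's description at a very coarse level.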

    Performing reciprocal instructions with high accuracy
    48.
    Granted patent
    Legal status: in force

    Publication No.: US08706789B2

    Publication date: 2014-04-22

    Application No.: US12976359

    Filing date: 2010-12-22

    IPC classification: G06F7/38

    Abstract: In one embodiment, the present invention includes a method for receiving a reciprocal instruction and an operand in a processor, accessing an entry of a lookup table based on a portion of the operand and the instruction, generating an encoder output based on a type of the reciprocal instruction and whether the reciprocal instruction is a legacy instruction, and selecting portions of the lookup table entry and input operand to be provided to a reciprocal logic unit based on the encoder output. Other embodiments are described and claimed.
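
    A table-seeded reciprocal can be sketched in Python as follows. The table is indexed by the leading bits of the operand's mantissa, in the spirit of "a portion of the operand" in the abstract. The Newton-Raphson refinement step, the 7-bit index width, and all names are assumptions chosen for illustration; the abstract does not say how the reciprocal logic unit refines the table entry, and the encoder and legacy-instruction handling are not modeled.

```python
import math

INDEX_BITS = 7  # assumed table index width (128 entries)

# Each entry approximates 1/m at the midpoint of one slice of [1, 2).
TABLE = [
    1.0 / (1.0 + (i + 0.5) / (1 << INDEX_BITS))
    for i in range(1 << INDEX_BITS)
]


def approx_reciprocal(x, newton_steps=1):
    """Approximate 1/x from a table lookup plus Newton-Raphson steps.

    Zero, infinities and NaN are rejected; overflow for very small
    inputs is not handled in this sketch.
    """
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special values are out of scope for this sketch")
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e with m in [0.5, 1)
    m, e = m * 2.0, e - 1          # rescale so that m is in [1, 2)
    index = int((m - 1.0) * (1 << INDEX_BITS))  # leading mantissa bits
    y = TABLE[index]
    for _ in range(newton_steps):
        y = y * (2.0 - m * y)      # each step roughly squares the error
    return math.copysign(math.ldexp(y, -e), x)
```

    Raising `newton_steps` trades latency for accuracy: the table alone gives roughly 8 correct bits here, and each refinement step approximately doubles that.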