Encoding densely packed decimals
    1.
    Granted patent
    Status: In force

    Publication No.: US09128758B2

    Publication date: 2015-09-08

    Application No.: US13296273

    Filing date: 2011-11-15

    IPC classification: G06F7/491 G06F7/499

    Abstract: According to one aspect of the present disclosure, a method and technique for encoding densely packed decimals are disclosed. The method includes: executing a floating point instruction configured to perform a floating point operation on decimal data in a binary coded decimal (BCD) format; determining whether a result of the operation includes a rounded mantissa overflow; and, responsive to determining that the result of the operation includes a rounded mantissa overflow, compressing the result of the operation from the BCD-formatted decimal data to decimal data in a densely packed decimal (DPD) format by shifting select bit values of the BCD-formatted decimal data by one digit to select bit positions in the DPD format.
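
For background, the general BCD-to-DPD compression step can be sketched as follows. This is a minimal illustration of the standard declet encoding used by IEEE 754-2008 decimal formats (three decimal digits packed into 10 bits), not the patented method: it does not model the rounded-mantissa-overflow detection or the one-digit shift described in the abstract. The function name `bcd_to_dpd` and its digit-by-digit interface are assumptions made for this example.

```python
def bcd_to_dpd(d2: int, d1: int, d0: int) -> int:
    """Pack three decimal digits (d2 most significant) into one 10-bit DPD declet."""
    if not all(0 <= digit <= 9 for digit in (d2, d1, d0)):
        raise ValueError("each digit must be in 0..9")

    # High bit of each BCD digit: set only for the "large" digits 8 and 9.
    a, e, i = d2 >> 3, d1 >> 3, d0 >> 3
    # The lowest bit of every digit is always stored verbatim.
    d, h, m = d2 & 1, d1 & 1, d0 & 1
    # Middle two bits of the middle and low digits (zero for 8 and 9).
    fg, jk = (d1 & 7) >> 1, (d0 & 7) >> 1

    if (a, e, i) == (0, 0, 0):    # three small digits: plain 3-bit fields
        pqr, stu, v, wxy = d2, d1, 0, d0
    elif (a, e, i) == (0, 0, 1):  # only the low digit is large
        pqr, stu, v, wxy = d2, d1, 1, 0b000 | m
    elif (a, e, i) == (0, 1, 0):  # only the middle digit is large
        pqr, stu, v, wxy = d2, (jk << 1) | h, 1, 0b010 | m
    elif (a, e, i) == (1, 0, 0):  # only the high digit is large
        pqr, stu, v, wxy = (jk << 1) | d, d1, 1, 0b100 | m
    elif (a, e, i) == (1, 1, 0):  # high and middle digits are large
        pqr, stu, v, wxy = (jk << 1) | d, 0b000 | h, 1, 0b110 | m
    elif (a, e, i) == (1, 0, 1):  # high and low digits are large
        pqr, stu, v, wxy = (fg << 1) | d, 0b010 | h, 1, 0b110 | m
    elif (a, e, i) == (0, 1, 1):  # middle and low digits are large
        pqr, stu, v, wxy = d2, 0b100 | h, 1, 0b110 | m
    else:                         # all three digits are large
        pqr, stu, v, wxy = d, 0b110 | h, 1, 0b110 | m

    # Assemble the declet as pqr stu v wxy (bit 9 down to bit 0).
    return (pqr << 7) | (stu << 4) | (v << 3) | wxy
```

In this encoding, values 0 through 79 keep their BCD bit pattern, and the bits that move between fields depend on which digits are 8 or 9, which is the kind of selective bit movement the abstract refers to.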

SPLITABLE AND SCALABLE NORMALIZER FOR VECTOR DATA
    2.
    Patent application
    Status: In force

    Publication No.: US20150067298A1

    Publication date: 2015-03-05

    Application No.: US14016607

    Filing date: 2013-09-03

    IPC classification: G06F15/78

    Abstract: A hardware circuit component is configured to support vector operations in a scalar data path. The component operates in either a vector mode configuration or a scalar mode configuration, and splits the scalar mode configuration into a left half and a right half for the vector mode configuration. In the vector mode configuration, the component performs one or more bit shifts over one or more stages of interconnected multiplexers. The component includes duplicated coarse shift multiplexers at bit positions that receive data from both the left half and the right half of the vector mode configuration, resulting in one or more coarse shift multiplexers sharing the bit position.
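
As a rough behavioral illustration of a normalizer that works either on one wide scalar or on two independent halves, consider the sketch below. It models only the arithmetic effect of staged (coarse-to-fine) left shifts; it does not model the multiplexer network or the duplicated coarse shift multiplexers. The 64-bit width, the function names `normalize` and `normalize_64`, and the power-of-two stage sizes are assumptions made for this example.

```python
def normalize(value: int, width: int) -> tuple[int, int]:
    """Left-align `value` within `width` bits using power-of-two shift stages.

    `width` is assumed to be a power of two. Returns (normalized value,
    total shift). A zero input is returned unchanged with a shift of 0.
    """
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    if value == 0:
        return 0, 0
    shift = 0
    stage = width // 2            # coarsest stage first
    while stage >= 1:
        # A stage shifts only when the top `stage` bits are all zero.
        if value >> (width - stage) == 0:
            value <<= stage
            shift += stage
        stage //= 2
    return value, shift


def normalize_64(value: int, vector_mode: bool) -> tuple[int, tuple[int, ...]]:
    """Normalize one 64-bit scalar, or two independent 32-bit halves."""
    if not vector_mode:
        result, shift = normalize(value, 64)
        return result, (shift,)
    left, right = value >> 32, value & 0xFFFF_FFFF
    norm_left, shift_left = normalize(left, 32)
    norm_right, shift_right = normalize(right, 32)
    return (norm_left << 32) | norm_right, (shift_left, shift_right)
```

In vector mode each half gets its own shift amount, so bits never cross the boundary between the left and right halves.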

DYNAMIC HARDWARE TRACE SUPPORTING MULTIPHASE OPERATIONS
    3.
    Patent application
    Status: In force

    Publication No.: US20130339802A1

    Publication date: 2013-12-19

    Application No.: US13525054

    Filing date: 2012-06-15

    IPC classification: G06F11/34

    Abstract: A method and system for tracing in a data processing system. The method includes receiving a plurality of signals associated with an operation during execution of the operation. The method also includes, in response to an indication that the operation is a multiphase operation and during execution of the operation, selection logic selecting and outputting a first signal of the plurality of signals as a trace signal during a first phase of the multiphase operation, and selecting and outputting a second signal of the plurality of signals as the trace signal during a second phase of the multiphase operation.

RESIDUE-BASED EXPONENT FLOW CHECKING
    4.
    Patent application
    Status: Under examination (published)

    Publication No.: US20130339417A1

    Publication date: 2013-12-19

    Application No.: US13517839

    Filing date: 2012-06-14

    IPC classification: G06F7/38

    CPC classification: G06F7/72 G06F7/483

    Abstract: A technique for checking an exponent calculation for an execution unit that supports floating point operations includes generating, using a residue prediction circuit, a predicted exponent residue for a result exponent of a floating point operation. The technique also includes generating, using an exponent calculation circuit, the result exponent for the floating point operation and generating, using the residue prediction circuit, a result exponent residue for the result exponent. Finally, the technique includes comparing the predicted exponent residue to the result exponent residue to determine whether the result exponent generated by the exponent calculation circuit is correct and, if not, signaling an error.
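
The idea of predicting a residue and comparing it with the residue of the computed exponent can be illustrated with a small sketch. The abstract does not state the modulus, the operation, or the exponent format, so the choices here are assumptions: modulus 3, a floating point multiply, and a single-precision bias of 127. The function names are hypothetical, and the normalization and rounding adjustments a real exponent path would apply are left out.

```python
MODULUS = 3   # assumed check modulus; the abstract does not specify one
BIAS = 127    # assumed exponent bias (IEEE 754 single precision)


def residue(x: int) -> int:
    """Residue of x modulo the check modulus."""
    return x % MODULUS


def predicted_product_exponent_residue(exp_a: int, exp_b: int) -> int:
    """Predict the residue of exp_a + exp_b - BIAS from operand residues only."""
    return (residue(exp_a) + residue(exp_b) - residue(BIAS)) % MODULUS


def product_exponent_is_consistent(exp_a: int, exp_b: int, result_exp: int) -> bool:
    """Compare the predicted residue with the residue of the computed exponent."""
    return predicted_product_exponent_residue(exp_a, exp_b) == residue(result_exp)
```

With biased exponents 130 and 125, the correct product exponent 128 passes the check, while a corrupted value such as 129 is flagged. An error that changes the exponent by a multiple of the modulus would go undetected by a check of this kind.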

Decimal adder with end around carry
    5.
    Granted patent
    Status: Lapsed

    Publication No.: US08554822B2

    Publication date: 2013-10-08

    Application No.: US12822919

    Filing date: 2010-06-24

    IPC classification: G06F7/494

    CPC classification: G06F7/494 G06F7/508

    Abstract: Binary coded decimal (BCD) arithmetic add/subtract operations on two BCD numbers, independent of which BCD number is of greater magnitude, include, responsive to the BCD arithmetic add/subtract operation being a subtract operation, performing a BCD arithmetic subtraction operation on a first BCD number and a second BCD number, the first BCD number having a first magnitude and the second BCD number having a second magnitude. The first magnitude is greater than, equal to, or less than the second magnitude. The performing includes: in parallel to a carry generation, partial sums or partial differences of the first and second BCD numbers are computed such that a final result in signed magnitude form is selectable from the partial sums or differences based on carry information without any post-processing steps.
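
The arithmetic identity behind end-around-carry subtraction can be shown with a short sketch. It uses plain integers and the nines' complement, and it illustrates only why the carry-out tells which operand is larger; it does not model the parallel partial-sum selection described in the abstract. The function name `bcd_subtract` and the (sign, magnitude) return convention are assumptions made for this example.

```python
def bcd_subtract(a: int, b: int, digits: int) -> tuple[int, int]:
    """Return (sign, magnitude) of a - b for non-negative `digits`-digit operands.

    sign is +1 or -1; a zero result is reported with sign +1.
    """
    limit = 10 ** digits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operands must fit in the given number of digits")

    # Add the nines' complement of b instead of subtracting b.
    total = a + ((limit - 1) - b)
    carry_out, low = divmod(total, limit)

    if carry_out:
        # a > b: feed the carry back into the lowest digit (end-around carry).
        return +1, low + 1
    # a <= b: the magnitude is the nines' complement of the sum.
    magnitude = (limit - 1) - low
    return (+1 if magnitude == 0 else -1), magnitude
```

Neither branch needs to know in advance which operand is larger: the carry-out alone selects between the incremented sum and the complemented sum.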

REDUCING ISSUE-TO-ISSUE LATENCY BY REVERSING PROCESSING ORDER IN HALF-PUMPED SIMD EXECUTION UNITS
    6.
    Patent application
    Status: In force

    Publication No.: US20130159666A1

    Publication date: 2013-06-20

    Application No.: US13326249

    Filing date: 2011-12-14

    Abstract: Techniques for reducing issue-to-issue latency by reversing processing order in half-pumped single instruction multiple data (SIMD) execution units are described. In one embodiment, a processor functional unit is provided comprising a frontend unit, an execution core unit, a backend unit, an execution order control signal unit, a first interconnect coupled between an output and an input of the execution core unit, and a second interconnect coupled between an output of the backend unit and an input of the frontend unit. In operation, the execution order control signal unit generates a forwarding order control signal based on the parity of an applied clock signal on reception of a first vector instruction. This control signal is in turn used to selectively forward first and second portions of an execution result of the first vector instruction via the interconnects for use in the execution of a dependent second vector instruction.

Fast floating point compare with slower backup for corner cases
    7.
    Granted patent
    Status: In force

    Publication No.: US08407275B2

    Publication date: 2013-03-26

    Application No.: US12255968

    Filing date: 2008-10-22

    IPC classification: G06F7/02

    CPC classification: G06F9/30021 G06F9/30025

    Abstract: A floating point processor unit executes a floating point compare instruction with two operands of the same or different precision by comparing the two operands in integer format, which speeds up the execution of the floating point compare instruction significantly. The floating point processor executes the floating point compare instruction at least twice as fast (e.g., two clock cycles instead of five clock cycles in the prior art) for nearly all operand cases (e.g., 99% of all cases). Only the rare corner cases require additional operations on one of the operands, and thus additional cycles of execution time, because the integer compare operation will not work for these corner cases. These cases arise when one operand is a single precision (SP) subnormal number in an unnormalized representation (i.e., has two representations) and the other operand is in the SP subnormal range, such that the integer compare operation will fail.
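
The general trick of comparing floating point values through their integer bit patterns can be sketched as follows. This is a software illustration for IEEE 754 double precision values of the same format only; it does not cover the mixed-precision operands or the unnormalized single-precision subnormal corner cases that the abstract routes to a slower path. The function names `ordered_key` and `compare` are assumptions made for this example.

```python
import math
import struct

_MAGNITUDE_MASK = 0x7FFF_FFFF_FFFF_FFFF  # all bits except the sign bit


def ordered_key(x: float) -> int:
    """Map a non-NaN double to an integer whose ordering matches the float's."""
    (bits,) = struct.unpack("<q", struct.pack("<d", x))  # signed 64-bit view
    # Non-negative floats already sort correctly as integers. For negative
    # floats a larger magnitude means a smaller value, so negate the
    # magnitude. Both +0.0 and -0.0 map to 0.
    return bits if bits >= 0 else -(bits & _MAGNITUDE_MASK)


def compare(a: float, b: float):
    """Return -1, 0, or 1, or None when the operands are unordered (NaN)."""
    if math.isnan(a) or math.isnan(b):
        return None
    key_a, key_b = ordered_key(a), ordered_key(b)
    return (key_a > key_b) - (key_a < key_b)
```

Because the exponent field sits above the fraction field, a larger magnitude always yields a larger magnitude bit pattern, which is why a single integer comparison suffices once the sign is handled.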

METHODS FOR CONFLICT-FREE, COOPERATIVE EXECUTION OF COMPUTATIONAL PRIMITIVES ON MULTIPLE EXECUTION UNITS
    8.
    Patent application
    Status: Lapsed

    Publication No.: US20090198974A1

    Publication date: 2009-08-06

    Application No.: US12023432

    Filing date: 2008-01-31

    IPC classification: G06F9/44 G06F17/11

    Abstract: A method for executing multiple computational primitives is provided in accordance with exemplary embodiments. A first computational unit and at least a second computational unit cooperate to execute multiple computational primitives. The first computational unit independently computes other computational primitives. By virtue of arbitration for shared source operand buses or shared result buses, availability of the first and second computational units needed to cooperatively execute the multiple computational primitives is assured by a process of reservation as used for a computational primitive executed on a dedicated computational unit.

Zero indication forwarding for floating point unit power reduction
    10.
    Granted patent
    Status: Lapsed

    Publication No.: US08578196B2

    Publication date: 2013-11-05

    Application No.: US13552327

    Filing date: 2012-07-18

    IPC classification: G06F1/00

    Abstract: A method and system for reducing power consumption when processing mathematical operations. Power may be reduced in processor hardware devices that receive one or more operands from an execution unit that executes instructions. A circuit detects when at least one operand of multiple operands is a zero operand, prior to the operand being forwarded to an execution component for completing a mathematical operation. When at least one operand is a zero operand or at least one operand is "unordered", a flag is set that triggers a gating of a clock signal. The gating of the clock signal disables one or more processing stages and/or devices that perform the mathematical operation. Disabling the stages and/or devices enables computing the correct result of the mathematical operation on a reduced data path. When a device is disabled, the device may be powered off until it is again required by subsequent operations.
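
A software analogy for the zero-detection bypass is sketched below. It models a multiply in which a detected zero or NaN ("unordered") operand skips the full multiplier, and it counts how often each path is taken. Clock gating and power-down cannot be expressed in software, so the counters only stand in for them. The function name `fp_multiply`, the `stats` dictionary, and the choice of multiplication as the operation are assumptions made for this example.

```python
import math


def fp_multiply(a: float, b: float, stats: dict) -> float:
    """Multiply two floats, bypassing the full multiplier on zero or NaN operands.

    `stats` must contain integer counters under the keys "gated" and "full".
    """
    if math.isnan(a) or math.isnan(b):
        # Unordered operand: the result is NaN without any multiplication.
        stats["gated"] += 1
        return math.nan
    if (a == 0.0 or b == 0.0) and not (math.isinf(a) or math.isinf(b)):
        # Zero times a finite value: only the sign of the result is needed.
        stats["gated"] += 1
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.copysign(0.0, sign)
    # General case, including zero times infinity (which yields NaN).
    stats["full"] += 1
    return a * b
```

A caller would pass a dictionary such as `{"gated": 0, "full": 0}` and read the counters afterwards to see how many operations avoided the full data path.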
