Result normalizer and method of operation
    1.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US5392228A

    Publication date: 1995-02-21

    Application No.: US161361

    Filing date: 1993-12-06

    CPC classification: G06F7/485 G06F5/012

    Abstract: A result normalizer (58) for use with an adder (56) generates a mask in two stages that indicates the location of the leading one in the adder result. In the first stage, a leading zero anticipator (68) determines the position to within two digits. In the second stage, a count leading zero indicator (70) determines the position to a single digit. The mask is used to control the number of digits by which each stage of a multiplexer array (66) shifts the adder result. The output of the multiplexer array thereby contains a leading one. The result normalizer may be advantageously used in high performance applications such as in a floating point execution unit in a data processor or in digital signal processing systems.
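
The two-stage search described in this abstract can be illustrated with a small software sketch. It is only a behavioral model under stated assumptions: the function name `normalize`, the 16-bit default width, and the choice to scan the finished adder result (rather than anticipating the leading zeros from the adder operands, as a hardware leading zero anticipator would) are all hypothetical and not taken from the patent.

```python
def normalize(result: int, width: int = 16) -> tuple[int, int]:
    """Shift `result` left until its leading one sits in the top bit.

    Returns (normalized_value, shift_amount). This sketch assumes that
    `width` is even and that `result` fits in `width` bits.
    """
    if result == 0:
        return 0, width  # no leading one to find

    # Stage 1 (coarse): scan 2-bit groups from the top and stop at the first
    # non-zero group. This locates the leading one to within two digits.
    for pair in range(width // 2):
        high_bit = width - 1 - 2 * pair
        group = (result >> (high_bit - 1)) & 0b11
        if group:
            break

    # Stage 2 (fine): decide which bit of the group is the leading one.
    shift = 2 * pair + (0 if group & 0b10 else 1)

    # Apply the shift and keep only `width` bits.
    normalized = (result << shift) & ((1 << width) - 1)
    return normalized, shift
```

For example, `normalize(0x0013)` shifts the value left by 11 positions so that the top bit of the 16-bit result is set.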

    Floating-point processor having post-writeback spill stage
    2.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US5583805A

    Publication date: 1996-12-10

    Application No.: US352661

    Filing date: 1994-12-09

    IPC classification: G06F7/57 G06F7/38

    CPC classification: G06F7/483 G06F7/49915

    Abstract: An apparatus for handling special cases outside of normal floating-point arithmetic functions is provided for use in a floating-point unit that calculates arithmetic functions. The floating-point unit generates an exponent portion and a mantissa portion, and a writeback stage is coupled to the exponent portion and to the mantissa portion and is specifically used to handle the special cases outside the normal floating-point arithmetic functions. A spill stage is also provided and is coupled to the writeback stage to receive a resultant exponent and mantissa. A register file unit is coupled to the writeback stage and the spill stage through a plurality of rename busses, which are used to carry results between the writeback stage and spill stage and the register file. The spill stage is serially coupled to the writeback stage so as to provide a smooth operation in the transition of operating on the results from the writeback stage for the exponent and mantissa. Each rename bus has a pair of tri-state buffers, one used to couple the rename bus to the writeback stage and the other used to couple the rename bus to the spill stage. An instruction dispatcher also provides location information for directing the results from the writeback stage and the spill stage before the result is completed.

    Method and system for high speed floating point exception enabled operation in a multiscalar processor system
    3.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US5410657A

    Publication date: 1995-04-25

    Application No.: US959193

    Filing date: 1992-10-09

    Abstract: A method and system are disclosed for implementing floating point exception enabled operation without substantial performance degradation. In a multiscalar processor system, multiple instructions may be issued and executed simultaneously utilizing multiple independent functional units. This is typically accomplished utilizing separate branch, fixed point and floating point processor units. Floating point arithmetic instructions within the floating point processor unit may initiate one of a variety of exceptions associated with invalid operations, and as a result of the pipelined nature of floating point processor units, an identification of which instruction initiated the exception is not possible. In the described method and system, an associated dummy instruction having a retained instruction address is dispatched to the fixed point processor unit each time a floating point arithmetic instruction is dispatched to the floating point processor unit. Thereafter, the output of each instruction from the floating point processor unit is synchronized with an output of an associated dummy instruction, wherein each instruction within the floating point processor unit which initiates a floating point exception may be accurately identified utilizing the retained instruction address of the associated dummy instruction.
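
The pairing of each floating point instruction with an address-carrying dummy instruction can be modeled with two queues that are filled and drained in lockstep. The sketch below is a toy software model, not the patented hardware: the class name `ExceptionTracker`, its methods, and the use of Python callables to stand in for floating point instructions are all assumptions made for illustration.

```python
from collections import deque


class ExceptionTracker:
    """Toy model: one dummy entry per dispatched floating point instruction."""

    def __init__(self):
        self.fp_pipeline = deque()  # operations in flight in the floating point unit
        self.dummy_queue = deque()  # retained addresses held in the fixed point unit

    def dispatch(self, address, operation):
        # Every floating point dispatch also dispatches a dummy entry that
        # keeps the instruction address.
        self.fp_pipeline.append(operation)
        self.dummy_queue.append(address)

    def retire(self):
        # Outputs are synchronized: both queues are popped together, so the
        # address always belongs to the operation that is completing.
        operation = self.fp_pipeline.popleft()
        address = self.dummy_queue.popleft()
        try:
            return address, operation(), None
        except ArithmeticError as exc:
            # The retained address identifies the faulting instruction.
            return address, None, exc


tracker = ExceptionTracker()
tracker.dispatch(0x100, lambda: 1.0 / 2.0)
tracker.dispatch(0x104, lambda: 1.0 / 0.0)  # will raise ZeroDivisionError
first = tracker.retire()
second = tracker.retire()
```

Here `second` carries the address `0x104` together with the exception, which is the identification the abstract says is otherwise lost in a pipelined unit.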

    Efficient floating point overflow and underflow detection system
    4.
    Type: granted invention patent
    Status: lapsed

    Publication No.: US5553015A

    Publication date: 1996-09-03

    Application No.: US228480

    Filing date: 1994-04-15

    CPC classification: G06F7/483 G06F7/4991

    Abstract: A processing system that determines whether an underflow or overflow condition has occurred concurrently with the determination of the floating point exponent result uses a group of latched constants which can be added to the intermediate exponent and the exponent adjust to determine out of range conditions for all cases. The appropriate one of these latched constants (exponent range check values; exp_range_chk) is added to exp_int and exp_adjust to give a value that will vary based on whether exp_result is out of range or not. Different exp_range_chk values are used for underflow single precision, underflow double precision, overflow single precision and overflow double precision. The sum of these three values (exp_int, exp_adj, exp_range_chk) will yield a binary number having a most significant bit (MSB) that is dependent upon the exp_result value. More particularly, the MSB will be a logical 1 when an out of range condition has occurred and a logical 0 for normal in-range exponent results.
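
The MSB test in this abstract can be sketched in a few lines. The abstract does not give the constant values, so everything numeric below is an assumption: a 13-bit adder, biased exponent limits of 1..254 (single) and 1..2046 (double), and the helper names `overflow_const` and `underflow_const`. The sketch also assumes the true exponent result stays well inside the 13-bit range so the sum does not wrap twice.

```python
WIDTH = 13                      # assumed adder width
MASK = (1 << WIDTH) - 1
MSB = 1 << (WIDTH - 1)


def overflow_const(max_exp: int) -> int:
    # Chosen so that result + const reaches the MSB only when result > max_exp.
    return (MSB - 1 - max_exp) & MASK


def underflow_const(min_exp: int) -> int:
    # Two's complement of min_exp: result - min_exp is negative (MSB set)
    # only when result < min_exp.
    return (-min_exp) & MASK


# Hypothetical latched constants, one per (condition, precision) pair.
EXP_RANGE_CHK = {
    ("overflow", "single"): overflow_const(254),
    ("overflow", "double"): overflow_const(2046),
    ("underflow", "single"): underflow_const(1),
    ("underflow", "double"): underflow_const(1),
}


def out_of_range(exp_int: int, exp_adj: int, kind: str, precision: str) -> bool:
    """Return True when the MSB of the three-input sum is a logical 1."""
    total = (exp_int + exp_adj + EXP_RANGE_CHK[(kind, precision)]) & MASK
    return bool(total & MSB)
```

Because the check is a single three-input addition, it can run alongside the addition that produces the exponent result itself, which is the concurrency the abstract describes.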

    Accessing a multibank register file using a thread identifier
    5.
    Type: granted invention patent
    Status: in force

    Publication No.: US08458446B2

    Publication date: 2013-06-04

    Application No.: US12570682

    Filing date: 2009-09-30

    IPC classification: G06F9/30

    Abstract: A processor includes an instruction fetch unit configured to issue instructions for execution, where the instructions are selected from a number of threads, where each given instruction has a corresponding thread identifier, and where at least some of the instructions specify operand(s) via register identifiers. A register file stores operands usable by the instructions, and may include several banks, each corresponding to a register identifier and including several entries corresponding to the several threads, wherein the entries are configured to store data values. In response to receiving a request to read a particular register identifier for a given thread identifier, the register file may be configured to decode the given thread identifier to retrieve entries from the banks that correspond to the given thread identifier. The register file may further select, from among the retrieved entries, a data value corresponding to the particular register identifier to be output.
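
The read path in this abstract (thread identifier first, register identifier second) can be shown with a small data-structure sketch. The class name `MultibankRegisterFile` and its methods are hypothetical, and a nested Python list stands in for the hardware banks.

```python
class MultibankRegisterFile:
    """One bank per register identifier, one entry per thread in each bank."""

    def __init__(self, num_registers: int, num_threads: int):
        # banks[register_id][thread_id] holds a data value.
        self.banks = [[0] * num_threads for _ in range(num_registers)]

    def write(self, thread_id: int, register_id: int, value: int) -> None:
        self.banks[register_id][thread_id] = value

    def read(self, thread_id: int, register_id: int) -> int:
        # Step 1: the decoded thread identifier retrieves one entry
        # from every bank.
        retrieved = [bank[thread_id] for bank in self.banks]
        # Step 2: the register identifier selects which retrieved entry
        # is driven to the output.
        return retrieved[register_id]


regs = MultibankRegisterFile(num_registers=8, num_threads=4)
regs.write(thread_id=1, register_id=3, value=42)
```

Reading register 3 for thread 1 returns the stored value, while the same register for thread 0 still holds its initial value, since each thread has its own entry in the bank.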

    Processor which implements fused and unfused multiply-add instructions in a pipelined manner
    6.
    Type: granted invention patent
    Status: in force

    Publication No.: US08239440B2

    Publication date: 2012-08-07

    Application No.: US12057894

    Filing date: 2008-03-28

    IPC classification: G06F7/38

    Abstract: Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
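
The numerical difference between the two operations is that an unfused multiply-add rounds twice (after the multiply and after the add), while a fused multiply-add rounds once. The sketch below shows only that arithmetic distinction, not the pipeline in the abstract; the function names are hypothetical, and exact rational arithmetic from Python's `fractions` module stands in for the untruncated partial products.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    product = a * b        # first rounding, to double precision
    return product + c     # second rounding


def fused_multiply_add(a: float, b: float, c: float) -> float:
    # Compute a*b + c exactly, then round once when converting back to float.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
unfused = unfused_multiply_add(a, b, c)
fused = fused_multiply_add(a, b, c)
```

With these inputs the exact product is 1 - 2^-60, which rounds to 1.0 in double precision, so the unfused result is 0.0 while the fused result keeps the small term -2^-60.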

    EXECUTION UNIT FOR PERFORMING THE DATA ENCRYPTION STANDARD
    7.
    Type: invention patent application
    Status: in force

    Publication No.: US20120087492A1

    Publication date: 2012-04-12

    Application No.: US13291026

    Filing date: 2011-11-07

    IPC classification: H04L9/00

    CPC classification: H04L9/0625 H04L2209/12

    Abstract: Described is an execution unit for performing at least part of the Data Encryption Standard that includes a Left Half input; a Key input; and a Table input, as well as a first group of transistors configured to receive the Table input, perform a table look-up, and output data. The execution unit further includes a first exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the Key input. The execution unit also includes a second exclusive-or operator having two inputs and an output that is configured to receive the data output by the first group of transistors and to receive the output of the first exclusive-or operator. The execution unit also includes a third exclusive-or operator having two inputs and an output that is configured to receive the Left Half input and the data output by the first group of transistors.

    Register error correction of speculative data in an out-of-order processor
    8.
    Type: granted invention patent
    Status: in force

    Publication No.: US08078942B2

    Publication date: 2011-12-13

    Application No.: US11849749

    Filing date: 2007-09-04

    IPC classification: G11C29/00 H03M13/00

    CPC classification: G06F11/10

    Abstract: In one embodiment, a processor comprises a first register file configured to store speculative register state, a second register file configured to store committed register state, a check circuit and a control unit. The first register file is protected by a first error protection scheme and the second register file is protected by a second error protection scheme. A check circuit is coupled to receive a value and corresponding one or more check bits read from the first register file to be committed to the second register file in response to the processor selecting a first instruction to be committed. The check circuit is configured to detect an error in the value responsive to the value and the check bits. Coupled to the check circuit, the control unit is configured to cause reexecution of the first instruction responsive to the error detected by the check circuit.
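
The check-on-commit flow can be sketched as follows. The abstract does not say which error protection scheme guards the speculative register file, so the single parity bit used here is an assumption, as are the names `protect`, `commit`, and the `reexecute` callback that stands in for the control unit replaying the instruction.

```python
def parity(value: int) -> int:
    """Even-parity check bit: 1 when `value` has an odd number of set bits."""
    return bin(value).count("1") & 1


def protect(value: int) -> tuple[int, int]:
    """Pair a value with its check bit, as stored in the speculative file."""
    return value, parity(value)


def commit(speculative: dict, committed: dict, reg: str, reexecute) -> bool:
    """Move `reg` from speculative to committed state.

    Returns True when an error was detected and the instruction was replayed.
    """
    value, check_bit = speculative[reg]
    error = parity(value) != check_bit   # check circuit
    if error:
        value = reexecute()              # control unit: reexecute the instruction
        speculative[reg] = protect(value)
    committed[reg] = value
    return error


speculative = {"r1": protect(0b1011)}
committed = {}
# Simulate a bit flip in the stored value while the check bit stays unchanged.
speculative["r1"] = (0b1010, speculative["r1"][1])
replayed = commit(speculative, committed, "r1", reexecute=lambda: 0b1011)
```

A single parity bit only detects an odd number of flipped bits, which is why recovery here relies on reexecution rather than on correcting the stored value.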

    PROCESSOR AND METHOD PROVIDING INSTRUCTION SUPPORT FOR INSTRUCTIONS THAT UTILIZE MULTIPLE REGISTER WINDOWS
    9.
    Type: invention patent application
    Status: in force

    Publication No.: US20110296142A1

    Publication date: 2011-12-01

    Application No.: US12790074

    Filing date: 2010-05-28

    IPC classification: G06F9/30 G06F9/315 G06F9/312

    Abstract: A processor including instruction support for large-operand instructions that use multiple register windows may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may also include an instruction execution unit that, during operation, receives instructions for execution from the instruction fetch unit and executes a large-operand instruction defined within the ISA, where execution of the large-operand instruction is dependent upon a plurality of registers arranged within a plurality of register windows. The processor may further include control circuitry (which may be included within the fetch unit, the execution unit, or elsewhere within the processor) that determines whether one or more of the register windows depended upon by the large-operand instruction are not present. In response to determining that one or more of these register windows are not present, the control circuitry causes them to be restored.

    THREAD FAIRNESS ON A MULTI-THREADED PROCESSOR WITH MULTI-CYCLE CRYPTOGRAPHIC OPERATIONS
    10.
    Type: invention patent application
    Status: in force

    Publication No.: US20110276783A1

    Publication date: 2011-11-10

    Application No.: US12773278

    Filing date: 2010-05-04

    IPC classification: G06F9/38

    Abstract: Systems and methods for efficient execution of operations in a multi-threaded processor. Each thread may include a blocking instruction. A blocking instruction blocks other threads from utilizing hardware resources for an appreciable amount of time. One example of a blocking type instruction is a Montgomery multiplication cryptographic instruction. Each thread can operate in a thread-based mode that allows the insertion of stall cycles during the execution of blocking instructions, during which other threads may utilize the previously blocked hardware resources. At times when multiple threads are scheduled to execute blocking instructions, the thread-based mode may be changed to increase throughput for these multiple threads. For example, the mode may be changed to disallow the insertion of stall cycles. Therefore, the time for sequential operation of the blocking instructions corresponding to the multiple threads may be reduced.
