Method and system for fast determination of sticky and guard bits
    1.
    Granted invention patent
    Method and system for fast determination of sticky and guard bits (status: lapsed)

    Publication number: US5805487A

    Publication date: 1998-09-08

    Application number: US677843

    Filing date: 1996-07-12

    Abstract: A method and system for fast calculation of the sticky bit and a function of the guard bit is disclosed. A first aspect of the method and system provides a fast calculation of the sticky bit. A second aspect provides a fast calculation of a function of the guard bit. Both aspects comprise means for providing an intermediate result of a floating point mathematical operation involving at least a first and a second operand and means for providing a mask indicating a position of a leading one in a mantissa of the intermediate result. In the first aspect, means for aligning a first bit of the mask to an (n+2)nd bit of the intermediate result, where n is the number of bits in a mantissa of the first or second operand, are coupled to the intermediate result providing means. In the second aspect, means for aligning a first bit of the mask to an (n+1)st bit of the intermediate result are coupled to the intermediate result providing means. In both aspects, means for providing an output are coupled to the aligning means and intermediate result providing means. The output of the first aspect comprises the sticky bit. The output of the second aspect comprises a function of the guard bit. Thus, the method and system allow the sticky bit and a function of the guard bit to be calculated substantially simultaneously with normalization. Because the method and system allow fast determination of the sticky bit and a function of the guard bit, the overall speed of the calculation is increased and system performance is improved.
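
As a rough illustration of the idea in this abstract, the Python sketch below reads the guard and sticky bits straight from the un-normalized intermediate result by masking relative to the leading one, instead of normalizing first. The function name `guard_and_sticky`, the integer representation, and the LSB-based bit indexing are assumptions made for this example; the patent describes hardware and outputs a function of the guard bit, not necessarily the raw bit returned here.

```python
def guard_and_sticky(result: int, n: int) -> tuple[int, int]:
    """Return (guard, sticky) for a positive intermediate result.

    n is the number of mantissa bits kept after rounding. Counting from
    the leading one, bits 1..n are the mantissa, bit n+1 is the guard
    bit, and bits n+2 onward are ORed together to form the sticky bit.
    """
    if result <= 0:
        raise ValueError("result must be a positive integer")
    lead = result.bit_length() - 1      # index of the leading one (LSB = 0)
    guard_index = lead - n              # position of the (n+1)st bit
    if guard_index < 0:
        return 0, 0                     # result already fits in n bits
    guard = (result >> guard_index) & 1
    sticky_mask = (1 << guard_index) - 1    # covers the (n+2)nd bit and below
    sticky = int((result & sticky_mask) != 0)
    return guard, sticky
```

For the 8-bit value `0b10110110` with `n = 4`, the kept mantissa is `1011`, the guard bit is the next bit down, and the sticky bit summarizes the remaining three bits. No shift of the full result is needed to obtain either bit, which is the property the abstract relies on to overlap this work with normalization.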

    Method and system for performing a high speed floating point add operation
    2.
    Granted invention patent
    Method and system for performing a high speed floating point add operation (status: lapsed)

    Publication number: US5790445A

    Publication date: 1998-08-04

    Application number: US641307

    Filing date: 1996-04-30

    CPC classification: G06F7/485 G06F5/012

    Abstract: A system and method for calculating a floating point add/subtract of a plurality of floating point operands is disclosed. The system comprises at least one pair of data paths. Each pair of data paths comprises a first data path and a second data path. The first data path includes a first aligner, a first adder coupled to the first aligner, and a first normalizer coupled to the first adder. The first normalizer is capable of shifting a mantissa by a substantially smaller number of digits than the first aligner. The second data path comprises control logic, a second aligner coupled to the control logic, a second adder coupled to the second aligner, and a second normalizer coupled to the second adder. The control logic provides a control signal that is responsive to a first predetermined number of digits of each exponent of a pair of exponents. The pair of exponents are the exponents for a pair of inputs to the second data path. The second aligner is responsive to the control signal provided by the control logic. In addition, the second normalizer is capable of shifting a mantissa by a substantially larger number of digits than the second aligner.
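
The split into two data paths can be motivated with a small experiment. The sketch below is not the patented circuit: it uses exact Python integers, and the labels "far" and "near" are borrowed from common floating point adder terminology, not from the abstract. It reports how far the aligner and the normalizer would each have to shift for a given pair of operands. The function name `add_paths` and the operand encoding (an `n`-bit mantissa with its leading bit set, plus an integer exponent) are assumptions.

```python
def add_paths(m_a: int, e_a: int, m_b: int, e_b: int,
              subtract: bool, n: int) -> tuple[str, int, int]:
    """Return (path, align_shift, normalize_shift) for m_a*2**e_a +/- m_b*2**e_b.

    normalize_shift > 0 means a left shift is needed after the add,
    normalize_shift < 0 means a right shift (carry out).
    """
    if e_a < e_b:                               # make operand a the larger exponent
        m_a, e_a, m_b, e_b = m_b, e_b, m_a, e_a
    d = e_a - e_b                               # alignment distance
    wide = m_a << d                             # align exactly, no bits dropped
    total = abs(wide - m_b) if subtract else wide + m_b
    # Massive cancellation is only possible for subtraction with d <= 1.
    path = "near" if subtract and d <= 1 else "far"
    normalize_shift = (n + d) - total.bit_length()
    return path, d, normalize_shift
```

With `n = 8`, subtracting `255 * 2**-1` from `128 * 2**0` gives an alignment shift of 1 but a normalization shift of 8, while subtracting `255 * 2**0` from `128 * 2**5` gives an alignment shift of 5 and a normalization shift of only 1. This matches the abstract's description of one path with a large aligner and small normalizer and another with the opposite balance.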

    Floating point split multiply/add system which has infinite precision
    3.
    Granted invention patent
    Floating point split multiply/add system which has infinite precision (status: lapsed)

    Publication number: US5880983A

    Publication date: 1999-03-09

    Application number: US620733

    Filing date: 1996-03-25

    IPC classification: G06F7/544 G06F7/38

    Abstract: A method and system for an infinite precision split multiply and add operation which has increased speed. The method and system for providing a split multiply and add of a plurality of operands include a multiplier and an adder means. The multiplier multiplies a first portion of the plurality of operands, thereby providing a product. The adder, which combines the remaining operands and the product, comprises at least one pair of data paths. Each pair of data paths comprises a first data path and a second data path. The first data path comprises a first aligner, a first adder, and a first normalizer capable of shifting a mantissa by a substantially smaller number of digits than the aligner. The second data path comprises a second aligner, a second adder, and a second normalizer capable of shifting a mantissa by a substantially larger number of digits than the aligner. Accordingly, the present invention includes split multiply and add data paths which, individually, are faster than a fused multiply and add. In addition, the split multiply and add data paths can preserve the appearance of infinite precision. Consequently, overall system performance is increased.
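
The phrase "appearance of infinite precision" can be made concrete with a small numeric experiment. The sketch below is an interpretation, not the patent's design: it contrasts a split multiply-add that rounds the product to double precision before the add with one that hands the unrounded product to the adder and rounds only once. The function names are hypothetical, and `fractions.Fraction` stands in for a full-width hardware product.

```python
from fractions import Fraction


def split_rounded_product(a: float, b: float, c: float) -> float:
    """Multiply, round the product to a double, then add (two roundings)."""
    return a * b + c


def split_exact_product(a: float, b: float, c: float) -> float:
    """Multiply exactly, add exactly, then round once at the end."""
    product = Fraction(a) * Fraction(b)     # exact, unrounded product
    return float(product + Fraction(c))     # single final rounding


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
lost = split_rounded_product(a, b, c)       # low-order product bits are rounded away
kept = split_exact_product(a, b, c)         # low-order product bits survive
```

Here the exact product is `1 - 2**-60`, which rounds to `1.0` as a double, so the first variant returns `0.0` while the second returns `-2**-60`. A split design has to carry enough of the product into the adder to behave like the second variant.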

    Method and system for high performance dynamic and user programmable cache arbitration
    5.
    Granted invention patent
    Method and system for high performance dynamic and user programmable cache arbitration (status: lapsed)

    Publication number: US5822758A

    Publication date: 1998-10-13

    Application number: US709793

    Filing date: 1996-09-09

    IPC classification: G06F12/08 G06F13/18 G06F12/00

    CPC classification: G06F12/0897 G06F13/18

    Abstract: A system and method for improving arbitration of a plurality of events that may require access to a cache is disclosed. In a first aspect, the method and system provide dynamic arbitration. The first aspect comprises first logic for determining whether at least one of the plurality of events requires access to the cache and for outputting at least one signal in response thereto. Second logic coupled to the first logic determines the priority of each of the plurality of events in response to the at least one signal and outputs a second signal specifying the priority of each event. Third logic coupled to the second logic grants access to the cache in response to the second signal. A second aspect of the method and system provides user programmable arbitration. The second aspect comprises a storage unit which allows the user to input information indicating the priority of at least one of the plurality of events and outputs a first signal in response to the information. In the second aspect, first logic coupled to the storage unit determines the priority of each of the plurality of events in response to the first signal and outputs a second signal indicating the priority of each event. Second logic coupled to the first logic grants access to the cache in response to the second signal.
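
The user programmable aspect can be pictured as a priority table consulted by a grant function. The following sketch is a software analogy under stated assumptions: the event names (`load`, `store`, `snoop`), the function name `arbitrate`, and the convention that a lower number means higher priority are all invented for illustration and do not come from the patent.

```python
from typing import Dict, Optional


def arbitrate(requests: Dict[str, bool], priority: Dict[str, int]) -> Optional[str]:
    """Grant the cache to the pending event with the best (lowest) priority value.

    requests maps each event name to whether it currently needs the cache.
    priority plays the role of the user-writable storage unit.
    """
    pending = [name for name, wants in requests.items() if wants]
    if not pending:
        return None                         # nothing needs the cache this cycle
    return min(pending, key=lambda name: priority[name])
```

Rewriting the `priority` table changes which of two simultaneous requests wins without touching the grant logic, which is the separation between storage unit and arbitration logic that the second aspect describes.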

    Fast alignment unit for multiply-add floating point unit
    6.
    Granted invention patent
    Fast alignment unit for multiply-add floating point unit (status: lapsed)

    Publication number: US5790444A

    Publication date: 1998-08-04

    Application number: US727331

    Filing date: 1996-10-08

    Abstract: A floating point arithmetic unit performs a multiply-add function B+(A*C) in which an alignment shifter is responsive to an input signal representative of the B mantissa. The shifter includes a sequential stack of multiplexers, typically three (3), for shifting the B mantissa to align it with the A*C product, and a complementer contained between two of the multiplexers to invert the signals when B is a negative number. A shift amount generator responsive to the A, B and C exponents produces control signals for the multiplexers. The shift amount generator includes a multiple input adder utilizing carry save adder and carry lookahead adder techniques to minimize delay, and separate decoders for each multiplexer or group of multiplexers. The generator also includes a Leading Zeros Anticipator (LZA) circuit for the most significant bits to limit shift amount signals that are within the shifting range of the shifter, which reduces the delay attributed to the carry lookahead adder. The multiplexers are arranged in a sequence such that the control signals for the first multiplexers are dependent only on the least significant bits and thus can be generated earliest, and therefore the delay of these multiplexers and the delay of the complementer is in parallel with the delay for producing the control signals to the last multiplexers.
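
The stacked-multiplexer arrangement can be modeled in software as a shift applied in stages, each controlled by a different slice of the shift amount. The sketch below is only a behavioral model: the three-stage split into 2-bit fields, the 6-bit shift range, and the name `staged_right_shift` are assumptions, and the complementer and LZA logic are omitted.

```python
def staged_right_shift(mantissa: int, amount: int) -> int:
    """Shift right by `amount` using three stages, least significant field first.

    Stage 1 uses amount bits 1..0 (shift by 0-3), stage 2 uses bits 3..2
    (shift by 0, 4, 8 or 12), stage 3 uses bits 5..4 (shift by 0, 16, 32 or 48).
    """
    if not 0 <= amount < 64:
        raise ValueError("amount must fit in 6 bits")
    stage1 = mantissa >> (amount & 0b11)                  # needs only the low bits
    stage2 = stage1 >> (((amount >> 2) & 0b11) * 4)
    stage3 = stage2 >> (((amount >> 4) & 0b11) * 16)      # needs the high bits last
    return stage3
```

Because the first stage depends only on the two lowest bits of the shift amount, and those bits settle first in a carry-propagating adder, the early stages can start working while the upper bits are still being computed. That ordering is the timing argument made at the end of the abstract.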

    Updating condition status register based on instruction specific modification information in set/clear pair upon instruction commit in out-of-order processor
    7.
    Granted invention patent
    Updating condition status register based on instruction specific modification information in set/clear pair upon instruction commit in out-of-order processor (status: lapsed)

    Publication number: US06484251B1

    Publication date: 2002-11-19

    Application number: US09417824

    Filing date: 1999-10-14

    IPC classification: G06F9/38

    Abstract: A processor including a register, an execution unit, a temporary result buffer, and a commit function circuit. The register includes at least one register bit and may include one or more sticky bits. The execution unit is suitable for executing a set of computer instructions. The temporary result buffer is configured to receive, from the execution unit, register bit modification information provided by the instructions. The temporary result buffer is suitable for storing the modification information in set/clear pairs of bits corresponding to respective register bits of the register. The commit function circuit is configured to receive the set/clear pairs of bits from the temporary result buffer when the instruction is committed. The commit function circuit is suitable for generating an updated bit in response to receiving the set/clear pairs of bits. The updated bit is then committed to the corresponding register bit of the register.
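
One way to read the set/clear pair scheme is that each instruction records, per register bit, whether it wants the bit set, cleared, or left alone, and the commit step folds that record into the architected register. The sketch below models this with two bitmasks. The function name `commit_bits` is hypothetical, and the rule that "set" wins when both bits of a pair are asserted is an assumption; the abstract does not say how that case is handled.

```python
def commit_bits(register: int, set_mask: int, clear_mask: int) -> int:
    """Apply one instruction's set/clear pairs to the architected register.

    A bit with neither set nor clear asserted keeps its old value, which
    is how a sticky bit survives instructions that do not touch it.
    Assumption: when both are asserted for the same bit, set wins.
    """
    return (register & ~clear_mask) | set_mask
```

Committing instructions in program order with this function yields the same final register value regardless of the order in which the instructions executed, since each instruction's contribution is stored as a modification and not as an absolute value.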

    Method and system for processing multiple branch instructions that write to count and link registers
    8.
    Granted invention patent
    Method and system for processing multiple branch instructions that write to count and link registers (status: lapsed)

    Publication number: US5943494A

    Publication date: 1999-08-24

    Application number: US486304

    Filing date: 1995-06-07

    IPC classification: G06F9/32 G06F9/38 G06F9/42

    Abstract: A system and method for processing count and link branch instructions that allows multiple branches to be outstanding at the same time without being limited by the number of rename registers allocated to the count and link registers. The method and system comprise an architected count register and an architected link register that are each connected to a look-ahead register. Information in the architected count or link register is copied into the look-ahead register when a branch instruction is encountered that will alter the contents of the count or link registers. Information in the look-ahead register is saved in a shadow register when an unresolved branch is encountered, and restored by the shadow register if the outcome of the unresolved branch is mispredicted.
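
The look-ahead and shadow registers behave like a one-level checkpoint. The toy model below is a heavily simplified reading of the abstract: it tracks a single register with a single shadow copy, and the class and method names are invented for this example. The real design handles both the count and link registers and multiple outstanding branches.

```python
class CountRegisterModel:
    """Toy model of an architected register with look-ahead and shadow copies."""

    def __init__(self, value: int = 0) -> None:
        self.architected = value
        self.look_ahead = value
        self.shadow = None              # no checkpoint taken yet

    def speculative_write(self, value: int) -> None:
        """A branch that alters the register updates the look-ahead copy."""
        self.look_ahead = value

    def unresolved_branch(self) -> None:
        """Checkpoint the look-ahead value when an unresolved branch appears."""
        self.shadow = self.look_ahead

    def resolve(self, mispredicted: bool) -> None:
        """Restore the checkpoint on a misprediction, then discard it."""
        if mispredicted and self.shadow is not None:
            self.look_ahead = self.shadow
        self.shadow = None

    def commit(self) -> None:
        """Make the look-ahead value architecturally visible."""
        self.architected = self.look_ahead
```

A typical sequence is a speculative write, a checkpoint at an unresolved branch, further speculative writes down the predicted path, and then either a restore (misprediction) or a plain discard of the checkpoint (correct prediction).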

    Apparatus and method for maintaining status flags and condition codes using a renaming technique in an out of order floating point execution unit
    9.
    Granted invention patent
    Apparatus and method for maintaining status flags and condition codes using a renaming technique in an out of order floating point execution unit (status: lapsed)

    Publication number: US5826070A

    Publication date: 1998-10-20

    Application number: US708006

    Filing date: 1996-08-30

    IPC classification: G06F9/32 G06F9/38 G06F9/302

    Abstract: An apparatus and method reduce the number of rename registers for a floating point status and control register (FPSCR) in a superscalar microprocessor executing out of order/speculative instructions. A floating point queue (FPQ) receives speculative instructions and issues out-of-order instructions to FPQ execution units, each instruction containing a group identifier tag (GID) and a target identifier tag (TID). The GID tag indicates a set of instructions bounded by interruptible or branch instructions. The TID indicates a targeted architected facility and the program order of the instruction. The FPSCR contains status and control bits for each instruction and is updated when an instruction is executed and committed. An FPSCR renaming mechanism assigns an FPSCR rename to selected FPSCR bits during instruction dispatch from an instruction fetch unit (IFU) to the FPQ when an arithmetic instruction is dispatched that has a GID which has not been committed by the instruction dispatch unit (IDU) and does not already have an FPSCR rename assigned, as determined by the FPQ. The FPSCR rename mechanism utilizes the TID upon the presence of selected bits in the FPSCR. The bits in the FPSCR rename are updated as a new arithmetic instruction enters a write-back stage in the FPU. The resulting FPSCR updates of all instructions in a given GID are merged into one FPSCR rename register. An FPSCR rename register exists for each GID rather than an FPSCR rename register for each FPR rename register as in the prior art.
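
The saving described here comes from keeping one rename per instruction group instead of one per instruction. The sketch below models only that merging step. The class name `FpscrRenames`, the use of a bitwise OR to merge status flags, and the omission of TID ordering and control bits are all assumptions made to keep the example small.

```python
from collections import defaultdict


class FpscrRenames:
    """One status-flag rename per group identifier (GID)."""

    def __init__(self) -> None:
        self.renames = defaultdict(int)     # gid -> merged status bits

    def write_back(self, gid: int, status_bits: int) -> None:
        """Merge an instruction's status flags into its group's rename."""
        self.renames[gid] |= status_bits

    def commit(self, fpscr: int, gid: int) -> int:
        """Fold a committed group's flags into the FPSCR and free the rename."""
        return fpscr | self.renames.pop(gid, 0)
```

Two instructions in the same group that raise different flags end up sharing one rename entry, so the number of entries scales with the number of uncommitted groups and not with the number of in-flight floating point instructions.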

    Method and system for minimizing the delay in executing branch-on-register instructions
    10.
    Granted invention patent
    Method and system for minimizing the delay in executing branch-on-register instructions (status: lapsed)

    Publication number: US5802346A

    Publication date: 1998-09-01

    Application number: US457714

    Filing date: 1995-06-02

    IPC classification: G06F9/32 G06F9/38

    CPC classification: G06F9/322 G06F9/3824

    Abstract: A system and method for minimizing the delay associated with executing a register dependent instruction in which the execution of the register dependent instruction is dependent on an operand of a preceding instruction. In a branch unit for executing register dependent instructions, functional units are connected via a rename bus, and the functional units are connected to a general purpose register (GPR) via a GPR bus. The system and method route the rename bus and the GPR bus directly to an instruction fetch address register, thereby enabling the branch unit to execute a register dependent instruction during the same cycle as the preceding instruction.
