Computer processor system for executing RXE format floating point
instructions
    1.
    Granted invention patent
    Computer processor system for executing RXE format floating point instructions (expired)

    Publication number: US6085313A

    Publication date: 2000-07-04

    Application number: US070198

    Filing date: 1998-04-30

    IPC classification: G06F9/355 G06F9/30 G06F9/38

    CPC classification: G06F9/30145

    Abstract: A computer processor system having a floating point processor for instructions which are processed in a six cycle pipeline, in which an instruction text is fetched prior to the first cycle of the pipeline, and during the first cycle the fetched instruction is decoded and the base (B) and index (X) register values are read for use in address generation. Instructions of the RX-type are extended by placing the extension of the operation code beyond the first four bytes of the instruction format and by assigning the operation codes in such a way that the machine may determine the exact format of the instruction from the first 8 bits of the operation code alone. Instruction formats include the ESA/390 formats SS; RR; RX; S; RRE; RI; and the new RXE format. Instructions of the RXE format have their R1, X2, B2, and D2 fields in the identical positions in said instruction register as in the RX format, which enables the processor to determine from the first 8 bits of the operation code alone that an instruction being decoded is an RXE format instruction and the register indexed extensions of the RXE format instruction, after which it gates the correct information to said X-B-D adder. During the second cycle the address add of B+X+Displacement is performed and sent to the cache processor, during the third and fourth cycles the cache is respectively accessed and data is returned, and during a fifth cycle execution of the fetched instruction occurs, with the result put away in a sixth cycle. RXE instructions can be used for floating point processing and fixed point processing.
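
    The decode and address-add steps described in this abstract can be sketched in Python. This is an illustrative model only, not the patented hardware. The function names, the `RXE_FIRST_BYTES` set (first opcode byte 0xED), the position of the opcode extension in the sixth byte, and the 31-bit address mask are assumptions made for the example.

```python
from typing import Dict, List

# Assumption for this sketch: RXE instructions are recognised by the first
# opcode byte alone, and 0xED is used as that byte.
RXE_FIRST_BYTES = {0xED}

# ESA/390 convention: the two high-order opcode bits give the instruction
# length (00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes).
LENGTH_BY_TOP_BITS = {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}


def instruction_length(first_byte: int) -> int:
    return LENGTH_BY_TOP_BITS[first_byte >> 6]


def decode_rx_like(text: bytes) -> Dict[str, int]:
    """Decode R1, X2, B2, D2 from the first four bytes.

    The layout is the same for RX and RXE, so one decoder serves both.
    """
    fields = {
        "opcode": text[0],
        "r1": text[1] >> 4,
        "x2": text[1] & 0xF,
        "b2": text[2] >> 4,
        "d2": ((text[2] & 0xF) << 8) | text[3],
    }
    if text[0] in RXE_FIRST_BYTES:
        # Assumed RXE layout: opcode extension in the last of the six bytes.
        fields["opcode_ext"] = text[5]
    return fields


def xbd_address(fields: Dict[str, int], gprs: List[int], address_bits: int = 31) -> int:
    """Model the X-B-D adder: base + index + displacement."""
    # Register number 0 in X2 or B2 means that no register contributes.
    index = gprs[fields["x2"]] if fields["x2"] else 0
    base = gprs[fields["b2"]] if fields["b2"] else 0
    return (base + index + fields["d2"]) & ((1 << address_bits) - 1)


rxe_text = bytes.fromhex("ED123010001A")  # hypothetical instruction text
rxe_fields = decode_rx_like(rxe_text)
registers = [0] * 16
registers[2] = 0x100   # index register
registers[3] = 0x2000  # base register
rxe_address = xbd_address(rxe_fields, registers)
```

    Because the R1, X2, B2, and D2 fields sit in the same positions for both formats, the same field extraction and the same adder serve RX and RXE instructions; only the opcode lookup differs.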

    Address bit decoding for same adder circuitry for RXE instruction format
with same XBD location as RX format and dis-jointed extended operation
code
    2.
    Granted invention patent
    Address bit decoding for same adder circuitry for RXE instruction format with same XBD location as RX format and dis-jointed extended operation code (expired)

    Publication number: US6105126A

    Publication date: 2000-08-15

    Application number: US70359

    Filing date: 1998-04-30

    CPC classification: G06F9/355 G06F9/30185

    Abstract: A six cycle pipeline system for the floating point processor of a computer processor, where instruction text is fetched prior to the first cycle, and during the first cycle the fetched instruction is decoded and the base (B) and index (X) register values are read for use in address generation. RXE instructions are of the RX-type but extended by placing the extension of the operation code beyond the first four bytes of the instruction format and by assigning the operation codes in such a way that the machine may determine the exact format from the first 8 bits of the operation code alone. The ESA/390 instructions SS; RR; RX; S; RRE; RI; and the new RXE instructions have a format which can be used for fixed point processing as well as floating point processing. Instructions of the RXE format have their R1, X2, B2, and D2 fields in the identical positions in said instruction register as in the RX format, which enables the processor to determine from the first 8 bits of the operation code alone that an instruction being decoded is an RXE format instruction and the register indexed extensions of the RXE format instruction, after which it gates the correct information to said X-B-D adder. During the second cycle the address add of B+X+Displacement is performed and sent to the cache processor, during the third and fourth cycles the cache is respectively accessed and data is returned, and during a fifth cycle execution of the fetched instruction occurs, with the result put away in a sixth cycle.

    Method and system for executing denormalized numbers
    3.
    Granted invention patent
    Method and system for executing denormalized numbers (expired)

    Publication number: US5903479A

    Publication date: 1999-05-11

    Application number: US922191

    Filing date: 1997-09-02

    IPC classification: G06F5/01 G06F9/38

    Abstract: A method and system for processing instructions in a floating point unit, which executes denormalized numbers in a floating point pipeline via serialization, uses an instruction unit and has a control unit, a pipelined data flow unit, a shifter and a rounding unit. The floating point unit has an external feedback path for providing intermediate result data from said rounding unit to an input of the pipelined data flow unit, so that the pipeline is reused for denormalization: intermediate results which have a denormalized condition, computed after the exponent calculation of the shifting circuit, are passed directly from the rounding unit to the top of the dataflow in the pipeline via the external feedback path. The pipeline has two paths which are selected based on the presence of other instructions in the pipeline. If no other instructions are in the pipeline, a first path is taken which uses the external feedback path from the rounding unit back into the top of the dataflow. When there are instructions in the pipeline, a shifter unit performing normalization of the fraction indicates possible underflow of the exponent and prepares to hold the exponent and the fraction in a floating point data flow register; upon detection of exponent underflow during the rounder stage and detection of any other instructions in the pipeline, the control unit forces an interrupt for serialization and cancels execution of this instruction and other instructions in the pipeline.
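
    The path selection and the denormalizing shift described above can be modelled roughly as follows. This is a sketch under stated assumptions, not the patented circuit: the function names, the string labels for the paths, and the integer representation of the fraction are all hypothetical, and the re-execution that presumably follows serialization is not modelled.

```python
from typing import Tuple


def choose_path(exponent_underflow: bool, others_in_pipeline: bool) -> str:
    """Pick how a result is completed, following the two paths in the abstract."""
    if not exponent_underflow:
        return "complete"   # normal result, no denormalization needed
    if not others_in_pipeline:
        return "feedback"   # reuse the pipeline via the external feedback path
    return "serialize"      # force an interrupt and cancel in-flight instructions


def denormalize(fraction: int, exponent: int, emin: int) -> Tuple[int, int, bool]:
    """Shift the fraction right until the exponent reaches emin.

    Returns (fraction, exponent, sticky), where sticky records whether any
    nonzero bit was shifted out.
    """
    shift = emin - exponent
    if shift <= 0:
        return fraction, exponent, False
    sticky = (fraction & ((1 << shift) - 1)) != 0
    return fraction >> shift, emin, sticky
```

    For example, `denormalize(0b1011, -3, 0)` shifts three bits out, two of which are set, so the sticky indication is raised.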

    IEEE compliant floating point unit
    4.
    Granted invention patent
    IEEE compliant floating point unit (expired)

    Publication number: US6044454A

    Publication date: 2000-03-28

    Application number: US26328

    Filing date: 1998-02-19

    Abstract: An IEEE compliant floating point unit mechanism allows variability in the execution of floating point operations according to the IEEE 754 standard and allows variations of the standard to co-exist in hardware or in the combination of hardware and millicode. The FPU has a detector of special conditions which dynamically detects an event in which the hardware execution of an IEEE compliant Binary Floating Point instruction will require millicode emulation. The complete set of events which millicode may emulate is predetermined early in the design process of the hardware. An exception handling unit assists millicode emulation by trapping the result of an exceptional condition without invoking the trap handler. When an exceptional condition is detected during execution, the IEEE 754 standard requires two different actions under control of a mask bit. If the mask bit is on, the result is written into an FPR and the trap handler is invoked. Otherwise, a default value is written, a flag is set, and the program continues execution. This allows a variation to the IEEE 754 standard. Two different versions of the function of the Multiply-then-Subtract instruction are implemented for two different IEEE 754 compliant architectures.
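
    The mask-bit behaviour in this abstract can be illustrated with a small Python model. The class, field, and function names here are hypothetical, and the sketch covers only the two actions named in the abstract, not millicode emulation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FpuState:
    # All fields are illustrative stand-ins for architected state.
    mask_bits: Dict[str, bool] = field(default_factory=dict)
    flags: Set[str] = field(default_factory=set)
    fprs: List[float] = field(default_factory=lambda: [0.0] * 16)


def handle_exception(state: FpuState, condition: str, target_fpr: int,
                     trapped_result: float, default_result: float) -> str:
    """Apply one of the two actions selected by the mask bit."""
    if state.mask_bits.get(condition, False):
        # Mask bit on: write the result, then invoke the trap handler.
        state.fprs[target_fpr] = trapped_result
        return "trap"
    # Mask bit off: write the default value, set the flag, keep running.
    state.fprs[target_fpr] = default_result
    state.flags.add(condition)
    return "continue"
```

    The caller would dispatch to a trap handler when `"trap"` is returned; that handler is outside the scope of this sketch.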

    Implementation of binary floating point using hexadecimal floating point
unit
    5.
    Granted invention patent
    Implementation of binary floating point using hexadecimal floating point unit (expired)

    Publication number: US5687106A

    Publication date: 1997-11-11

    Application number: US414250

    Filing date: 1995-03-31

    IPC classification: G06F7/57 G06F7/38 G06F7/00

    Abstract: A computer system supporting multiple floating point architectures. In an embodiment of the invention, a floating point unit (FPU) is optimized for hex format. The FPU uses a hex internal dataflow with an exponent and bias sufficient to support a binary floating point architecture. The FPU includes format conversion means, rounding means, sticky bit calculation means, and special number control means to execute binary floating point operations according to the IEEE 754 standard. An embodiment of the invention provides a system for executing floating point operations in either IBM S/390 hexadecimal format or IEEE 754 binary format.

    Carry select and input select adder for late arriving data

    Publication number: US5654911A

    Publication date: 1997-08-05

    Application number: US472962

    Filing date: 1995-06-07

    IPC classification: G06F7/50 G06F7/507

    CPC classification: G06F7/507

    Abstract: An adder which takes advantage of the early arriving bits of a time skewed operand to provide a result to an add or subtract operation without additional latency. Possible partial results are calculated and then selectively combined according to the late arriving data as the late arriving data becomes available. In an embodiment of the present invention, a first operand is partitioned into groups according to the arrival time of the skewed data, and possible partial results for each group are calculated for the full range of partial inputs that affect it. In addition, the high order groups are calculated with and without a borrow (carry) which is propagated from a low order group. Once the delayed partial operands are known and the borrows (carries) determined, the partial results are gated through multiplexers according to the borrows and partial results, and thus the result is provided with a delay similar to the delay in arrival of the skewed operand.
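
    The carry-select half of this idea can be shown with a short Python model. It covers only the generic technique of computing each group with and without an incoming carry and then selecting; the input-select handling of late arriving operand bits is not modelled. The function name and the default widths are assumptions for the example, and `width` is assumed to be a multiple of `group`.

```python
from typing import Tuple


def carry_select_add(a: int, b: int, width: int = 16, group: int = 4) -> Tuple[int, int]:
    """Add two unsigned `width`-bit values group by group.

    For every group both candidate sums are formed (carry-in 0 and carry-in 1).
    In hardware all candidates are computed in parallel; the loop below only
    models the final selection by the carry from the lower group.
    """
    mask = (1 << group) - 1
    result = 0
    carry = 0
    for pos in range(0, width, group):
        ga = (a >> pos) & mask
        gb = (b >> pos) & mask
        sum_without_carry = ga + gb
        sum_with_carry = ga + gb + 1
        chosen = sum_with_carry if carry else sum_without_carry
        result |= (chosen & mask) << pos
        carry = chosen >> group  # carry out of this group selects the next one
    return result, carry
```

    The selection step is a multiplexer per group, so the delay after the last carry is known is one mux level rather than a full group addition.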

    Partitioning of binary quad word format multiply instruction on S/390
processor
    7.
    Granted invention patent
    Partitioning of binary quad word format multiply instruction on S/390 processor (expired)

    Publication number: US6021422A

    Publication date: 2000-02-01

    Application number: US33626

    Filing date: 1998-03-05

    Applicant: Eric Mark Schwarz

    Inventor: Eric Mark Schwarz

    IPC classification: G06F7/52 G06F7/44 G06F7/38

    Abstract: There is a unique partitioning problem in determining how to execute the floating point multiply instruction defined by the IEEE 754 standard for the quad word format on an S/390 processor. Several manufacturers including IBM and HP define the binary quad word format to have a 113 bit significand. The IBM S/390 hexadecimal long floating point format has a 56 bit significand, and most S/390 floating point units only contain a long format multiplier. Quad word format multiplication must be executed as a series of several long precision multiplications and extended precision or long precision additions. The S/390 hexadecimal quad word format is easier to implement than the binary format since it has a 112 bit significand and can easily be partitioned into two 56 bit parts. But a 113 bit significand would just exceed two partitions and require a third. For extended precision multiplies, each partition of one operand is multiplied by each partition of the other, so if there are two partitions only four multiplies are required, but for three partitions this increases to nine multiplies. Methods for partitioning are disclosed here.
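
    The multiply counts quoted in the abstract follow from the straightforward scheme in which every partition of one significand is multiplied by every partition of the other. The Python sketch below shows that baseline scheme only; it is not the partitioning method the patent discloses, and the function names are hypothetical.

```python
from typing import List, Tuple


def split(value: int, parts: int, part_bits: int = 56) -> List[int]:
    """Cut a significand into `parts` pieces of `part_bits` bits, low piece first."""
    mask = (1 << part_bits) - 1
    return [(value >> (i * part_bits)) & mask for i in range(parts)]


def partitioned_multiply(x: int, y: int, parts: int, part_bits: int = 56) -> Tuple[int, int]:
    """Multiply using only part_bits x part_bits products.

    Returns (product, number_of_partial_multiplies).
    """
    xs = split(x, parts, part_bits)
    ys = split(y, parts, part_bits)
    product = 0
    multiplies = 0
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            # Each partial product is weighted by the position of its partitions.
            product += (xi * yj) << ((i + j) * part_bits)
            multiplies += 1
    return product, multiplies
```

    A 112 bit significand fits in two 56 bit partitions and needs four partial multiplies. A 113 bit significand loses its top bit with two partitions, so this baseline scheme needs three partitions and nine partial multiplies.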

    Parallel calculation of exponent and sticky bit during normalization
    8.
    Granted invention patent
    Parallel calculation of exponent and sticky bit during normalization (expired)

    Publication number: US5757682A

    Publication date: 1998-05-26

    Application number: US414072

    Filing date: 1995-03-31

    Abstract: A system implementing a methodology for determining the exponent in parallel with determining the fractional shift during normalization according to partitioning the exponent into partial exponent groups according to the fractional shift data flow, determining all possible partial exponent values for each partial exponent group according to the fractional data flow, and providing the exponent by selectively combining possible partial exponents from each partial exponent group according to the fractional data flow. There is also provided a system implementing a methodology for generating the sticky bit during normalization. Sticky bit information is precalculated and multiplexed according to the fractional dataflow. In an embodiment of the invention, group sticky signals are calculated in tree form, each group sticky having a number of possible sticky bits corresponding to the shift increment amount of the multiplexing. The group sticky bits are further multiplexed according to subsequent shift amounts in the fractional dataflow to provide an output sticky bit at substantially the same time as when the final fractional shift amount is available, and thereby at substantially the same time as the normalized fraction.
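
    The idea of precalculating sticky information and then selecting it by the shift amount can be sketched in Python. This is a simplified model under stated assumptions: it keeps one sticky candidate per possible shift amount instead of the tree of group sticky signals, it adjusts the exponent directly instead of combining partial exponent groups, and the function name and parameters are hypothetical.

```python
from typing import Tuple


def normalize_with_sticky(fraction: int, exponent: int, width: int, keep: int) -> Tuple[int, int, bool]:
    """Normalize a `width`-bit fraction and keep its top `keep` bits.

    Returns (kept_fraction, adjusted_exponent, sticky), where sticky says
    whether any discarded low-order bit of the normalized fraction is nonzero.
    """
    if fraction == 0:
        return 0, exponent, False
    low_bits = width - keep
    # One sticky candidate per possible left-shift amount. These depend only
    # on the unshifted fraction, so they can be formed before the shift
    # amount is known.
    sticky_candidates = [
        (fraction & ((1 << max(low_bits - s, 0)) - 1)) != 0
        for s in range(width)
    ]
    shift = width - fraction.bit_length()  # number of leading zeros
    normalized = fraction << shift
    # The shift amount acts as the multiplexer select for the sticky bit.
    return normalized >> low_bits, exponent - shift, sticky_candidates[shift]
```

    Because the candidates are ready before the shift amount is known, the sticky bit is available as soon as the shift amount selects one of them, which mirrors the timing goal stated in the abstract.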

    Floating point binary quad word format multiply instruction unit
    9.
    Granted invention patent
    Floating point binary quad word format multiply instruction unit (expired)

    Publication number: US6055554A

    Publication date: 2000-04-25

    Application number: US34718

    Filing date: 1998-03-04

    Applicant: Eric Mark Schwarz

    Inventor: Eric Mark Schwarz

    CPC classification: G06F7/4876 G06F7/5324

    Abstract: An IEEE 754 standard floating point multiply instruction for binary extended precision format can be executed with a quad word format on an S/390 processor. The multiplication calculation multiplies each partition of one operand by each partition of the other. In the multiplication dataflow, if either operand is a denormalized number, the operands are normalized at a stage which creates an expanded exponent range of one more bit, and the calculation continues to a parallel path multiplexor stage; if neither operand is denormalized, the exponent of the number is expanded and the calculation splits into four parallel paths, wherein the two operands' sign bits are processed in a sign calculation block stage, the operands' two 16 bit binary exponents are processed by an exponent conversion block stage, and a partition multiplicand significand block stage receives a 113 bit multiplicand significand input for a fourth path. In this calculation the third and fourth paths converge with a calculation which provides partial products and intermediate sums and finally a final product as a calculation block stage output, and this output, the exponent from said second path and the sign bit from said first path merge to provide a product which is represented in hexadecimal internal format, converted back to binary format in a calculation block stage, and rounded.

    Preprocessing of stored target routines for emulating incompatible
instructions on a target processor
    10.
    Granted invention patent
    Preprocessing of stored target routines for emulating incompatible instructions on a target processor (expired)

    Publication number: US6009261A

    Publication date: 1999-12-28

    Application number: US991714

    Filing date: 1997-12-16

    IPC classification: G06F9/455

    CPC classification: G06F9/45504

    Abstract: Provides a program translation and execution method which stores target routines (for execution by a target processor) corresponding to incompatible instructions, interruptions and authorizations of an incompatible program written for execution on another computer system built to a computer architecture incompatible with the architecture of the target processor's computer system. The disclosed process allows the target processor to emulate incompatible acts expected in the operation of an incompatible program when the target processor itself is incapable of performing the emulated acts. Each of the instructions, interruptions and authorizations found in the incompatible programs has one or more corresponding target routines, any of which may need to be preprocessed before it can precisely emulate the execution results required by the incompatible architecture. Target routines (corresponding to the incompatible instruction instances in an incompatible program being emulated) are accessed, patched where necessary, and executed by a target processor to enable the target processor to precisely obtain the execution results of the emulated incompatible program. Before preprocessing, each target routine may not be able to provide identical execution results as required by the incompatible architecture, and the preprocessing may patch one or more of its target instructions to enable the target routine to perform the identical emulation execution of the corresponding incompatible instruction. The patching and other modifications to a target routine are done by one or more preprocessing instructions stored in the target routine.
