Ready selection of data dependent instructions using multi-cycle cams in a processor performing out-of-order instruction execution
    21.
    Granted invention patent (expired)

    Publication No.: US5546597A

    Publication date: 1996-08-13

    Application No.: US203050

    Filing date: 1994-02-28

    IPC classification: G06F9/38 G06F9/345

    Abstract: An instruction dispatch circuit is disclosed that improves instruction execution throughput for a processor. The instruction dispatch circuit comprises an instruction buffer with a plurality of instruction entries and a content addressable memory (CAM) array having at least one CAM entry corresponding to each instruction entry. Each CAM entry stores at least one source tag for the corresponding instruction entry. The content addressable memory array matches these source tags against a result tag sent by an execution circuit over a result bus, wherein the execution circuit transfers the result tag over the result bus at least one clock cycle before transferring the corresponding result data value over the result bus. Each CAM entry generates a CAM match signal used to determine whether data-dependent instructions are ready for dispatch.
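
    The tag-before-data matching described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the names `CamEntry` and `broadcast_tag`, the integer tags, and the rule that an instruction is ready once every stored source tag has matched are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CamEntry:
    """Hypothetical CAM entry: the source tags one buffered instruction waits on."""
    name: str
    source_tags: set
    matched: set = field(default_factory=set)

    def match(self, result_tag):
        """Compare a broadcast result tag with the stored source tags."""
        if result_tag in self.source_tags:
            self.matched.add(result_tag)
            return True  # models the CAM match signal
        return False

    def ready(self):
        """Assumption: ready once every source tag has been matched."""
        return self.matched == self.source_tags


def broadcast_tag(entries, result_tag):
    """Tag cycle: the tag is seen at least one clock before its data value.

    Returns the instructions that become ready because of this tag.
    """
    return [e.name for e in entries if e.match(result_tag) and e.ready()]


entries = [
    CamEntry("add", {3}),
    CamEntry("mul", {3, 7}),
    CamEntry("sub", {9}),
]
ready_after_tag3 = broadcast_tag(entries, 3)  # only "add" has all its sources
ready_after_tag7 = broadcast_tag(entries, 7)  # "mul" now has both 3 and 7
```

    Because readiness is decided in the tag cycle, the dependent instruction can be selected for dispatch while the result data is still in flight.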

    Coordinating speculative and committed state register source data and immediate source data in a processor
    22.
    Granted invention patent (expired)

    Publication No.: US5452426A

    Publication date: 1995-09-19

    Application No.: US177240

    Filing date: 1994-01-04

    IPC classification: G06F9/38 G06F9/24 G06F9/28

    Abstract: A mechanism for coordinating source data in a processor, wherein a decode circuit issues instructions comprising at least one immediate valid flag and at least one logical register source. The immediate valid flag indicates whether an immediate operand for the instruction is available on an immediate data bus, and the logical register source specifies a physical register or a committed state register. A speculative result data value and a speculative source valid flag are read from the physical register, and a committed result data value is read from the committed state register. The speculative result data value and the speculative source valid flag or the committed result data value and the committed source valid flag provide a source data value and a source data valid flag for scheduling an execution of the instruction.
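
    The source selection in this abstract can be sketched as a three-way choice between immediate, speculative, and committed data. The sketch below is illustrative only: the `rename_map` dictionary, the `Register` class, and the register names are assumptions introduced here, and the abstract does not say how a logical source is resolved to one register file or the other.

```python
from dataclasses import dataclass


@dataclass
class Register:
    value: int
    valid: bool


def select_operand(immediate_valid, immediate_value, logical_src,
                   rename_map, physical_regs, committed_regs):
    """Return (source data value, source data valid flag) for one source.

    Assumption: rename_map holds the logical registers that currently
    have a speculative (physical) copy; all others read committed state.
    """
    if immediate_valid:
        # The immediate operand is available on the immediate data bus.
        return immediate_value, True
    if logical_src in rename_map:
        reg = physical_regs[rename_map[logical_src]]  # speculative result
    else:
        reg = committed_regs[logical_src]             # committed result
    return reg.value, reg.valid


physical = {5: Register(0, False), 6: Register(42, True)}
committed = {"eax": Register(1, True), "ebx": Register(2, True),
             "ecx": Register(3, True)}
rename_map = {"eax": 5, "ebx": 6}

pending = select_operand(False, 0, "eax", rename_map, physical, committed)
speculative = select_operand(False, 0, "ebx", rename_map, physical, committed)
retired = select_operand(False, 0, "ecx", rename_map, physical, committed)
immediate = select_operand(True, 99, "eax", rename_map, physical, committed)
```

    A scheduler would hold back the instruction whose source comes back with a false valid flag (`pending` here) until the speculative result is produced.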

    Flag renaming and flag masks within register alias table
    24.
    Granted invention patent (expired)

    Publication No.: US06047369A

    Publication date: 2000-04-04

    Application No.: US204521

    Filing date: 1994-02-28

    IPC classification: G06F9/32 G06F9/38 G06F9/30

    Abstract: A mechanism and method for renaming flags within a register alias table ("RAT") to increase processor parallelism and also providing and using flag masks associated with individual instructions. In order to reduce the number of data dependencies between instructions that are concurrently processed, the flags used by these instructions are renamed. In general, a RAT unit provides register renaming to provide a larger physical register set than would ordinarily be available within a given macroarchitecture's logical register set (such as the Intel architecture or PowerPC or Alpha designs, for instance) to eliminate false data dependencies between instructions that reduce overall superscalar processing performance for the microprocessor. The renamed flag registers contain several flag bits and various flag bits may be updated or read by different instructions. Also, static and dynamic flag masks are associated with particular instructions and indicate which flags are capable of being updated by a particular instruction and also indicate which flags are actually updated by the instruction. Static flag masks are used in flag renaming and dynamic flag masks are used at retirement. The invention also discovers cases in which a flag register is required that is a superset of the previously renamed flag register portion.
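
    The roles of the two masks can be illustrated with bit operations. This is a deliberately simplified sketch, not the RAT design in the patent: the single-entry `FlagRat` class, the four flag bits, the physical destination number 12, and the use of `None` to signal the superset case are all assumptions made for illustration.

```python
CF, ZF, SF, OF = 1, 2, 4, 8  # hypothetical flag bit positions


class FlagRat:
    """Hypothetical single-entry flag alias table."""

    def __init__(self):
        self.producer = None  # physical destination of the last flag writer
        self.mask = 0         # static mask of that writer

    def rename_writer(self, pdst, static_mask):
        """Static mask: flags the instruction is capable of updating."""
        self.producer, self.mask = pdst, static_mask

    def rename_reader(self, needed):
        """Return the renamed producer, or None in the superset case,
        where the reader needs flags outside the renamed portion."""
        if self.producer is not None and (needed & ~self.mask) == 0:
            return self.producer
        return None


def retire(arch_flags, result_flags, dynamic_mask):
    """Dynamic mask: flags actually updated, merged in at retirement."""
    return (arch_flags & ~dynamic_mask) | (result_flags & dynamic_mask)


rat = FlagRat()
rat.rename_writer(12, ZF | SF | OF)       # a writer that leaves CF alone
covered = rat.rename_reader(ZF)           # within the renamed portion
superset = rat.rename_reader(CF | ZF)     # CF is not covered by the writer
merged = retire(0b1010, 0b0101, 0b0011)   # only the low two bits change
```

    How the superset case is actually resolved is not stated in the abstract, so the sketch only detects it.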

    Method and apparatus for maximum throughput scheduling of dependent operations in a pipelined processor
    25.
    Granted invention patent (expired)

    Publication No.: US6101597A

    Publication date: 2000-08-08

    Application No.: US176370

    Filing date: 1993-12-30

    IPC classification: G06F9/38 G06F9/30

    CPC classification: G06F9/3824 G06F9/383

    Abstract: Maximum throughput or "back-to-back" scheduling of dependent instructions in a pipelined processor is achieved by maximizing the efficiency in which the processor determines the availability of the source operands of a dependent instruction and provides those operands to an execution unit executing the dependent instruction. These two operations are implemented through a number of mechanisms. One mechanism for determining the availability of source operands, and hence the readiness of a dependent instruction for dispatch to an available execution unit, relies on the prospective determination of the availability of a source operand before the operand itself is actually computed as a result of the execution of another instruction. Storage addresses of the source operands of an instruction are stored in a content addressable memory (CAM). Before an instruction is executed and its result data written back, the storage location address of the result is provided to the CAM and associatively compared with the source operand addresses stored therein. A CAM match and its accompanying match bit indicate that the result of the instruction to be executed will provide a source operand to the dependent instruction waiting in the reservation station. Using a bypass mechanism, if the operand is computed after dispatch of the dependent instruction, then the source operand is provided directly from the execution unit computing the source operand to a source operand input of the execution unit executing the dependent instruction.
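
    The two halves of this scheme, the early address compare and the bypass, can be modeled in a few lines. This is a minimal sketch under assumptions: `RsEntry`, `cam_compare`, and `fetch_operand` are hypothetical names, and a single source operand per entry is assumed for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RsEntry:
    """Hypothetical reservation station entry with one source operand."""
    src_addr: int                  # storage address of the awaited operand
    match_bit: bool = False        # set by the CAM compare
    operand: Optional[int] = None  # value captured if written back early


def cam_compare(entry, result_addr):
    """Compare the result's storage address with the stored source address.

    This happens before the result data exists, so a match only means
    that the operand will become available.
    """
    if entry.src_addr == result_addr:
        entry.match_bit = True
    return entry.match_bit


def fetch_operand(entry, bypass_addr, bypass_value):
    """Operand seen at the execution unit input after dispatch."""
    if entry.operand is not None:
        return entry.operand, "reservation station"
    if bypass_addr == entry.src_addr:
        # Computed after dispatch: taken straight from the producing unit.
        return bypass_value, "bypass"
    raise RuntimeError("operand not available")


waiting = RsEntry(src_addr=17)
no_match = cam_compare(waiting, 4)
match = cam_compare(waiting, 17)
bypassed = fetch_operand(waiting, 17, 99)

captured = RsEntry(src_addr=17, match_bit=True, operand=5)
stored = fetch_operand(captured, 3, 0)
```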

    Apparatus and method for handling string operations in a pipelined processor
    27.
    Granted invention patent (expired)

    Publication No.: US5404473A

    Publication date: 1995-04-04

    Application No.: US204612

    Filing date: 1994-03-01

    Abstract: In a pipelined processor, an apparatus for handling string operations. When a string operation is received by the processor, the length of the string as specified by the programmer is stored in a register. Next, an instruction sequencer issues an instruction that computes the register value minus a pre-determined number of iterations to be issued into the pipeline. Following the instruction, the pre-determined number of iterations are issued to the pipeline. When the instruction returns with the calculated number, the instruction sequencer then knows exactly how many iterations should be executed. Any extra iterations that had initially been issued are canceled by the execution unit, and additional iterations are issued as necessary. A loop counter in the instruction sequencer is used to track the number of iterations.
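
    The iteration bookkeeping in this abstract reduces to one subtraction. The sketch below shows the arithmetic only; the function name `plan_string_iterations` and the example value of 8 pre-issued iterations are assumptions, since the abstract does not give the pre-determined number.

```python
def plan_string_iterations(length, predetermined):
    """Return (cancelled, additional, executed) iteration counts.

    `length` is the string length held in the register and
    `predetermined` is the number of iterations issued speculatively.
    """
    remaining = length - predetermined  # value the issued instruction computes
    cancelled = max(0, -remaining)      # extra iterations already in the pipe
    additional = max(0, remaining)      # iterations still to be issued
    executed = predetermined - cancelled + additional
    return cancelled, additional, executed


short_string = plan_string_iterations(3, 8)   # 5 of the 8 are cancelled
long_string = plan_string_iterations(20, 8)   # 12 more must be issued
exact_string = plan_string_iterations(8, 8)   # nothing to cancel or add
```

    In every case the executed count equals the string length, which is the invariant the loop counter has to preserve.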

    Method and apparatus for pipeline streamlining where resources are immediate or certainly retired
    29.
    Granted invention patent (expired)

    Publication No.: US06393550B1

    Publication date: 2002-05-21

    Application No.: US08532225

    Filing date: 1995-09-19

    IPC classification: G06F9/30

    Abstract: Maximum throughput or “back-to-back” scheduling of dependent instructions in a pipelined processor is achieved by maximizing the efficiency in which the processor determines the availability of the source operands of a dependent instruction and provides those operands to an execution unit executing the dependent instruction. These two operations are implemented through a number of mechanisms. One mechanism for determining the availability of source operands, and hence the readiness of a dependent instruction for dispatch to an available execution unit, relies on the early setting of a source valid bit during allocation when a source operand is a retired or immediate value. This allows the ready logic of a reservation station to begin scheduling the instruction for dispatch.
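
    The early setting of the source valid bit can be sketched as a rule applied when an entry is allocated. This is an illustrative model only: the `Source` class, the three kind labels, and the all-sources-valid readiness rule are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Source:
    kind: str            # "immediate", "retired", or "pending" (hypothetical labels)
    valid: bool = False  # source valid bit


def allocate(kinds):
    """Allocate sources, setting the valid bit early where the value
    is already known (immediate or retired)."""
    return [Source(k, valid=k in ("immediate", "retired")) for k in kinds]


def is_ready(sources):
    """Assumed ready logic: every source valid bit is set."""
    return all(s.valid for s in sources)


ready_at_allocation = is_ready(allocate(["immediate", "retired"]))
must_wait = is_ready(allocate(["retired", "pending"]))
```

    An instruction whose sources are all immediate or retired is therefore schedulable straight away, without waiting for any result broadcast.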
