    1. Decode and execution synchronized pipeline processing using decode generated memory read queue with stop entry to allow execution generated memory read
    Granted invention patent (status: expired)

    Publication No.: US06240508B1

    Publication date: 2001-05-29

    Application No.: US08505810

    Filing date: 1995-07-21

    IPC classification: G06F9/38

    Abstract: A macropipelined microprocessor chip adheres to strict read and write ordering by sequentially buffering operands in queues during instruction decode, then removing the operands in order during instruction execution. Any instruction that requires additional access to memory inserts the requests into the queued sequence (in a specifier queue) such that read and write ordering is preserved. A specifier queue synchronization counter captures synchronization points to coordinate memory request operations among the autonomous instruction decode unit, instruction execution unit, and memory sub-system. The synchronization method does not restrict the benefit of overlapped execution in the pipeline. Another feature is treatment of a variable bit field operand type that does not restrict the location of operand data. Instruction execution flows in a pipelined processor having such an operand type are vastly different depending on whether operand data resides in registers or memory. Thus, an operand context queue (field queue) is used to simplify context-dependent execution flow and increase overlap. The field queue allows the instruction decode unit to issue instructions with variable bit field operands normally, sequentially identifying and fetching operands, and communicating the operand context that specifies register or memory residence across the pipeline boundaries to the autonomous execution unit. The mechanism creates an opportunity for increasing the overlap of pipelined functions and greatly simplifies the splitting of execution flows.

    2. Computer system performance evaluation system and method
    Granted invention patent (status: expired)

    Publication No.: US5450349A

    Publication date: 1995-09-12

    Application No.: US967110

    Filing date: 1992-10-27

    IPC classification: G06F11/267 G06F11/22

    CPC classification: G06F11/2236

    Abstract: A system evaluates the performance of a computer system that has an associated system memory and a processor that passes through a plurality of processor states during operation. The system includes an operating unit for receiving a request from a user to monitor specific processor states. Firmware causes the processor to enter the desired processor state requested by the user. The hardware identifies the occurrence of the desired processor state. Information relating to the occurrence of the desired processor state is accumulated in the memory. The accumulated information is read from memory and a report is provided to the user.

    3. Method for implementing synchronous pipeline exception recovery
    Granted invention patent (status: expired)

    Publication No.: US4875160A

    Publication date: 1989-10-17

    Application No.: US221934

    Filing date: 1988-07-20

    IPC classification: G06F9/38 G06F11/00

    Abstract: Pipelined CPUs achieve high performance by fine-tuning the pipe stages to execute typical instruction sequences. Atypical instruction sequences result in pipeline exceptions. The disclosed method provides graceful exception handling and recovery in a micropipelined memory interface. The use of a memory reference restart command latch allows an implementation that requires no additional logic for conditional writing of states pending exception checking. The exception handling hardware is minimized because instructions which cause exceptions are never re-executed, and exception handling microcode executes in-line with the normal microcode flow.

    4. Branch prediction unit for high-performance processor
    Granted invention patent (status: expired)

    Publication No.: US5394529A

    Publication date: 1995-02-28

    Application No.: US86355

    Filing date: 1993-07-01

    IPC classification: F02B75/02 G06F9/38 G06F9/26

    CPC classification: G06F9/3848 F02B2075/025

    Abstract: A pipelined CPU executes instructions of variable length, and references memory using various data widths. Macroinstruction pipelining is employed (instead of microinstruction pipelining), with queueing between units of the CPU to allow flexibility in instruction execution times. A branch prediction method employs a branch history table which records the taken vs. not-taken history of branch opcodes recently used, and uses an empirical algorithm to predict which way the next occurrence of this branch will go, based upon the history table. The branch history table stores in each entry a number of bits for each branch address, each bit indicating "taken" or "not-taken" for one occurrence of the branch. The table is indexed by branch address. A register stores the empirical algorithm, and upon occurrence of a branch its history is fetched from the table and used to select a location in the register containing a prediction for this particular pattern of branch history.
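
The lookup described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the history length, the table size, the modulo indexing, and the `majority_register` helper that fills the prediction register are all assumptions chosen for illustration, since the abstract does not give the register contents.

```python
HISTORY_BITS = 4   # assumed number of taken/not-taken bits kept per entry
TABLE_SIZE = 512   # assumed number of branch history table entries


def majority_register():
    """Build a stand-in prediction register (hypothetical contents).

    Bit i holds the prediction for history pattern i: predict taken when
    at least half of the recorded outcomes in the pattern were taken.
    """
    reg = 0
    for pattern in range(1 << HISTORY_BITS):
        if bin(pattern).count("1") * 2 >= HISTORY_BITS:
            reg |= 1 << pattern
    return reg


class BranchPredictor:
    def __init__(self, prediction_register):
        # One bit per possible history pattern (1 = predict taken).
        self.prediction_register = prediction_register
        # Branch history table: each entry is a small shift register of outcomes.
        self.table = [0] * TABLE_SIZE

    def _index(self, branch_address):
        # The table is indexed by branch address (modulo indexing is assumed).
        return branch_address % TABLE_SIZE

    def predict(self, branch_address):
        # Fetch the history, then use it to select a bit in the register.
        history = self.table[self._index(branch_address)]
        return bool((self.prediction_register >> history) & 1)

    def update(self, branch_address, taken):
        # Shift the newest outcome into the history for this branch.
        i = self._index(branch_address)
        mask = (1 << HISTORY_BITS) - 1
        self.table[i] = ((self.table[i] << 1) | int(taken)) & mask
```

Because the predictions live in a register and not in fixed logic, changing the prediction policy in this sketch only means passing a different integer to `BranchPredictor`.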

    5. Register logging in pipelined computer using register log queue of register content changes and base queue of register log queue pointers for respective instructions
    Granted invention patent (status: expired)

    Publication No.: US5450555A

    Publication date: 1995-09-12

    Application No.: US126094

    Filing date: 1993-09-23

    IPC classification: F02B75/02 G06F9/38 G06F9/34

    CPC classification: G06F9/3848 F02B2075/025

    Abstract: A pipelined processor has an instruction unit for decoding instructions and pre-processing operands prior to instruction execution, and an execution unit for executing the decoded instructions. The pre-processing of operands includes changes to general purpose registers, and the changes are recorded in an RLOG queue having read and write pointers. Instruction context for the RLOG queue entries is maintained in a separate RLOG base queue. When decoding begins for a new instruction, the RLOG base queue is loaded with the RLOG write pointer to the first RLOG queue entry that would record a register change for that next instruction. Each time an operand is processed that changes a general purpose register, the value of the change is recorded in the entry pointed to by the RLOG queue write pointer, and the RLOG queue write pointer is advanced. When the execution unit retires an instruction, its entries in the RLOG queue are discarded by advancing the RLOG queue read pointer to the pointer read from the RLOG base queue, and the pointer read from the RLOG base queue is removed from the RLOG base queue. During an unwind process in response to an exception, a micro-control unit successively reads a register change from the RLOG queue, checks whether the RLOG queue is empty, restores the register, and advances the RLOG queue read pointer until the RLOG queue becomes empty, and then resets the RLOG queue and the RLOG base queue.
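
The pointer bookkeeping in this abstract is easier to follow in code. The Python sketch below is a software model under stated assumptions, not the patented hardware: the queue depth, the class and method names, and the representation of a register change as a signed delta are hypothetical. It also assumes that the write pointer pushed when decoding moves on to the next instruction is the pointer that marks the end of the previous instruction's entries at retirement.

```python
from collections import deque

RLOG_SIZE = 16  # assumed queue depth (one slot stays unused to tell full from empty)


class RegisterLog:
    def __init__(self, registers):
        self.registers = registers          # list of GPR values, modified in place
        self.entries = [None] * RLOG_SIZE   # circular RLOG queue of (reg, delta)
        self.read_ptr = 0
        self.write_ptr = 0
        self.base_queue = deque()           # RLOG base queue of write-pointer snapshots

    def is_empty(self):
        return self.read_ptr == self.write_ptr

    def begin_next_instruction(self):
        # Decode moves to the next instruction: record where its entries start.
        self.base_queue.append(self.write_ptr)

    def change_register(self, reg, delta):
        # Operand pre-processing changes a GPR: apply the change and log it.
        next_ptr = (self.write_ptr + 1) % RLOG_SIZE
        if next_ptr == self.read_ptr:
            raise OverflowError("RLOG queue full; decode would have to stall")
        self.registers[reg] += delta
        self.entries[self.write_ptr] = (reg, delta)
        self.write_ptr = next_ptr

    def retire(self):
        # Discard the retired instruction's entries by jumping the read pointer
        # to the oldest pointer in the base queue, which is removed.
        self.read_ptr = self.base_queue.popleft()

    def unwind(self):
        # Exception: undo every logged change that has not been retired.
        while not self.is_empty():
            reg, delta = self.entries[self.read_ptr]
            self.registers[reg] -= delta
            self.read_ptr = (self.read_ptr + 1) % RLOG_SIZE
        # Reset both queues.
        self.read_ptr = self.write_ptr = 0
        self.base_queue.clear()
```

In this model, an instruction that has retired keeps its register changes, while `unwind` rolls back only the changes made on behalf of instructions still in flight.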

    6. Pipelined digital CPU with deadlock resolution
    Granted invention patent (status: expired)

    Publication No.: US5006980A

    Publication date: 1991-04-09

    Application No.: US222008

    Filing date: 1988-07-20

    IPC classification: G06F9/28 G06F9/38

    Abstract: A pipelined CPU employs separate microinstruction pipelines for the execution unit and memory management unit. Deadlocks can occur in a pipelined CPU when there is a data dependency in two consecutive instructions. The later instruction may stall the pipeline if operands fetched by an earlier instruction are needed, but the earlier instruction is not producing the memory request for the operands because the pipeline is stalled; this results in a deadlock. Using separate micro-pipelines, the earlier instruction is advanced independently of the rest of the pipeline, in the case of a deadlock, so that the operands for the later instruction are provided and the deadlock is broken.

    7. Operand specifier processing by grouping similar specifier types together and providing a general routine for each
    Granted invention patent (status: expired)

    Publication No.: US5500947A

    Publication date: 1996-03-19

    Application No.: US269991

    Filing date: 1994-07-01

    CPC classification: G06F9/34 G06F9/22 G06F9/3016

    Abstract: A method of specifying the operands for a microcoded CPU employs a combination of a set of microinstruction routines for generic operand modes, along with hardware primitives for selecting various specific types of operand treatment. Decoding of a machine-level instruction produces an entry point for the microstore, selecting one of the set of generic operand modes. Also, decoding of the instruction produces control bits that are used directly to select the specific operand type or used by the hardware primitives. In this way, branching is avoided in the microinstruction sequences used for operand specifying, yet the amount of microcode needed is kept to a minimum.

    8. Register conflict scoreboard in pipelined computer using pipelined reference counts
    Granted invention patent (status: expired)

    Publication No.: US5488730A

    Publication date: 1996-01-30

    Application No.: US934351

    Filing date: 1992-08-21

    IPC classification: F02B75/02 G06F9/38

    CPC classification: G06F9/3848 F02B2075/025

    Abstract: A data dependency scoreboard for a pipelined digital computer includes a source counter and a destination counter for each general purpose register (GPR). The source counter for each GPR is incremented each time that a specifier is decoded that specifies the use of the source counter's GPR as a source operand. The source counter is decremented each time that an execution unit reads a source operand from the source counter's GPR. The destination counter is incremented each time that a specifier is decoded that specifies the use of the counter's GPR as a destination operand. The destination counter is decremented each time that the execution unit writes to the destination counter's GPR. A data dependency conflict causing a complex specifier unit to stall occurs when operand processing requires a write to a GPR that has a source counter value greater than zero, and when operand processing requires a read of a GPR that has a destination counter value greater than zero. Source and destination counts from the data dependency scoreboard for a GPR referenced by a complex specifier being processed, for example, are pipelined through down counters in the complex specifier unit, and the counts are updated in the complex specifier unit as the execution unit reads source operands from the GPR and writes to the GPR.
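
The counting rules in this abstract map onto a small amount of state per register. The following Python sketch models only the per-GPR source and destination counters and the stall condition. The register count and all class and method names are assumptions made for illustration, and the pipelined down counters in the complex specifier unit are left out.

```python
NUM_GPRS = 16  # assumed number of general purpose registers


class Scoreboard:
    def __init__(self, num_gprs=NUM_GPRS):
        self.source_count = [0] * num_gprs  # decoded, not yet read by execution
        self.dest_count = [0] * num_gprs    # decoded, not yet written by execution

    def decode_source(self, gpr):
        # A decoded specifier names this GPR as a source operand.
        self.source_count[gpr] += 1

    def decode_destination(self, gpr):
        # A decoded specifier names this GPR as a destination operand.
        self.dest_count[gpr] += 1

    def execute_read(self, gpr):
        # The execution unit has read the source operand from this GPR.
        if self.source_count[gpr] == 0:
            raise ValueError("no outstanding source reference")
        self.source_count[gpr] -= 1

    def execute_write(self, gpr):
        # The execution unit has written its result to this GPR.
        if self.dest_count[gpr] == 0:
            raise ValueError("no outstanding destination reference")
        self.dest_count[gpr] -= 1

    def must_stall(self, reads=(), writes=()):
        # Operand processing conflicts if it would write a GPR that still has
        # pending source reads, or read a GPR that still has pending writes.
        return (any(self.source_count[g] > 0 for g in writes)
                or any(self.dest_count[g] > 0 for g in reads))
```

For example, after `decode_destination(3)` a call to `must_stall(reads=[3])` reports a conflict until `execute_write(3)` brings the destination count back to zero.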
