Reservation stations to increase instruction level parallelism
    1.
    Granted invention patent
    Status: in force

    Publication No.: US06742111B2

    Publication date: 2004-05-25

    Application No.: US09144302

    Filing date: 1998-08-31

    Applicant: Naresh H. Soni

    Inventor: Naresh H. Soni

    IPC classification: G06F906

    Abstract: A data processing system is provided with distributed reservation stations that store basic blocks of code in the form of microprocessor instructions. The invention distributes basic blocks of code to the various distributed reservation stations. Because each distributed reservation station has fewer entries, the lookup time required to find a particular instruction is much shorter than in a centralized reservation station. Additional instruction-level parallelism is achieved by maintaining single basic blocks of code in the distributed reservation stations. With distributed reservation stations, an independent scheduler can be used for each station. When an instruction is ready for execution, the scheduler removes it from the distributed reservation station and queues it for immediate execution at the particular execution unit. Multiple independent schedulers provide greater efficiency than a single scheduler, which must contend with approximately 20-24 instructions that have increased dependency on one another.
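    The scheduling idea in the abstract can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the patented design: the names `Instr`, `Station`, and `run` are hypothetical, each station holds exactly one basic block, each execution unit completes one instruction per step, and only read-after-write dependencies are modeled.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    dest: str      # register written by the instruction
    srcs: tuple    # registers that must be ready before issue


@dataclass
class Station:
    """One distributed reservation station holding a single basic block."""
    entries: list = field(default_factory=list)
    issue_queue: deque = field(default_factory=deque)

    def schedule(self, ready_regs):
        """Independent scheduler: queue every entry whose sources are ready."""
        waiting = []
        for ins in self.entries:
            if all(src in ready_regs for src in ins.srcs):
                self.issue_queue.append(ins)
            else:
                waiting.append(ins)
        self.entries = waiting


def run(stations, ready_regs):
    """Step all stations until empty; return the instructions issued per step."""
    ready = set(ready_regs)
    trace = []
    while any(st.entries or st.issue_queue for st in stations):
        # Every station scans only its own small set of entries.
        for st in stations:
            st.schedule(ready)
        # Assumption: one execution unit per station, one instruction per step.
        issued = [st.issue_queue.popleft() for st in stations if st.issue_queue]
        if not issued:
            raise RuntimeError("no instruction can issue: unresolved dependency")
        # Results become visible to all stations at the end of the step.
        for ins in issued:
            ready.add(ins.dest)
        trace.append([ins.name for ins in issued])
    return trace
```

    With two independent two-instruction blocks placed in separate stations, `run` issues one instruction from each station per step, so the two dependency chains advance in parallel instead of being scanned together in one central pool.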

    System and method of saving and restoring registers in a data processing system
    2.
    Granted invention patent
    Status: expired

    Publication No.: US06671762B1

    Publication date: 2003-12-30

    Application No.: US08999298

    Filing date: 1997-12-29

    IPC classification: G06F1332

    CPC classification: G06F9/462

    Abstract: A system and method are provided to reduce the latency associated with saving and restoring the state of the floating point registers in a microprocessor when switching tasks between floating point and MMX operations, or between tasks within the same context. The invention maintains a secondary register file along with the primary floating point register file in the CPU. The primary register file keeps the state of the floating point task “as is” when a task switch to MMX, or to another context, occurs. The address of the area where the FPU state is saved is maintained in a save area address register. The secondary register file is then used by the other context to store intermediate results of executed instructions. In the majority of cases when a context switch back to floating point operations occurs, the previous state is restored from the primary register file without incurring the latency of retrieving the instructions and data from the memory subsystem. In addition to the secondary register file, a snooping mechanism uses the address of the state save area to determine whether the state save area was modified. If it was modified, the floating point state must be restored from the memory subsystem in a conventional manner. However, the floating point save area will seldom be modified, and the penalty for maintaining the floating point state in the CPU is negligible. Further, the invention allows the microprocessor to operate in a manner compatible with current operating systems and application software.
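    The fast-path and slow-path restore described in the abstract can be sketched as follows. This is a hedged toy model, not the patented circuit: `FpuStateKeeper` and its methods are hypothetical names, memory is a plain dictionary, and the save area is assumed to occupy a single address so that snooping reduces to one comparison.

```python
class FpuStateKeeper:
    """Toy model: FP state stays on chip; memory is read only after a snoop hit."""

    def __init__(self, num_regs=8):
        self.primary = [0.0] * num_regs   # primary floating point register file
        self.secondary = [0] * num_regs   # secondary file used by the other context
        self.save_area_addr = None        # save area address register
        self.save_area_modified = False   # set by the snooping logic
        self.memory_restores = 0          # counts slow-path restores

    def switch_away(self, memory, save_addr):
        """Task switch to MMX or another context: save state, keep primary as is."""
        memory[save_addr] = list(self.primary)   # conventional save, for compatibility
        self.save_area_addr = save_addr
        self.save_area_modified = False

    def snooped_store(self, memory, addr, value):
        """Any store to memory; the snoop compares its address with the save area."""
        memory[addr] = value
        if addr == self.save_area_addr:
            self.save_area_modified = True

    def switch_back(self, memory):
        """Return to floating point: reload from memory only if the area changed."""
        if self.save_area_modified:
            self.primary = list(memory[self.save_area_addr])
            self.memory_restores += 1
        self.save_area_modified = False
        return self.primary
```

    In the common case no store touches the save area, so `switch_back` returns the untouched primary file and `memory_restores` stays at zero. Only a snooped write to the save area forces the conventional reload from memory.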

    Virtual condition codes
    3.
    Granted invention patent
    Status: in force

    Publication No.: US06684323B2

    Publication date: 2004-01-27

    Application No.: US09179783

    Filing date: 1998-10-27

    Applicant: Naresh H. Soni

    Inventor: Naresh H. Soni

    IPC classification: G06F906

    Abstract: The present invention utilizes a “virtual” condition code (VCC) that can control the instruction sequence in a microprocessor. The virtual condition code is stored in an internal, non-architected register that is not visible to the programmer but is used by various microprocessor instructions to determine when a branch is to be taken. For example, the virtual condition code can be used as a condition for branching out of a series of repetitive instructions. The VCC can eliminate a portion of the processing overhead incurred when determining whether a sequential number, such as a count value in a register associated with a repetitive instruction (e.g., a LOOP), is zero. In accordance with one aspect of the present invention, a LOOP instruction decrements a count value in a register (to maintain compatibility with the ISA). However, a corresponding branch instruction uses the virtual condition code, rather than checking the contents of the entire register, to determine whether or not to branch. In this manner, the present invention improves performance by minimizing the amount of hardware resources (i.e., compare logic) utilized while maintaining compatibility with the Intel architecture, since the programmer-visible condition code is not used. Because the programmer-visible condition codes are left unchanged, the software is not forced to save and restore the register contents during each iteration.
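    A small Python sketch can make the LOOP behavior concrete. It is an illustration under assumptions, not the patented hardware: `Core`, `loop_decrement`, `branch_taken`, and `run_loop` are hypothetical names, the count register is called `ecx` after the x86 count register, and the equality test in the model stands in for a zero indication that hardware would presumably produce as a by-product of the decrement.

```python
class Core:
    """Toy core with one architected flag and one hidden condition code."""

    def __init__(self):
        self.ecx = 0        # architected count register
        self.zf = False     # programmer-visible zero flag, never written by LOOP
        self.vcc = False    # virtual condition code, hidden from the programmer

    def loop_decrement(self):
        """Decrement the count and record 'reached zero' in the hidden VCC."""
        self.ecx = (self.ecx - 1) & 0xFFFFFFFF
        self.vcc = self.ecx == 0

    def branch_taken(self):
        """Branch decision reads only the single VCC bit, not the whole register."""
        return not self.vcc


def run_loop(core, count, body):
    """Execute body as a LOOP-style construct; count is assumed to be >= 1."""
    core.ecx = count
    iterations = 0
    while True:
        body(core)
        iterations += 1
        core.loop_decrement()
        if not core.branch_taken():
            break
    return iterations
```

    Calling `run_loop(core, 3, body)` runs the body three times and leaves `core.zf` exactly as the body left it, which is the property the abstract relies on: the loop never disturbs the programmer-visible condition codes.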
