PROVIDING LATE PHYSICAL REGISTER ALLOCATION AND EARLY PHYSICAL REGISTER RELEASE IN OUT-OF-ORDER PROCESSOR (OOP)-BASED DEVICES IMPLEMENTING A CHECKPOINT-BASED ARCHITECTURE

    Publication No.: US20200097296A1

    Publication date: 2020-03-26

    Application No.: US16138011

    Application date: 2018-09-21

    Abstract: Providing late physical register allocation and early physical register release in out-of-order processor (OOP)-based devices implementing a checkpoint-based architecture is provided. In this regard, an OOP-based device provides a register management circuit that is configured to employ a combination of the checkpoint approach and the virtual register approach. The register management circuit includes a most recent table (MRT) for tracking mappings of logical register numbers (LRNs) to physical register numbers (PRNs), a physical register file (PRF) storing information for physical registers, a virtual register file (VRF) storing data for virtual registers, and a checkpoint queue for tracking active checkpoints (each of which is a snapshot of the MRT at a given time). The register management circuit applies checkpoint selection criteria for balancing the number of checkpoints, and implements late physical register allocation using virtual registers to provide an effectively larger physical register file and checkpoint-based early release of physical registers.
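
A minimal sketch of two ideas named in the abstract above: a checkpoint as a snapshot of the MRT, and late allocation, where a destination receives only a virtual register at rename and a physical register when its value is produced. The class and method names are hypothetical, and checkpoint selection criteria and early release are left out; this is an illustration under those assumptions, not the patented circuit.

```python
class RegisterManager:
    """Toy model (hypothetical names): the MRT maps a logical register
    to a virtual register number; physical registers are bound later."""

    def __init__(self, num_physical):
        self.free_prns = list(range(num_physical))  # free physical registers
        self.mrt = {}          # logical register -> virtual register number
        self.v_to_p = {}       # virtual register -> physical register, once bound
        self.next_vrn = 0
        self.checkpoints = []  # checkpoint queue: snapshots of the MRT

    def rename_dest(self, lrn):
        """At rename, hand out a virtual register only (no PRN consumed yet)."""
        vrn = self.next_vrn
        self.next_vrn += 1
        self.mrt[lrn] = vrn
        return vrn

    def write_back(self, vrn):
        """Late allocation: bind a physical register when the value exists.
        Raises IndexError if no physical register is free (not handled here)."""
        prn = self.free_prns.pop(0)
        self.v_to_p[vrn] = prn
        return prn

    def take_checkpoint(self):
        """Record a snapshot of the MRT at this point in time."""
        self.checkpoints.append(dict(self.mrt))

    def rollback(self):
        """Restore the MRT from the most recent checkpoint."""
        self.mrt = dict(self.checkpoints[-1])
```

In this model, renaming never consumes a physical register, so the free list shrinks only at `write_back`; that is the sense in which late allocation makes the physical register file behave as if it were larger.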

    STORING NARROW PRODUCED VALUES FOR INSTRUCTION OPERANDS DIRECTLY IN A REGISTER MAP IN AN OUT-OF-ORDER PROCESSOR
    4.
    Patent application (status: under examination, published)

    Publication No.: US20170046154A1

    Publication date: 2017-02-16

    Application No.: US14860032

    Application date: 2015-09-21

    CPC classification number: G06F9/30112 G06F9/3838 G06F9/384 G06F9/3857

    Abstract: Storing narrow produced values for instruction operands directly in a register map in an out-of-order processor (OoP) is provided. An OoP is provided that includes an instruction processing system. The instruction processing system includes a number of instruction processing stages configured to pipeline the processing and execution of instructions according to a dataflow execution. The instruction processing system also includes a register map table (RMT) configured to store address pointers mapping logical registers to physical registers in a physical register file (PRF) for storing produced data for use by consumer instructions without overwriting logical registers for later executed, out-of-order instructions. In certain aspects, the instruction processing system is configured to write back (i.e., store) narrow values produced by executed instructions directly into the RMT, as opposed to writing the narrow produced values into the PRF in a write back stage.
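
The write-back choice described above can be sketched as follows. This is a simplified model, not the patented design: the 8-bit threshold `NARROW_BITS`, the tagged-tuple entry format, and all names are assumptions, and the sketch assumes the RMT entry still belongs to the producing instruction when it writes back.

```python
NARROW_BITS = 8  # assumed width below which a value counts as "narrow"


class RegisterMap:
    """Toy RMT whose entries hold either a pointer into the PRF
    or, for narrow results, the produced value itself."""

    def __init__(self):
        self.rmt = {}  # logical register -> ("ptr", prn) or ("val", value)
        self.prf = {}  # physical register number -> value

    def rename(self, lreg, prn):
        """Map a logical register to a physical register at rename."""
        self.rmt[lreg] = ("ptr", prn)

    def write_back(self, lreg, value):
        """Store a narrow value in the RMT entry; otherwise write the PRF."""
        _, prn = self.rmt[lreg]
        if 0 <= value < (1 << NARROW_BITS):
            self.rmt[lreg] = ("val", value)  # PRF is not written
        else:
            self.prf[prn] = value

    def read(self, lreg):
        """Consumers read the value from the RMT if present, else from the PRF."""
        kind, payload = self.rmt[lreg]
        return payload if kind == "val" else self.prf[payload]
```

A consumer of a narrow value gets its operand from the map lookup alone, without a PRF access.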

    Fusing Immediate Value, Write-Based Instructions in Instruction Processing Circuits, and Related Processor Systems, Methods, and Computer-Readable Media
    6.
    Patent application (status: in force)

    Publication No.: US20140149722A1

    Publication date: 2014-05-29

    Application No.: US13686229

    Application date: 2012-11-27

    CPC classification number: G06F9/3017 G06F9/30167

    Abstract: Fusing immediate value, write-based instructions in instruction processing circuits, and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first instruction indicating an operation writing an immediate value to a register is detected by an instruction processing circuit. The circuit also detects at least one subsequent instruction indicating an operation that overwrites at least one first portion of the register while maintaining a value of a second portion of the register. The at least one subsequent instruction is converted (or replaced) with a fused instruction(s), which indicates an operation writing the at least one first portion and the second portion of the register. In this manner, conversion of multiple instructions for generating a constant into the fused instruction(s) removes the potential for a read-after-write hazard and associated consequences caused by dependencies between certain instructions, while reducing a number of clock cycles required to process the instructions.
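
As an illustration of the fusion described above, the sketch below models instructions as `(op, dest, imm)` tuples. The mnemonics are assumptions: `movw` writes a 16-bit immediate to the register, `movt` overwrites the upper 16 bits while keeping the lower 16, and `mov32` is a hypothetical fused instruction that writes both portions at once. The abstract does not name specific instructions.

```python
def fuse_immediates(instrs):
    """Replace a movt that follows a movw to the same register with a
    fused mov32 that no longer depends on the movw's result."""
    out = []
    pending = {}  # dest register -> low 16 bits written by the latest movw
    for op, rd, imm in instrs:
        if op == "movw":
            pending[rd] = imm & 0xFFFF
            out.append((op, rd, imm))
        elif op == "movt" and rd in pending:
            # Both portions are known constants, so emit one full write.
            full = ((imm & 0xFFFF) << 16) | pending[rd]
            out.append(("mov32", rd, full))
            del pending[rd]
        else:
            # Any other write to rd invalidates the tracked immediate.
            pending.pop(rd, None)
            out.append((op, rd, imm))
    return out
```

Because the fused instruction carries the whole constant, it does not need to read the value written by the first instruction, which is how the read-after-write dependency goes away in this model.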

    METHOD TO IMPROVE SPEED OF EXECUTING RETURN BRANCH INSTRUCTIONS IN A PROCESSOR
    8.
    Patent application (status: in force)

    Publication No.: US20140281394A1

    Publication date: 2014-09-18

    Application No.: US13833844

    Application date: 2013-03-15

    CPC classification number: G06F9/30058 G06F9/30054 G06F9/3806

    Abstract: An apparatus and method for executing call branch and return branch instructions in a processor by utilizing a link register stack. The processor includes a branch counter that is initialized to zero, and is set to zero each time the processor decodes a link register manipulating instruction other than a call branch instruction. The branch counter is incremented by one each time a call branch instruction is decoded and an address is pushed onto the link register stack. In response to decoding a return branch instruction and provided the branch counter is not zero, a target address for the decoded return branch instruction is popped off the link register stack, the branch counter is decremented, and there is no need to check the target address for correctness.
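
The counter rules in the abstract above can be sketched as a small model. Names are hypothetical. The abstract does not say what happens on a return when the counter is zero; the sketch assumes the popped address is then only a prediction that must still be checked.

```python
class LinkStack:
    """Toy link register stack with a branch counter."""

    def __init__(self):
        self.stack = []
        self.counter = 0  # initialized to zero

    def decode_call(self, return_addr):
        """Call branch: push the return address and increment the counter."""
        self.stack.append(return_addr)
        self.counter += 1

    def decode_link_write(self):
        """Link-register-manipulating instruction other than a call."""
        self.counter = 0

    def decode_return(self):
        """Return (target, needs_check) for a return branch."""
        if self.counter != 0:
            self.counter -= 1
            return self.stack.pop(), False  # target needs no correctness check
        # Assumption: with a zero counter the target must be verified.
        target = self.stack.pop() if self.stack else None
        return target, True
```

The counter effectively counts how many entries at the top of the stack were pushed by calls with no intervening link register modification, which is why those entries can be used without a check.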

    Fusing immediate value, write-based instructions in instruction processing circuits, and related processor systems, methods, and computer-readable media
    9.
    Granted patent (status: in force)

    Publication No.: US09477476B2

    Publication date: 2016-10-25

    Application No.: US13686229

    Application date: 2012-11-27

    CPC classification number: G06F9/3017 G06F9/30167

    Abstract: Fusing immediate value, write-based instructions in instruction processing circuits, and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first instruction indicating an operation writing an immediate value to a register is detected by an instruction processing circuit. The circuit also detects at least one subsequent instruction indicating an operation that overwrites at least one first portion of the register while maintaining a value of a second portion of the register. The at least one subsequent instruction is converted (or replaced) with a fused instruction(s), which indicates an operation writing the at least one first portion and the second portion of the register. In this manner, conversion of multiple instructions for generating a constant into the fused instruction(s) removes the potential for a read-after-write hazard and associated consequences caused by dependencies between certain instructions, while reducing a number of clock cycles required to process the instructions.

    OPTIMIZING PERFORMANCE FOR CONTEXT-DEPENDENT INSTRUCTIONS
    10.
    Patent application (status: in force)

    Publication No.: US20140281405A1

    Publication date: 2014-09-18

    Application No.: US13841576

    Application date: 2013-03-15

    CPC classification number: G06F9/30098 G06F9/30189 G06F9/3842 G06F9/3863

    Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
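
The queue check described above can be sketched as follows. All names are hypothetical, and the dependency on a particular write instruction is modeled as an explicit identifier stored in each entry, which is an assumption about how the dependency is tracked.

```python
class ContextQueue:
    """Toy queue of instructions processed under a speculated register-field value."""

    def __init__(self):
        # Each entry: (instruction id, id of the write it depends on,
        #              register-field value seen when it was processed)
        self.entries = []

    def record(self, instr_id, depends_on, field_value_seen):
        """Store the field value in effect when the instruction was processed."""
        self.entries.append((instr_id, depends_on, field_value_seen))

    def on_write_executed(self, write_id, written_value):
        """Return dependent instructions whose stored value mismatches the
        written value; a non-empty result means the pipeline must be flushed."""
        return [
            iid
            for iid, dep, seen in self.entries
            if dep == write_id and seen != written_value
        ]
```

For example, if `i1` was processed while the field held 0 and the write it depends on later stores 1, `on_write_executed` reports `i1`, and the processor would flush and restart so that `i1` executes under the correct value.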
