OPTIMIZING PERFORMANCE FOR CONTEXT-DEPENDENT INSTRUCTIONS
    1.
    Patent application
    OPTIMIZING PERFORMANCE FOR CONTEXT-DEPENDENT INSTRUCTIONS (status: in force)

    Publication No.: US20140281405A1

    Publication date: 2014-09-18

    Application No.: US13841576

    Filing date: 2013-03-15

    CPC classification number: G06F9/30098 G06F9/30189 G06F9/3842 G06F9/3863

    Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
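
The check described in this abstract can be sketched as a small software model. This is only an illustration under stated assumptions, not the patented hardware: the names `QueueEntry`, `ContextQueue`, `record`, `must_flush`, and the use of a numeric `write_id` to express "depends upon the executed write instruction" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueEntry:
    instr: str        # context-dependent instruction (hypothetical label)
    seen_value: int   # register-field value when the instruction was processed
    write_id: int     # hypothetical id of the write instruction it depends on


class ContextQueue:
    """Toy model of the queue of context-dependent instructions."""

    def __init__(self):
        self.entries = []

    def record(self, instr, seen_value, write_id):
        # Store the instruction together with the register-field value
        # that was current when it was processed.
        self.entries.append(QueueEntry(instr, seen_value, write_id))

    def must_flush(self, write_id, written_value):
        # After the write executes, search for dependent entries whose
        # stored value differs from the value actually written.
        return any(
            e.write_id == write_id and e.seen_value != written_value
            for e in self.entries
        )
```

In this model, `must_flush` returning `True` stands for the case where the processor would flush the pipeline and restart; a matching value or an unrelated write leaves the pipeline alone.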


    PROVIDING EARLY PIPELINE OPTIMIZATION OF CONDITIONAL INSTRUCTIONS IN PROCESSOR-BASED SYSTEMS

    Publication No.: US20190294443A1

    Publication date: 2019-09-26

    Application No.: US15926429

    Filing date: 2018-03-20

    Abstract: Providing early pipeline optimization of conditional instructions in processor-based systems is disclosed. In one aspect, an instruction pipeline of a processor-based system detects a mispredicted branch (i.e., following a misprediction of a condition associated with a speculatively executed conditional branch instruction), and records a current state of one or more condition flags as a condition flags snapshot. After a pipeline flush is initiated and a corrected fetch path is restarted, an instruction decode stage of the instruction pipeline uses the condition flags snapshot to apply optimizations to conditional instructions detected within the corrected fetch path. According to some aspects, the condition flags snapshot is subsequently invalidated upon encountering a condition-flag-writing instruction within the corrected fetch path. In this manner, the condition flags snapshot enables non-speculative (with respect to the corrected fetch path) resolution of conditional instructions earlier within the instruction pipeline, thus conserving system resources and improving processor performance.
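
A rough software model of the snapshot mechanism in this abstract is given below. It is a sketch under assumptions: the class and method names are hypothetical, only a single zero flag `Z` and the two condition codes `EQ` and `NE` are modeled, and an instruction that is both conditional and flag-writing is assumed to be resolved against the snapshot before invalidating it.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping: the value of Z under which each condition holds.
COND_ON_Z = {"EQ": True, "NE": False}


@dataclass
class Instr:
    name: str
    cond: Optional[str] = None    # None means unconditional
    writes_flags: bool = False    # True for condition-flag-writing instructions


class DecodeStage:
    def __init__(self):
        self.snapshot = None      # None means no valid snapshot

    def on_mispredict(self, flags):
        # Record the current condition flags when a misprediction is detected.
        self.snapshot = dict(flags)

    def decode(self, instr):
        """Return True/False if the condition resolves early, else None."""
        outcome = None
        if instr.cond in COND_ON_Z and self.snapshot is not None:
            outcome = self.snapshot["Z"] == COND_ON_Z[instr.cond]
        if instr.writes_flags:
            # A flag-writing instruction invalidates the snapshot.
            self.snapshot = None
        return outcome
```

Returning `None` stands for "cannot be resolved at decode", which is the case before any misprediction and again after a flag-writing instruction appears in the corrected fetch path.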

    Method to improve speed of executing return branch instructions in a processor
    3.
    Granted patent
    Method to improve speed of executing return branch instructions in a processor (status: in force)

    Publication No.: US09411590B2

    Publication date: 2016-08-09

    Application No.: US13833844

    Filing date: 2013-03-15

    CPC classification number: G06F9/30058 G06F9/30054 G06F9/3806

    Abstract: An apparatus and method for executing call branch and return branch instructions in a processor by utilizing a link register stack. The processor includes a branch counter that is initialized to zero, and is set to zero each time the processor decodes a link register manipulating instruction other than a call branch instruction. The branch counter is incremented by one each time a call branch instruction is decoded and an address is pushed onto the link register stack. In response to decoding a return branch instruction and provided the branch counter is not zero, a target address for the decoded return branch instruction is popped off the link register stack, the branch counter is decremented, and there is no need to check the target address for correctness.
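
The counter rules in this abstract can be sketched as follows. The class and method names are hypothetical, and the behaviour when the counter is zero is an assumption: the abstract only says that no check is needed when the counter is non-zero, so this sketch still pops a prediction if one is available and marks it as needing verification.

```python
class LinkStackPredictor:
    """Toy model of a link register stack guarded by a branch counter."""

    def __init__(self):
        self.stack = []
        self.counter = 0          # initialized to zero

    def decode_call(self, return_address):
        # Call branch: push the address and increment the counter.
        self.stack.append(return_address)
        self.counter += 1

    def decode_link_write(self):
        # Link-register manipulating instruction other than a call branch.
        self.counter = 0

    def decode_return(self):
        """Return (target, needs_check) for a decoded return branch."""
        if self.counter != 0:
            self.counter -= 1
            return self.stack.pop(), False
        # Assumption: with a zero counter the target must be verified.
        target = self.stack.pop() if self.stack else None
        return target, True
```

Because the counter never exceeds the stack depth in this model, the pop in the non-zero branch cannot fail.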


    Establishing a branch target instruction cache (BTIC) entry for subroutine returns to reduce execution pipeline bubbles, and related systems, methods, and computer-readable media
    4.
    Granted patent
    Establishing a branch target instruction cache (BTIC) entry for subroutine returns to reduce execution pipeline bubbles, and related systems, methods, and computer-readable media (status: in force)

    Publication No.: US09317293B2

    Publication date: 2016-04-19

    Application No.: US13792335

    Filing date: 2013-03-11

    CPC classification number: G06F9/3808 G06F9/30054

    Abstract: Establishing a branch target instruction cache (BTIC) entry for subroutine returns to reduce pipeline bubbles, and related systems, methods, and computer-readable media are disclosed. In one embodiment, a method of establishing a BTIC entry includes detecting a subroutine call in an execution pipeline. In response, at least one instruction fetched sequential to the subroutine call is written as a branch target instruction in a BTIC entry for a subroutine return. A next instruction fetch address is calculated, and is written into a next instruction fetch address field in the BTIC entry. In this manner, the BTIC may provide correct branch target instruction and next instruction fetch address data for the subroutine return, even if the subroutine return is encountered for the first time or the subroutine is called from different calling locations.
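
The construction of a BTIC entry at the time of the call can be illustrated with a short sketch. Everything here is assumed for illustration: a fixed 4-byte instruction size, instruction memory modeled as a dictionary from address to instruction text, and the names `BTICEntry` and `entry_for_return`. How the entry is later matched to the return is not modeled.

```python
from dataclasses import dataclass

INSTR_SIZE = 4  # assumption: fixed-width 4-byte instructions


@dataclass
class BTICEntry:
    target_instrs: tuple     # instruction(s) fetched sequential to the call
    next_fetch_addr: int     # address to fetch after the cached instruction(s)


def entry_for_return(call_pc, imem, count=1):
    """Build the BTIC entry for the return that matches the call at call_pc."""
    first = call_pc + INSTR_SIZE
    instrs = tuple(imem[first + i * INSTR_SIZE] for i in range(count))
    return BTICEntry(instrs, first + count * INSTR_SIZE)
```

Since the entry is derived from the call site rather than from an earlier execution of the return, it is available the first time the return is encountered, and each calling location produces its own entry.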


    ELIMINATING REDUNDANT SYNCHRONIZATION BARRIERS IN INSTRUCTION PROCESSING CIRCUITS, AND RELATED PROCESSOR SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA
    6.
    Patent application
    ELIMINATING REDUNDANT SYNCHRONIZATION BARRIERS IN INSTRUCTION PROCESSING CIRCUITS, AND RELATED PROCESSOR SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA (status: under examination, published)

    Publication No.: US20140281429A1

    Publication date: 2014-09-18

    Application No.: US13829315

    Filing date: 2013-03-14

    CPC classification number: G06F9/30087

    Abstract: Embodiments disclosed herein include eliminating redundant synchronization barriers from execution pipelines in instruction processing circuits. Related processor systems, methods, and computer-readable media are also disclosed. By tracking the occurrence of synchronization events, unnecessary software synchronization operations may be identified and eliminated, thus improving performance of a central processing unit (CPU). In one embodiment, a method for eliminating redundant synchronization barriers in an instruction stream is provided. The method comprises determining whether a next instruction comprises a synchronization barrier of a type corresponding to a first synchronization event. The method also comprises eliminating the next instruction from the instruction stream, responsive to determining that the next instruction comprises a synchronization barrier of a type corresponding to the first synchronization event. In this manner, the average number of instructions executed during each CPU clock cycle may be increased by avoiding unnecessary synchronization operations.
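
The elimination step can be sketched as a filter over an instruction stream. This is a simplified model with several assumptions that the abstract does not state: the stream is a list of `(kind, arg)` pairs, synchronization events appear in the stream as `"sync_event"` items, any ordinary instruction ends the window in which a barrier is redundant, and the barrier type names used in the test are purely illustrative.

```python
def eliminate_redundant_barriers(stream):
    """Drop barriers whose type matches an already-tracked sync event."""
    satisfied = set()    # barrier types covered by a tracked sync event
    out = []
    for kind, arg in stream:
        if kind == "sync_event":
            # Track the event; it is not an instruction, so it is not emitted.
            satisfied.add(arg)
        elif kind == "barrier":
            if arg in satisfied:
                continue             # redundant barrier: eliminate it
            out.append((kind, arg))
        else:
            # Assumption: other instructions invalidate the tracked events.
            satisfied.clear()
            out.append((kind, arg))
    return out
```

A barrier is removed only when an event of the corresponding type is currently tracked; barriers of other types, and barriers that follow ordinary instructions, pass through unchanged.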


    ESTABLISHING A BRANCH TARGET INSTRUCTION CACHE (BTIC) ENTRY FOR SUBROUTINE RETURNS TO REDUCE EXECUTION PIPELINE BUBBLES, AND RELATED SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA
    7.
    Patent application
    ESTABLISHING A BRANCH TARGET INSTRUCTION CACHE (BTIC) ENTRY FOR SUBROUTINE RETURNS TO REDUCE EXECUTION PIPELINE BUBBLES, AND RELATED SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA (status: in force)

    Publication No.: US20140149726A1

    Publication date: 2014-05-29

    Application No.: US13792335

    Filing date: 2013-03-11

    CPC classification number: G06F9/3808 G06F9/30054

    Abstract: Establishing a branch target instruction cache (BTIC) entry for subroutine returns to reduce pipeline bubbles, and related systems, methods, and computer-readable media are disclosed. In one embodiment, a method of establishing a BTIC entry includes detecting a subroutine call in an execution pipeline. In response, at least one instruction fetched sequential to the subroutine call is written as a branch target instruction in a BTIC entry for a subroutine return. A next instruction fetch address is calculated, and is written into a next instruction fetch address field in the BTIC entry. In this manner, the BTIC may provide correct branch target instruction and next instruction fetch address data for the subroutine return, even if the subroutine return is encountered for the first time or the subroutine is called from different calling locations.


    METHOD TO IMPROVE SPEED OF EXECUTING RETURN BRANCH INSTRUCTIONS IN A PROCESSOR
    8.
    Patent application
    METHOD TO IMPROVE SPEED OF EXECUTING RETURN BRANCH INSTRUCTIONS IN A PROCESSOR (status: in force)

    Publication No.: US20140281394A1

    Publication date: 2014-09-18

    Application No.: US13833844

    Filing date: 2013-03-15

    CPC classification number: G06F9/30058 G06F9/30054 G06F9/3806

    Abstract: An apparatus and method for executing call branch and return branch instructions in a processor by utilizing a link register stack. The processor includes a branch counter that is initialized to zero, and is set to zero each time the processor decodes a link register manipulating instruction other than a call branch instruction. The branch counter is incremented by one each time a call branch instruction is decoded and an address is pushed onto the link register stack. In response to decoding a return branch instruction and provided the branch counter is not zero, a target address for the decoded return branch instruction is popped off the link register stack, the branch counter is decremented, and there is no need to check the target address for correctness.
