Qualifying Software Branch-Target Hints with Hardware-Based Predictions
    3.
    Type: Invention application
    Status: Under examination (published)

    Publication No.: US20140006752A1

    Publication date: 2014-01-02

    Application No.: US13534649

    Filing date: 2012-06-27

    IPC classification: G06F9/40

    Abstract: A processor architecture to qualify software target-branch hints with hardware-based predictions, the processor including a branch target address cache having entries, where an entry includes a tag field to store an instruction address, a target field to store a target address, and a state field to store a state value. Upon decoding an indirect branch instruction, the processor determines whether an entry in the branch target address cache has an instruction address that matches the address of the decoded indirect branch instruction; if there is a match, then depending upon the state value stored in the entry, the processor will use the stored target address as the predicted target address for the decoded indirect branch instruction, or will use a software-provided target address hint if available.
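
The lookup described in this abstract can be sketched in a few lines of Python. This is an illustrative model only, not the patented implementation: the two-valued `State` encoding, the names `BtacEntry` and `predict_target`, the behavior on a cache miss, and the fallback when no hint is available are all assumptions, since the abstract does not specify them.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    # Hypothetical encoding; the abstract only says "a state value".
    USE_HARDWARE = 0  # trust the target stored in the cache entry
    USE_HINT = 1      # prefer the software-provided hint


@dataclass
class BtacEntry:
    tag: int      # address of the indirect branch instruction
    target: int   # stored target address
    state: State  # qualifies whether the stored target or the hint is used


def predict_target(btac: Dict[int, BtacEntry],
                   branch_addr: int,
                   hint: Optional[int]) -> Optional[int]:
    """Return a predicted target for a decoded indirect branch, or None."""
    entry = btac.get(branch_addr)
    if entry is None:
        # Assumption: on a miss, fall back to the hint (may be None).
        return hint
    if entry.state is State.USE_HARDWARE:
        return entry.target
    # State says to use the hint "if available".
    # Assumption: with no hint, fall back to the stored target.
    return hint if hint is not None else entry.target
```

A dictionary keyed by instruction address stands in for the tag match; a real cache would be set-associative with partial tags.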

    Translation lookaside buffer (TLB) suppression for intra-page program counter relative or absolute address branch instructions
    6.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07406613B2

    Publication date: 2008-07-29

    Application No.: US11003772

    Filing date: 2004-12-02

    IPC classification: G06F1/26

    Abstract: In a pipelined processor, a pre-decoder in advance of an instruction cache calculates the branch target address (BTA) of PC-relative and absolute address branch instructions. The pre-decoder compares the BTA with the branch instruction address (BIA) to determine whether the target and instruction are in the same memory page. A branch target same page (BTSP) bit indicating this is written to the cache and associated with the instruction. When the branch is executed and evaluated as taken, a TLB access to check permission attributes for the BTA is suppressed if the BTA is in the same page as the BIA, as indicated by the BTSP bit. This reduces power consumption, as the TLB access is suppressed and the BTA/BIA comparison is performed only once, when the branch instruction is first fetched. Additionally, the pre-decoder removes the BTA/BIA comparison from the BTA generation and selection critical path.
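
The same-page check at the heart of this abstract reduces to comparing page numbers. The sketch below is a minimal model under stated assumptions: a 4 KiB page size (`PAGE_SHIFT = 12`) and the function names are illustrative and do not come from the patent.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages; the abstract names no page size


def btsp_bit(bia: int, bta: int, page_shift: int = PAGE_SHIFT) -> bool:
    """True when the branch target lies in the same page as the branch."""
    return (bia >> page_shift) == (bta >> page_shift)


def predecode_pc_relative(bia: int, offset: int) -> tuple:
    """Pre-decode step: compute the BTA once and derive the BTSP bit.

    Both values would be stored alongside the instruction in the cache.
    """
    bta = bia + offset
    return bta, btsp_bit(bia, bta)


def needs_tlb_lookup(taken: bool, btsp: bool) -> bool:
    """At execute time, only a taken branch leaving its page hits the TLB."""
    return taken and not btsp
```

For example, a branch at `0x1FF0` with offset `0x8` stays in its page, so the TLB access is skipped when it is taken; with offset `0x20` the target `0x2010` is in the next page and the lookup proceeds.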

    Predecode repair cache for instructions that cross an instruction cache line
    7.
    Type: Granted invention patent
    Status: In force

    Publication No.: US08898437B2

    Publication date: 2014-11-25

    Application No.: US11934108

    Filing date: 2007-11-02

    IPC classification: G06F9/30, G06F9/38

    Abstract: A predecode repair cache is described in a processor capable of fetching and executing variable length instructions having instructions of at least two lengths which may be mixed in a program. An instruction cache is operable to store in an instruction cache line instructions having at least a first length and a second length, the second length longer than the first length. A predecoder is operable to predecode instructions fetched from the instruction cache that have invalid predecode information to form repaired predecode information. A predecode repair cache is operable to store the repaired predecode information associated with instructions of the second length that span across two cache lines in the instruction cache. Methods for filling the predecode repair cache and for executing an instruction that spans across two cache lines are also described.
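
As a rough illustration of the idea, the sketch below models a repair cache that remembers repaired predecode information only for instructions that straddle a cache-line boundary. The 32-byte line size, the class and method names, and the `repair_fn` callback standing in for the predecoder are all assumptions made for this example.

```python
from typing import Callable, Dict

LINE_BYTES = 32  # assumption: instruction cache line size in bytes


def spans_two_lines(addr: int, length: int,
                    line_bytes: int = LINE_BYTES) -> bool:
    """True when the first and last byte of the instruction sit in different lines."""
    return addr // line_bytes != (addr + length - 1) // line_bytes


class PredecodeRepairCache:
    def __init__(self, repair_fn: Callable[[int], str]) -> None:
        self.repair_fn = repair_fn          # hypothetical predecoder hook
        self.entries: Dict[int, str] = {}   # instruction address -> repaired info
        self.repair_count = 0               # how often the predecoder ran

    def lookup_or_repair(self, addr: int, length: int) -> str:
        """Return repaired predecode info, reusing a stored repair if present."""
        if addr in self.entries:
            return self.entries[addr]
        info = self.repair_fn(addr)
        self.repair_count += 1
        # Only line-spanning instructions are kept in the repair cache.
        if spans_two_lines(addr, length):
            self.entries[addr] = info
        return info
```

With these assumptions, a 4-byte instruction at address 30 spans lines 0 and 1, so its repair is performed once and reused on later fetches.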

    Sliding-window, block-based branch target address cache
    8.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07827392B2

    Publication date: 2010-11-02

    Application No.: US11422186

    Filing date: 2006-06-05

    IPC classification: G06F9/00

    Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction having been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.
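
One way to picture the sliding window is that a fetch group may start at any instruction address, so overlapping groups get separate entries, each holding a single BTA. The sketch below models that reading; the 4-byte instruction size, the 4-instruction group, and the `BlockBtac` class are assumptions for illustration, not details taken from the patent.

```python
from typing import Dict, Optional

INSTR_BYTES = 4  # assumption: fixed-size instructions
GROUP_SIZE = 4   # assumption: instructions per fetch group


class BlockBtac:
    """Entries are tagged by the address of the first instruction in a fetch group."""

    def __init__(self) -> None:
        self.entries: Dict[int, int] = {}  # block start address -> single BTA

    def record_taken(self, block_start: int, branch_addr: int, bta: int) -> None:
        """Store the BTA of a taken branch found in the block starting at block_start."""
        block_end = block_start + GROUP_SIZE * INSTR_BYTES
        if not block_start <= branch_addr < block_end:
            raise ValueError("branch is not inside this fetch group")
        self.entries[block_start] = bta

    def lookup(self, fetch_addr: int) -> Optional[int]:
        """Return the stored BTA for a fetch group, or None on a miss."""
        return self.entries.get(fetch_addr)
```

Suppose taken branches sit at `0x108` and `0x10C`. A fetch group starting at `0x100` records the first branch, while a group starting at `0x10C` (for instance, after a jump into the middle of the earlier group) records the second, so both targets are cached with one BTA slot per entry.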

    Pre-decode error handling via branch correction
    10.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07415638B2

    Publication date: 2008-08-19

    Application No.: US10995858

    Filing date: 2004-11-22

    IPC classification: G06F11/00, G06F9/30

    Abstract: In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as "mispredicted not taken" with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.
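
The recovery sequence in this abstract (detect, invalidate, redirect to the instruction's own address, re-fetch) can be traced with a toy model. Everything here is a simplification under assumptions: the `Pipeline` class, string-valued instructions, and the equality check standing in for execute-stage error detection are hypothetical and chosen only to make the control flow visible.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pipeline:
    memory: Dict[int, str]                                 # address -> raw instruction
    icache: Dict[int, str] = field(default_factory=dict)  # address -> pre-decoded form

    def predecode(self, raw: str) -> str:
        return "predecoded:" + raw

    def fetch(self, addr: int) -> str:
        """On a cache miss, read memory, pre-decode, and fill the cache."""
        if addr not in self.icache:
            self.icache[addr] = self.predecode(self.memory[addr])
        return self.icache[addr]

    def execute(self, addr: int) -> List[str]:
        events: List[str] = []
        entry = self.fetch(addr)
        # Stand-in for the pipeline detecting a bad pre-decode at execute time.
        if entry != self.predecode(self.memory[addr]):
            events.append("predecode-error")
            del self.icache[addr]  # invalidate the cached copy
            # Treat as a "mispredicted not taken" branch whose target is the
            # faulting instruction's own address, forcing a precise re-fetch.
            redirect = addr
            events.append(f"redirect:{redirect:#x}")
            entry = self.fetch(redirect)
        events.append("execute:" + entry)
        return events
```

Reusing the branch-misprediction path means no separate recovery mechanism is modeled; the redirect simply restarts fetch at the same address with the stale cache entry gone.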
