Method and apparatus for prefetching non-sequential instruction addresses
    1.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07917731B2

    Publication date: 2011-03-29

    Application No.: US11461883

    Filing date: 2006-08-02

    IPC classification: G06F9/32

    CPC classification: G06F9/3804 G06F9/3806

    Abstract: A processor performs a prefetch operation on non-sequential instruction addresses. If a first instruction address misses in an instruction cache and accesses a higher-order memory as part of a fetch operation, and a branch instruction associated with the first instruction address or an address following the first instruction address is detected and predicted taken, a prefetch operation is performed using a predicted branch target address, during the higher-order memory access. If the predicted branch target address hits in the instruction cache during the prefetch operation, associated instructions are not retrieved, to conserve power. If the predicted branch target address misses in the instruction cache during the prefetch operation, a higher-order memory access may be launched, using the predicted branch instruction address. In either case, the first instruction address is re-loaded into the fetch stage pipeline to await the return of instructions from its higher-order memory access.
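
    Example (illustrative sketch, not taken from the patent): the control flow in the abstract can be modelled in a few lines of Python. The class `FetchModel`, the 32-byte line size, and the predictor keyed by fetch address are all assumptions made for illustration. The sketch also assumes that a missing prefetch requests the line holding the predicted target.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

LINE_BYTES = 32  # assumed cache-line size in bytes


def line_of(addr: int) -> int:
    """Return the line-aligned address of the cache line holding addr."""
    return addr & ~(LINE_BYTES - 1)


@dataclass
class FetchModel:
    """Toy fetch-stage model; every name here is an illustrative assumption."""
    icache: Set[int] = field(default_factory=set)            # resident line addresses
    predictor: Dict[int, int] = field(default_factory=dict)  # fetch addr -> predicted-taken target
    pending_fills: List[int] = field(default_factory=list)   # higher-order memory requests, in order
    reload_addr: Optional[int] = None                        # address waiting in the fetch stage

    def _request(self, addr: int) -> None:
        """Queue a higher-order memory access for the line holding addr."""
        line = line_of(addr)
        if line not in self.pending_fills:
            self.pending_fills.append(line)

    def fetch(self, addr: int) -> str:
        if line_of(addr) in self.icache:
            return "hit"
        # Demand miss: start the higher-order memory access.
        self._request(addr)
        target = self.predictor.get(addr)
        # Prefetch probe: a tag check only. On a hit nothing is read out;
        # on a miss a second higher-order access is queued.
        if target is not None and line_of(target) not in self.icache:
            self._request(target)
        # In either case the demand address is re-loaded to await its fill.
        self.reload_addr = addr
        return "miss"
```

    With a predicted target that is already cached, only the demand line is requested; with an uncached target, both the demand line and the target line are requested while the demand address waits.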

    Translation lookaside buffer (TLB) suppression for intra-page program counter relative or absolute address branch instructions
    3.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07406613B2

    Publication date: 2008-07-29

    Application No.: US11003772

    Filing date: 2004-12-02

    IPC classification: G06F1/26

    Abstract: In a pipelined processor, a pre-decoder in advance of an instruction cache calculates the branch target address (BTA) of PC-relative and absolute address branch instructions. The pre-decoder compares the BTA with the branch instruction address (BIA) to determine whether the target and instruction are in the same memory page. A branch target same page (BTSP) bit indicating this is written to the cache and associated with the instruction. When the branch is executed and evaluated as taken, a TLB access to check permission attributes for the BTA is suppressed if the BTA is in the same page as the BIA, as indicated by the BTSP bit. This reduces power consumption as the TLB access is suppressed and the BTA/BIA comparison is only performed once, when the branch instruction is first fetched. Additionally, the pre-decoder removes the BTA/BIA comparison from the BTA generation and selection critical path.
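
    Example (illustrative sketch, not taken from the patent): the BTSP bit can be pictured as one page-number comparison done at pre-decode time and reused at execution. The 4 KiB page size, the dictionary used as a cache entry, and the `Tlb` counter class are assumptions made for illustration.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def same_page(bia: int, bta: int) -> bool:
    """True when both addresses share a page number."""
    return (bia >> PAGE_SHIFT) == (bta >> PAGE_SHIFT)


def predecode_branch(bia: int, offset: int) -> dict:
    """Pre-decode a PC-relative branch once, before it enters the I-cache."""
    bta = bia + offset
    return {"bia": bia, "bta": bta, "btsp": same_page(bia, bta)}


class Tlb:
    """Stand-in TLB that only counts permission lookups."""

    def __init__(self) -> None:
        self.lookups = 0

    def check(self, addr: int) -> bool:
        self.lookups += 1
        return True  # permissions always granted in this toy model


def execute_taken_branch(entry: dict, tlb: Tlb) -> int:
    """Redirect to the BTA, skipping the TLB when the BTSP bit is set."""
    if not entry["btsp"]:
        tlb.check(entry["bta"])
    return entry["bta"]
```

    A branch at 0x1F00 with offset 0x80 stays in its page and costs no TLB lookup, while the same branch with offset 0x200 crosses into the next page and triggers one.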

    Debug circuit comparing processor instruction set operating mode
    4.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US08352713B2

    Publication date: 2013-01-08

    Application No.: US11463379

    Filing date: 2006-08-09

    IPC classification: G06F9/48

    CPC classification: G06F11/3648

    Abstract: A processor is operative to execute two or more instruction sets, each in a different instruction set operating mode. As each instruction is executed, a debug circuit compares the current instruction set operating mode to a target instruction set operating mode sent by a programmer, and outputs an alert or indication if they match. The alert or indication may additionally be dependent upon the instruction address falling within a predetermined target address range. The alert or indication may comprise a breakpoint signal that halts execution and/or it is output as an external signal of the processor. The instruction address at which the processor detects a match in the instruction set operating modes may additionally be output. Additionally or alternatively, the alert or indication may comprise starting or stopping a trace operation, causing an exception, or any other known debugger function.
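
    Example (illustrative sketch, not taken from the patent): the match condition reduces to a mode comparison optionally qualified by an address range. The `IsaMode` names, the inclusive range, and the function `debug_match` are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional, Tuple


class IsaMode(Enum):
    """Two hypothetical instruction set operating modes."""
    MODE_A = 0
    MODE_B = 1


def debug_match(current: IsaMode,
                target: IsaMode,
                addr: int,
                addr_range: Optional[Tuple[int, int]] = None) -> bool:
    """Return True when the debug circuit would raise its alert.

    The alert needs the modes to match and, if a range is given,
    the instruction address to fall inside it (bounds inclusive).
    """
    if current is not target:
        return False
    if addr_range is None:
        return True
    low, high = addr_range
    return low <= addr <= high
```

    A caller could treat a True result as the breakpoint signal and record `addr` as the instruction address at which the match was detected.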

    Sliding-window, block-based branch target address cache
    6.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07827392B2

    Publication date: 2010-11-02

    Application No.: US11422186

    Filing date: 2006-06-05

    IPC classification: G06F9/00

    Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction having been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.
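
    Example (illustrative sketch, not taken from the patent): one reading of the abstract is that each BTAC entry is tagged with the address of the first instruction of a fetch group, so fetch groups that start at different addresses yield separate entries even when they overlap. The class `SlidingWindowBtac`, the four-instruction fetch group, and the 4-byte instruction size are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

FETCH_GROUP = 4   # instructions per fetch group (assumed)
INSTR_BYTES = 4   # bytes per instruction (assumed)


class SlidingWindowBtac:
    """Toy BTAC keyed by fetch-group start address, one BTA per entry."""

    def __init__(self) -> None:
        # tag -> (index of the taken branch within the group, BTA)
        self.entries: Dict[int, Tuple[int, int]] = {}

    def update(self, group_start: int, branch_addr: int, bta: int) -> None:
        """Record a branch evaluated taken inside the given fetch group."""
        index = (branch_addr - group_start) // INSTR_BYTES
        if not 0 <= index < FETCH_GROUP:
            raise ValueError("branch lies outside the fetch group")
        self.entries[group_start] = (index, bta)

    def lookup(self, group_start: int) -> Optional[Tuple[int, int]]:
        """Return (branch index, BTA) for this fetch group, or None."""
        return self.entries.get(group_start)
```

    Suppose taken branches sit at 0x104 and 0x10C. A fetch group starting at 0x100 gets an entry for the first, and a fetch group starting at 0x108 (reached by some other path) gets an entry for the second, so neither entry needs room for two targets.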

    Pre-decode error handling via branch correction
    8.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07415638B2

    Publication date: 2008-08-19

    Application No.: US10995858

    Filing date: 2004-11-22

    IPC classification: G06F11/00 G06F9/30

    Abstract: In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as “mispredicted not taken” with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.
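
    Example (illustrative sketch, not taken from the patent): the recovery path can be modelled as "invalidate, then redirect fetch to the same address". The `Core` class, the top-bit classification rule in `predecode`, and the use of that same rule as the execute-stage check are assumptions made for illustration.

```python
from typing import Dict, Tuple


def predecode(word: int) -> str:
    """Toy pre-decoder: classify a 32-bit word by its top bit (assumed rule)."""
    return "branch" if word >> 31 else "alu"


class Core:
    """Toy core whose I-cache stores (word, pre-decode kind) per address."""

    def __init__(self, memory: Dict[int, int]) -> None:
        self.memory = memory
        self.icache: Dict[int, Tuple[int, str]] = {}
        self.refetches = 0

    def fetch(self, addr: int) -> Tuple[int, str]:
        if addr not in self.icache:
            word = self.memory[addr]
            # Pre-decode on the fill path, before the word enters the cache.
            self.icache[addr] = (word, predecode(word))
        return self.icache[addr]

    def step(self, pc: int) -> int:
        """Execute one instruction and return the next fetch address."""
        word, kind = self.fetch(pc)
        if kind != predecode(word):
            # Bad pre-decode bits found at execute: invalidate the cached
            # copy and behave like a mispredicted-not-taken branch whose
            # target is the instruction's own address.
            del self.icache[pc]
            self.refetches += 1
            return pc
        return pc + 4
```

    Stepping an address whose cached pre-decode bits are wrong returns the same address once; the second step refills the cache from memory with correct bits and advances normally.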

    Method and apparatus for efficiently accessing first and second branch history tables to predict branch instructions
    10.
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07278012B2

    Publication date: 2007-10-02

    Application No.: US11144206

    Filing date: 2005-06-02

    IPC classification: G06F9/42

    CPC classification: G06F9/3844

    Abstract: A microprocessor includes two branch history tables, and is configured to use a first one of the branch history tables for predicting branch instructions that are hits in a branch target cache, and to use a second one of the branch history tables for predicting branch instructions that are misses in the branch target cache. As such, the first branch history table is configured to have an access speed matched to that of the branch target cache, so that its prediction information is timely available relative to branch target cache hit detection, which may happen early in the microprocessor's instruction pipeline. The second branch history table thus need only be as fast as is required for providing timely prediction information in association with recognizing branch target cache misses as branch instructions, such as at the instruction decode stage(s) of the instruction pipeline.
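
    Example (illustrative sketch, not taken from the patent): the table selection can be expressed as a lookup that depends on whether the branch hits in the branch target cache. The class `TwoTablePredictor`, the 2-bit counter encoding (values 2 and 3 mean taken), and the default counter value of 1 are assumptions made for illustration; table update is omitted because the abstract does not describe it.

```python
from typing import Dict, Tuple


class TwoTablePredictor:
    """Toy predictor with one history table per pipeline stage."""

    def __init__(self) -> None:
        self.btac: Dict[int, int] = {}      # branch address -> target address
        self.bht_fast: Dict[int, int] = {}  # counters read alongside the BTAC
        self.bht_slow: Dict[int, int] = {}  # counters read at decode

    def predict(self, addr: int, decoded_as_branch: bool) -> Tuple[str, bool]:
        """Return (stage that produced the prediction, predicted taken)."""
        if addr in self.btac:
            # BTAC hit: the fast table answers early, at fetch.
            return "fetch", self.bht_fast.get(addr, 1) >= 2
        if decoded_as_branch:
            # BTAC miss recognized as a branch later, at decode.
            return "decode", self.bht_slow.get(addr, 1) >= 2
        return "none", False
```

    Only the first table sits on the early fetch path, which is why the abstract lets the second table be slower.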
