Pre-decode error handling via branch correction
    1.
    Granted invention patent
    Legal status: in force

    Publication number: US07415638B2

    Publication date: 2008-08-19

    Application number: US10995858

    Filing date: 2004-11-22

    IPC classification: G06F11/00 G06F9/30

    Abstract: In a pipelined processor where instructions are pre-decoded prior to being stored in a cache, an incorrectly pre-decoded instruction is detected during execution in the pipeline. The corresponding instruction is invalidated in the cache, and the instruction is forced to evaluate as a branch instruction. In particular, the branch instruction is evaluated as “mispredicted not taken” with a branch target address of the incorrectly pre-decoded instruction's address. This, with the invalidated cache line, causes the incorrectly pre-decoded instruction to be re-fetched from memory with a precise address. The re-fetched instruction is then correctly pre-decoded, written to the cache, and executed.
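
The recovery sequence in this abstract can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the patented implementation: the `MEMORY` image, the `Core` class, the `bad_predecode_at` fault injector, and the two instruction kinds are all hypothetical, and ordinary branches simply fall through so that only the correction path changes the program counter.

```python
from dataclasses import dataclass

# Hypothetical program image: address -> opcode (4-byte instructions).
MEMORY = {0x100: "add", 0x104: "branch", 0x108: "sub"}


def true_kind(opcode):
    """Full decode, as the execute stage would see the instruction."""
    return "branch" if opcode == "branch" else "alu"


@dataclass
class CacheEntry:
    opcode: str
    predecode: str  # pre-decoded kind stored alongside the instruction


class Core:
    def __init__(self, bad_predecode_at=None):
        self.icache = {}
        self.bad_predecode_at = bad_predecode_at  # inject one pre-decode fault
        self.log = []

    def fetch(self, pc):
        if pc not in self.icache:  # miss: fetch from memory and pre-decode
            opcode = MEMORY[pc]
            kind = true_kind(opcode)
            if pc == self.bad_predecode_at:
                kind = "alu" if kind == "branch" else "branch"
                self.bad_predecode_at = None  # the fault occurs only once
            self.icache[pc] = CacheEntry(opcode, kind)
            self.log.append(f"fill {pc:#x}")
        return self.icache[pc]

    def run(self, pc, end):
        executed = []
        while pc < end:
            entry = self.fetch(pc)
            next_pc = pc + 4  # default: sequential (not-taken) path
            if entry.predecode != true_kind(entry.opcode):
                # Pre-decode error detected in the pipeline: invalidate the
                # cached copy and treat the instruction as a branch that was
                # "mispredicted not taken", targeting its own address.
                del self.icache[pc]
                self.log.append(f"correct {pc:#x}")
                next_pc = pc
            else:
                executed.append(entry.opcode)
            pc = next_pc
        return executed
```

Running `Core(bad_predecode_at=0x104).run(0x100, 0x10C)` executes all three instructions in order; the log shows the line at `0x104` being filled, corrected, and filled a second time with the right pre-decode information.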


    Method and apparatus for efficiently accessing first and second branch history tables to predict branch instructions
    2.
    Granted invention patent
    Legal status: in force

    Publication number: US07278012B2

    Publication date: 2007-10-02

    Application number: US11144206

    Filing date: 2005-06-02

    IPC classification: G06F9/42

    CPC classification: G06F9/3844

    Abstract: A microprocessor includes two branch history tables, and is configured to use a first one of the branch history tables for predicting branch instructions that are hits in a branch target cache, and to use a second one of the branch history tables for predicting branch instructions that are misses in the branch target cache. As such, the first branch history table is configured to have an access speed matched to that of the branch target cache, so that its prediction information is timely available relative to branch target cache hit detection, which may happen early in the microprocessor's instruction pipeline. The second branch history table thus need only be as fast as is required for providing timely prediction information in association with recognizing branch target cache misses as branch instructions, such as at the instruction decode stage(s) of the instruction pipeline.
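
The split between the two tables can be sketched as follows. This is an illustrative model only: the 2-bit saturating counters, the table sizes, the update policy (each branch trains the table that predicted it), and the rule that a taken branch is allocated into the branch target cache are assumptions not stated in the abstract, and the names `BranchHistoryTable` and `Predictor` are hypothetical.

```python
class BranchHistoryTable:
    def __init__(self, size):
        self.size = size
        self.counters = [1] * size  # 2-bit counters, start weakly not taken

    def _index(self, pc):
        return (pc >> 2) % self.size

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(3, c + 1) if taken else max(0, c - 1)


class Predictor:
    def __init__(self):
        self.btac = {}                            # branch pc -> target address
        self.fast_bht = BranchHistoryTable(16)    # small, read at fetch time
        self.slow_bht = BranchHistoryTable(256)   # larger, read at decode time

    def predict(self, pc):
        """Return (pipeline stage that supplies the prediction, taken?)."""
        if pc in self.btac:
            return ("fetch", self.fast_bht.predict(pc))
        return ("decode", self.slow_bht.predict(pc))

    def resolve(self, pc, taken, target):
        # Assumption: train whichever table produced the prediction.
        if pc in self.btac:
            self.fast_bht.update(pc, taken)
        else:
            self.slow_bht.update(pc, taken)
        if taken:
            self.btac[pc] = target  # assumption: allocate on taken branches
```

A branch that misses in the target cache is predicted at decode from the second table; once it has been taken and allocated, later predictions come from the first table at fetch.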


    Method and apparatus for managing cache partitioning using a dynamic boundary
    3.
    Granted invention patent
    Legal status: in force

    Publication number: US07650466B2

    Publication date: 2010-01-19

    Application number: US11233575

    Filing date: 2005-09-21

    IPC classification: G06F12/00

    CPC classification: G06F12/126

    Abstract: A method of managing cache partitions provides a first pointer for higher priority writes and a second pointer for lower priority writes, and uses the first pointer to delimit the lower priority writes. For example, locked writes have greater priority than unlocked writes, and a first pointer may be used for locked writes, and a second pointer may be used for unlocked writes. The first pointer is advanced responsive to making locked writes, and its advancement thus defines a locked region and an unlocked region. The second pointer is advanced responsive to making unlocked writes. The second pointer also is advanced (or retreated) as needed to prevent it from pointing to locations already traversed by the first pointer. Thus, the first pointer delimits the unlocked region and allows the locked region to grow at the expense of the unlocked region.
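
A minimal sketch of the two-pointer scheme, assuming a single cache set whose ways are numbered from 0, a locked region that grows upward from way 0, and round-robin replacement in the unlocked region. The class name `PartitionedSet` and the wrap-around policy are hypothetical choices made for illustration.

```python
class PartitionedSet:
    def __init__(self, ways):
        self.ways = [None] * ways
        self.locked_ptr = 0    # next way for a locked write; ways below are locked
        self.unlocked_ptr = 0  # next way for an unlocked write

    def write(self, value, locked=False):
        n = len(self.ways)
        if self.locked_ptr >= n:
            raise RuntimeError("every way is locked")
        if locked:
            way = self.locked_ptr
            self.locked_ptr += 1
            # Keep the unlocked pointer out of the region the locked
            # pointer has already traversed.
            self.unlocked_ptr = max(self.unlocked_ptr, self.locked_ptr)
        else:
            way = self.unlocked_ptr
            self.unlocked_ptr += 1
            if self.unlocked_ptr >= n:
                self.unlocked_ptr = self.locked_ptr  # wrap to the boundary
        self.ways[way] = value
        return way
```

Each locked write moves the boundary up by one way, so the locked region grows at the expense of the unlocked region, and unlocked writes only ever cycle through the ways at or above the boundary.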


    Power saving methods and apparatus to selectively enable cache bits based on known processor state
    4.
    Granted invention patent
    Legal status: in force

    Publication number: US07421568B2

    Publication date: 2008-09-02

    Application number: US11073284

    Filing date: 2005-03-04

    IPC classification: G06F9/30

    Abstract: A processor capable of fetching and executing variable length instructions is described having instructions of at least two lengths. The processor operates in multiple modes. One of the modes restricts instructions that can be fetched and executed to the longer length instructions. An instruction cache is used for storing variable length instructions and their associated predecode bit fields in an instruction cache line and storing the instruction address and processor operating mode state information at the time of the fetch in a tag line. The processor operating mode state information indicates the program specified mode of operation of the processor. The processor fetches instructions from the instruction cache for execution. As a result of an instruction fetch operation, the instruction cache may selectively enable the writing of predecode bit fields in the instruction cache and may selectively enable the reading of predecode bit fields stored in the instruction cache based on the processor state at the time of the fetch.
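
The selective enabling described here can be modeled with counters that stand in for the array accesses being saved. This is a sketch under assumptions: the 2-byte and 4-byte lengths, the one-bit-per-instruction `predecode` field, and the `ICache` class are hypothetical, and the mode is folded into the tag as a simple boolean.

```python
LONG = 4   # bytes; hypothetical instruction lengths
SHORT = 2


def predecode(lengths):
    """Hypothetical pre-decode field: one bit per instruction, 1 = long."""
    return [1 if n == LONG else 0 for n in lengths]


class ICache:
    def __init__(self):
        self.lines = {}            # (address, long_only_mode) -> line contents
        self.predecode_writes = 0  # counts writes to the predecode array
        self.predecode_reads = 0   # counts reads of the predecode array

    def fill(self, address, lengths, long_only_mode):
        if long_only_mode:
            bits = None  # every instruction is long: skip writing the field
        else:
            bits = predecode(lengths)
            self.predecode_writes += 1
        self.lines[(address, long_only_mode)] = (list(lengths), bits)

    def fetch(self, address, long_only_mode):
        line = self.lines.get((address, long_only_mode))
        if line is None:
            return None  # miss: the tag compares both address and mode
        lengths, bits = line
        if long_only_mode:
            # The field is implied by the mode, so the array is not read.
            return lengths, [1] * len(lengths)
        self.predecode_reads += 1
        return lengths, bits
```

In the long-only mode neither counter increases, which is the power saving the abstract refers to; in the mixed-length mode the field is written on fill and read on every hit.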


    Caching memory attribute indicators with cached memory data field
    5.
    Granted invention patent
    Legal status: in force

    Publication number: US07805588B2

    Publication date: 2010-09-28

    Application number: US11254873

    Filing date: 2005-10-20

    IPC classification: G06F12/00

    Abstract: A processing system may include a memory configured to store data in a plurality of pages, a TLB, and a memory cache including a plurality of cache lines. Each page in the memory may include a plurality of lines of memory. The memory cache may permit, when a virtual address is presented to the cache, a matching cache line to be identified from the plurality of cache lines, the matching cache line having a matching address that matches the virtual address. The memory cache may be configured to permit one or more page attributes of a page located at the matching address to be retrieved from the memory cache and not from the TLB, by further storing in each one of the cache lines a page attribute of the line of data stored in the cache line.
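
As a rough illustration, a virtually addressed cache can copy the page attribute out of the TLB when a line is filled and return it on later hits. The page and line sizes, the `TLB` and `Cache` classes, and the string attribute are assumptions made for this sketch.

```python
PAGE_SIZE = 4096  # assumed page size in bytes
LINE_SIZE = 64    # assumed cache line size in bytes


class TLB:
    def __init__(self, entries):
        # virtual page number -> (physical page number, page attributes)
        self.entries = entries
        self.lookups = 0

    def translate(self, vaddr):
        self.lookups += 1
        ppn, attrs = self.entries[vaddr // PAGE_SIZE]
        return ppn * PAGE_SIZE + vaddr % PAGE_SIZE, attrs


class Cache:
    def __init__(self, tlb, memory):
        self.tlb = tlb
        self.memory = memory  # physical line address -> data
        self.lines = {}       # virtual line address -> (data, page attributes)

    def read(self, vaddr):
        line_addr = vaddr - vaddr % LINE_SIZE
        if line_addr not in self.lines:
            # Miss: consult the TLB once and keep the attributes in the line.
            paddr, attrs = self.tlb.translate(line_addr)
            self.lines[line_addr] = (self.memory[paddr], attrs)
        # Hit: data and attributes both come from the cache line.
        return self.lines[line_addr]
```

After the first access to a line, `tlb.lookups` stops increasing for further reads of that line, because the attributes are served from the cache rather than the TLB.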


    Translation lookaside buffer (TLB) suppression for intra-page program counter relative or absolute address branch instructions
    6.
    Granted invention patent
    Legal status: in force

    Publication number: US07406613B2

    Publication date: 2008-07-29

    Application number: US11003772

    Filing date: 2004-12-02

    IPC classification: G06F1/26

    Abstract: In a pipelined processor, a pre-decoder in advance of an instruction cache calculates the branch target address (BTA) of PC-relative and absolute address branch instructions. The pre-decoder compares the BTA with the branch instruction address (BIA) to determine whether the target and instruction are in the same memory page. A branch target same page (BTSP) bit indicating this is written to the cache and associated with the instruction. When the branch is executed and evaluated as taken, a TLB access to check permission attributes for the BTA is suppressed if the BTA is in the same page as the BIA, as indicated by the BTSP bit. This reduces power consumption as the TLB access is suppressed and the BTA/BIA comparison is only performed once, when the branch instruction is first fetched. Additionally, the pre-decoder removes the BTA/BIA comparison from the BTA generation and selection critical path.
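
The BTSP mechanism reduces to a page-number comparison done once at pre-decode time. The sketch below assumes 4 KiB pages, 4-byte instructions, and PC-relative branches only; `predecode_branch` and `Pipeline` are hypothetical names, and the counter stands in for the TLB permission check.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def predecode_branch(bia, offset):
    """Compute the BTA once, ahead of the I-cache, and derive the BTSP bit."""
    bta = bia + offset
    btsp = (bta >> PAGE_SHIFT) == (bia >> PAGE_SHIFT)
    return {"bia": bia, "bta": bta, "btsp": btsp}


class Pipeline:
    def __init__(self):
        self.tlb_accesses = 0

    def execute_branch(self, cached, taken):
        if not taken:
            return cached["bia"] + 4  # fall through to the next instruction
        if not cached["btsp"]:
            self.tlb_accesses += 1    # target is in another page: check it
        return cached["bta"]
```

A taken branch from `0x1F00` to `0x1F80` stays within one page and performs no TLB access, while a taken branch from `0x1F00` to `0x2100` crosses a page boundary and performs one.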


    Sliding-window, block-based branch target address cache
    7.
    Granted invention patent
    Legal status: in force

    Publication number: US07827392B2

    Publication date: 2010-11-02

    Application number: US11422186

    Filing date: 2006-06-05

    IPC classification: G06F9/00

    Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction having been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.
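
One way to picture the sliding window is a table keyed by the start address of each fetch group, with a single BTA per entry. This sketch assumes 4-byte instructions and fetch groups of four instructions; the `SlidingWindowBTAC` class and its method names are hypothetical.

```python
INSTR_BYTES = 4   # assumed instruction size
BLOCK_INSTRS = 4  # assumed fetch-group size


class SlidingWindowBTAC:
    def __init__(self):
        # fetch-group start address -> (index of taken branch, BTA)
        self.entries = {}

    def record_taken(self, fetch_addr, branch_addr, target):
        index = (branch_addr - fetch_addr) // INSTR_BYTES
        if not 0 <= index < BLOCK_INSTRS:
            raise ValueError("branch is outside the fetch group")
        self.entries[fetch_addr] = (index, target)  # one BTA per entry

    def lookup(self, fetch_addr):
        hit = self.entries.get(fetch_addr)
        if hit is None:
            return None
        index, target = hit
        return fetch_addr + index * INSTR_BYTES, target
```

Two taken branches at `0x104` and `0x10C` both lie in the group fetched at `0x100`, yet each can have its own entry: the first under tag `0x100`, the second under tag `0x108` when execution enters the block at that address. Each entry still holds only one BTA.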
