Method and apparatus for managing instruction flushing in a microprocessor's instruction pipeline
    31.
    Granted patent (Active)

    Publication number: US07949861B2

    Publication date: 2011-05-24

    Application number: US11149773

    Filing date: 2005-06-10

    Abstract: In one or more embodiments, a processor includes one or more circuits to flush instructions from an instruction pipeline on a selective basis responsive to detecting a branch misprediction, such that those instructions marked as being dependent on the branch instruction associated with the branch misprediction are flushed. Thus, the one or more circuits may be configured to mark instructions fetched into the processor's instruction pipeline(s) to indicate their branch prediction dependencies, directly or indirectly detect incorrect branch predictions, and directly or indirectly flush instructions in the instruction pipeline(s) that are marked as being dependent on an incorrect branch prediction.
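
    A minimal C++ sketch of the idea described in the abstract (the tag field and function names below are illustrative assumptions, not the patented circuitry): fetched instructions carry a tag identifying the predicted branch they depend on, and only instructions carrying the mispredicted branch's tag are flushed.

```cpp
// Hedged sketch of selective, dependency-based flushing; a software model,
// not the patent's hardware.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>

struct Instr {
    uint64_t pc;
    int branch_tag;   // tag of the predicted branch this instruction depends on; -1 means none
};

// Flush only the instructions marked as dependent on the mispredicted branch.
void selective_flush(std::vector<Instr>& pipeline, int mispredicted_tag) {
    pipeline.erase(std::remove_if(pipeline.begin(), pipeline.end(),
                                  [&](const Instr& i) { return i.branch_tag == mispredicted_tag; }),
                   pipeline.end());
}

int main() {
    std::vector<Instr> pipeline = {{0x100, -1}, {0x104, 7}, {0x108, 7}, {0x10c, -1}};
    selective_flush(pipeline, 7);   // the branch tagged 7 was mispredicted
    for (const auto& i : pipeline) std::cout << std::hex << i.pc << '\n';   // prints 100 and 10c
}
```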

Segmented pipeline flushing for mispredicted branches
    32.
    Granted patent (Active)

    Publication number: US07624254B2

    Publication date: 2009-11-24

    Application number: US11626443

    Filing date: 2007-01-24

    IPC classification: G06F9/38

    Abstract: A processor pipeline is segmented into an upper portion, prior to instructions going out of program order, and one or more lower portions beyond the upper portion. The upper pipeline is flushed upon detecting that a branch instruction was mispredicted, minimizing the delay in fetching instructions from the correct branch target address. The lower pipelines may continue execution until the mispredicted branch instruction confirms, at which time all uncommitted instructions are flushed from the lower pipelines. Existing exception pipeline flushing mechanisms may be utilized, by adding a mispredicted branch identifier, reducing the complexity and hardware cost of flushing the lower pipelines.
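
    A hedged C++ sketch of the segmented-flush behavior (the structure and function names are assumptions for illustration): the upper, in-order portion is cleared as soon as the misprediction is detected, while the lower portions are cleared of uncommitted instructions only once the branch confirms, reusing what would be the exception-flush path in hardware.

```cpp
// Illustrative model only; real hardware would refetch from the correct
// target address while the lower pipelines keep executing.
#include <deque>
#include <iostream>
#include <string>

struct SegmentedPipeline {
    std::deque<std::string> upper;   // still in program order, before dispatch
    std::deque<std::string> lower;   // possibly out of order, not yet committed
};

void on_mispredict_detected(SegmentedPipeline& p) {
    p.upper.clear();                 // flush early so fetch can restart at the correct target
}

void on_mispredict_confirmed(SegmentedPipeline& p) {
    p.lower.clear();                 // flush uncommitted instructions via the exception-style path
}

int main() {
    SegmentedPipeline p{{"i5", "i6"}, {"i1", "br(mispredicted)", "i3", "i4"}};
    on_mispredict_detected(p);       // upper flushed immediately
    on_mispredict_confirmed(p);      // branch confirmed mispredicted: lower flushed too
    std::cout << p.upper.size() << ' ' << p.lower.size() << '\n';   // prints "0 0"
}
```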

Dynamic cache coherency snooper presence with variable snoop latency
    35.
    Granted patent (Active)

    Publication number: US06985972B2

    Publication date: 2006-01-10

    Application number: US10264163

    Filing date: 2002-10-03

    IPC classification: G06F13/28 G06F12/00

    Abstract: A data processing system with a snooper that is capable of dynamically enabling and disabling its snooping capabilities (i.e., snoop detect and response). The snooper is connected to a bus controller via a plurality of interconnects, including a snooperPresent signal, a snoop response signal and a snoop detect signal. When the snooperPresent signal is asserted, subsequent snoop requests are sent to the snooper, and the snooper is polled for a snoop response. Each snooper is capable of responding at different times (i.e., each snooper operates with different snoop latencies). The bus controller individually tracks the snoop response received from each snooper with the snooperPresent signal enabled. Whenever the snooper wishes to deactivate its snooping capabilities/operations, the snooper de-asserts the snooperPresent signal. The bus controller recognizes this as an indication that the snooper is unavailable. Thus, when the bus controller broadcasts subsequent snoop requests, the bus controller does not send the snoop request to the snooper.
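
    A small C++ sketch of the presence-gated snooping described above (class and field names are assumptions; the real design is a bus protocol, not software): the bus controller forwards a snoop request only to snoopers whose snooperPresent signal is asserted, and it tracks responses per snooper because each may answer with a different latency.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

struct Snooper {
    bool present = true;   // models the snooperPresent signal
    int  latency = 1;      // cycles before the snoop response would arrive in hardware
    bool snoop(uint64_t addr) { (void)addr; return false; }   // false = no hit in this sketch
};

struct BusController {
    std::vector<Snooper*> snoopers;

    void broadcast_snoop(uint64_t addr) {
        for (Snooper* s : snoopers) {
            if (!s->present) continue;   // de-asserted snooperPresent: do not send the request
            // In hardware the response would arrive s->latency cycles later and be
            // tracked per snooper; the sketch simply collects it immediately.
            bool hit = s->snoop(addr);
            std::cout << "response after " << s->latency << " cycles, hit=" << hit << '\n';
        }
    }
};

int main() {
    Snooper a{true, 2}, b{false, 5};   // b has disabled its snooping capability
    BusController bus{{&a, &b}};
    bus.broadcast_snoop(0x80);         // only a is polled for a response
}
```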

Auto-Ordering of Strongly Ordered, Device, and Exclusive Transactions Across Multiple Memory Regions
    36.
    Patent application (Active)

    Publication number: US20130151799A1

    Publication date: 2013-06-13

    Application number: US13315370

    Filing date: 2011-12-09

    IPC classification: G06F12/00

    CPC classification: G06F13/1621

    Abstract: Efficient techniques are described for controlling ordered accesses in a weakly ordered storage system. A stream of memory requests is split into two or more streams of memory requests and a memory access counter is incremented for each memory request. A memory request requiring ordered memory accesses is identified in one of the two or more streams of memory requests. The memory request requiring ordered memory accesses is stalled upon determining a previous memory request from a different stream of memory requests is pending. The memory access counter is decremented for each memory request guaranteed to complete. A count value in the memory access counter that is different from an initialized state of the memory access counter indicates there are pending memory requests. The memory request requiring ordered memory accesses is processed upon determining there are no further pending memory requests.
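
    A minimal C++ sketch of the counter mechanism (names are illustrative assumptions): every issued memory request increments the counter, every request guaranteed to complete decrements it, and a strongly-ordered request is held back until the counter returns to its initialized value, meaning no request from another stream is still pending.

```cpp
#include <iostream>

// Models one memory access counter shared by two or more request streams.
struct OrderingCounter {
    int pending = 0;                                 // initialized state: nothing outstanding

    void on_issue()    { ++pending; }                // a memory request was sent out
    void on_complete() { --pending; }                // a request is now guaranteed to complete

    bool ordered_request_may_proceed() const { return pending == 0; }
};

int main() {
    OrderingCounter c;
    c.on_issue();                                    // request from another stream still pending
    std::cout << c.ordered_request_may_proceed() << '\n';   // 0: the ordered request stalls
    c.on_complete();                                 // that request is guaranteed to complete
    std::cout << c.ordered_request_may_proceed() << '\n';   // 1: the ordered request proceeds
}
```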

Apparatus and Methods to Reduce Castouts in a Multi-Level Cache Hierarchy
    37.
    Patent application (Active)

    Publication number: US20120059995A1

    Publication date: 2012-03-08

    Application number: US13292651

    Filing date: 2011-11-09

    IPC classification: G06F12/08

    Abstract: Techniques and methods are used to reduce allocations to a higher level cache of cache lines displaced from a lower level cache. The allocations of the displaced cache lines are prevented for displaced cache lines that are determined to be redundant in the next level cache, whereby castouts are reduced. To such ends, a line is selected to be displaced in a lower level cache. Information associated with the selected line is identified which indicates that the selected line is present in a higher level cache. An allocation of the selected line in the higher level cache is prevented based on the identified information. Preventing an allocation of the selected line saves power that would be associated with the allocation.
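
    A short C++ sketch of the castout filter (the per-line hint and function names are assumptions for illustration; the dirty-line check is an added assumption, on the reasoning that a modified line is no longer redundant): when a victim line is selected in the lower-level cache, a recorded indication of whether it is already present in the next level decides whether a castout allocation is performed at all.

```cpp
#include <cstdint>
#include <iostream>

// Victim line displaced from the lower-level cache, carrying the identified
// information about its presence in the next (higher) cache level.
struct VictimLine {
    uint64_t tag;
    bool dirty;
    bool present_in_next_level;
};

// Skip the allocation when the copy in the next level would be redundant,
// reducing castout traffic and the power the allocation would consume.
bool should_castout(const VictimLine& v) {
    return v.dirty || !v.present_in_next_level;
}

int main() {
    VictimLine clean_duplicate{0x1000, false, true};   // clean and already in the next level
    VictimLine unique_line{0x2000, false, false};      // not present in the next level
    std::cout << should_castout(clean_duplicate) << ' '
              << should_castout(unique_line) << '\n';  // prints "0 1"
}
```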

Sliding-window, block-based branch target address cache
    38.
    Granted patent (Active)

    Publication number: US07827392B2

    Publication date: 2010-11-02

    Application number: US11422186

    Filing date: 2006-06-05

    IPC classification: G06F9/00

    Abstract: A sliding-window, block-based Branch Target Address Cache (BTAC) comprises a plurality of entries, each entry associated with a block of instructions containing at least one branch instruction that has been evaluated taken, and having a tag associated with the address of the first instruction in the block. The blocks each correspond to a group of instructions fetched from memory, such as an I-cache. Where a branch instruction is included in two or more fetch groups, it is also included in two or more instruction blocks associated with BTAC entries. The sliding-window, block-based BTAC allows for storing the Branch Target Address (BTA) of two or more taken branch instructions that fall in the same instruction block, without providing for multiple BTA storage space in each BTAC entry, by storing BTAC entries associated with different instruction blocks, each containing at least one of the taken branch instructions.
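
    An illustrative C++ sketch of the block-based lookup (the data-structure and field names are assumptions): the BTAC is indexed by the address of the first instruction of a fetch block and each entry holds a single branch target, so a second taken branch in the same instruction block is covered by a second, overlapping block entry rather than by widening the first entry.

```cpp
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct BtacEntry {
    uint64_t branch_pc;   // taken branch inside the block
    uint64_t target;      // its branch target address (BTA)
};

// One entry per block, tagged by the address of the block's first instruction.
using Btac = std::unordered_map<uint64_t, BtacEntry>;

int main() {
    Btac btac;
    // Fetch block starting at 0x100 contains a taken branch at 0x108.
    btac[0x100] = {0x108, 0x400};
    // A second taken branch at 0x10c is held by an overlapping ("sliding-window")
    // block entry starting at 0x10c, instead of adding BTA storage to the 0x100 entry.
    btac[0x10c] = {0x10c, 0x800};

    auto it = btac.find(0x100);   // lookup with the fetch block's start address
    if (it != btac.end())
        std::cout << std::hex << "predicted target 0x" << it->second.target << '\n';
}
```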

Instruction cache having fixed number of variable length instructions
    40.
    Granted patent (Active)

    Publication number: US07568070B2

    Publication date: 2009-07-28

    Application number: US11193547

    Filing date: 2005-07-29

    IPC classification: G06F9/34

    Abstract: A fixed number of variable-length instructions are stored in each line of an instruction cache. The variable-length instructions are aligned along predetermined boundaries. Since the length of each instruction in the line, and hence the span of memory the instructions occupy, is not known, the address of the next following instruction is calculated and stored with the cache line. Ascertaining the instruction boundaries, aligning the instructions, and calculating the next fetch address are performed in a predecoder prior to placing the instructions in the cache.
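
    A compact C++ sketch of the predecode step (the names and the fixed count of four are assumptions): the predecoder finds the boundaries of a fixed number of variable-length instructions, places their start addresses in aligned slots of the cache line, and stores the address of the next instruction after the line, since the span of memory the line covers is not otherwise known.

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

constexpr int kInstrsPerLine = 4;          // fixed number of instructions per cache line

struct ICacheLine {
    uint64_t slot[kInstrsPerLine];         // instruction start addresses on fixed slot boundaries
    uint64_t next_fetch_address;           // computed by the predecoder, stored with the line
};

// lengths[i] is the decoded byte length of instruction i (e.g., 2 or 4 bytes).
ICacheLine predecode(uint64_t line_start, const std::vector<int>& lengths) {
    ICacheLine line{};
    uint64_t pc = line_start;
    for (int i = 0; i < kInstrsPerLine; ++i) {
        line.slot[i] = pc;                 // boundary of the i-th variable-length instruction
        pc += lengths[i];
    }
    line.next_fetch_address = pc;          // first instruction after the ones held in this line
    return line;
}

int main() {
    ICacheLine line = predecode(0x1000, {2, 4, 2, 4});
    std::cout << std::hex << "next fetch address 0x" << line.next_fetch_address << '\n';   // 0x100c
}
```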
