PROCESSOR WITH SECOND JUMP EXECUTION UNIT FOR BRANCH MISPREDICTION
    4.
    发明申请
    PROCESSOR WITH SECOND JUMP EXECUTION UNIT FOR BRANCH MISPREDICTION 审中-公开
    具有分支机构错误预测的第二个执行单元的处理程序

    公开(公告)号:US20140195790A1

    公开(公告)日:2014-07-10

    申请号:US13994676

    申请日:2011-12-28

    IPC分类号: G06F9/38

    摘要: A secondary jump execution unit (JEU) is incorporated in a micro-processor to operate concurrently with a primary JEU, enabling the execution of simultaneous branch operations with possible detection of multiple branch mispredicts. When branch operations are executed on both JEUs in a same instruction cycle, mispredict processing for the secondary JEU is skidded into the primary JEU's dispatch pipeline such that the branch processing for the secondary JEU occurs after processing of the branch for the primary JEU and while the primary JEU is not processing a branch. Moreover, in cases when a nuke command is also received from a reorder buffer of the processor, the branch processing for the secondary JEU is further delayed to accommodate processing of the nuke on the primary JEU. Further embodiments support the promotion of the secondary JEU to have access to the mispredict mechanisms of the primary JEU in certain circumstances.

    摘要翻译: 次级跳转执行单元(JEU)并入微处理器以与主JEU同时操作,使得能够执行同时分支操作,并可能检测到多个分支错误预测。 当在同一个指令周期中对两个JEU执行分支操作时,辅助JEU的错误预测处理被划分到主JEU的调度流水线中,使得辅助JEU的分支处理在主JEU的分支处理之后发生,而 初级JEU不处理分支。 此外,在从处理器的重新排序缓冲器接收到nuke命令的情况下,进一步延迟用于辅助JEU的分支处理,以适应主JEU上的nuke的处理。 进一步的实施方案支持促进联合联合国次级方案在某些情况下获得主要联合执行机构的错误预测机制。

    Enhanced loop streaming detector to drive logic optimization
    6.
    发明授权
    Enhanced loop streaming detector to drive logic optimization 有权
    增强循环流检测器驱动逻辑优化

    公开(公告)号:US09354875B2

    公开(公告)日:2016-05-31

    申请号:US13728273

    申请日:2012-12-27

    IPC分类号: G06F9/30

    摘要: An enhanced loop streaming detection mechanism is provided in a processor to reduce power consumption. The processor includes a decoder to decode instructions in a loop into micro-operations, and a loop streaming detector to detect the presence of the loop in the micro-operations. The processor also includes a loop characteristic tracker unit to identify hardware components downstream from the decoder that are not to be used by the micro-operations in the loop, and to disable the identified hardware components. The processor also includes execution circuitry to execute the micro-operations in the loop with the identified hardware components disabled.

    摘要翻译: 在处理器中提供增强的循环流检测机制以降低功耗。 处理器包括解码器,用于将循环中的指令解码为微操作,以及循环流检测器,用于检测微操作中环路的存在。 处理器还包括循环特性跟踪器单元,用于识别解码器下游的不被循环中的微操作使用的硬件组件,以及禁用所识别的硬件组件。 该处理器还包括执行电路,以在所识别的硬件组件被禁用的情况下执行循环中的微操作。

    Hiding instruction cache miss latency by running tag lookups ahead of the instruction accesses
    7.
    发明授权
    Hiding instruction cache miss latency by running tag lookups ahead of the instruction accesses 有权
    通过在指令访问之前运行标签查找来隐藏指令缓存未命中延迟

    公开(公告)号:US09158696B2

    公开(公告)日:2015-10-13

    申请号:US13992228

    申请日:2011-12-29

    IPC分类号: G06F12/08 G06F9/38

    摘要: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of IC and ITLB misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making ITLB and IC tag lookups run ahead of the IC data accesses. This allows overlapping the ITLB and IC miss stall cycles with older instruction byte reads or older IC misses, resulting in fewer stalls than previous implementations and improved performance

    摘要翻译: 本公开提供了通过将ITLB和IC标签查找与IC数据(指令字节)访问分离并使ITLB和IC标签查找在IC数据之前运行来实现IC和ITLB未命中的早期,预先处理的技术和装置 访问 这允许ITLB和IC错过停顿周期与旧的指令字节读取或较旧的IC错误重叠,导致比以前的实现更少的停顿和改进的性能

    METHOD, APPARATUS, AND SYSTEM FOR TRANSACTIONAL SPECULATION CONTROL INSTRUCTIONS
    8.
    发明申请
    METHOD, APPARATUS, AND SYSTEM FOR TRANSACTIONAL SPECULATION CONTROL INSTRUCTIONS 审中-公开
    方法,装置和系统的交互式分析控制指令

    公开(公告)号:US20150032998A1

    公开(公告)日:2015-01-29

    申请号:US13997243

    申请日:2012-02-02

    IPC分类号: G06F9/30

    摘要: An apparatus and method is described herein for providing speculation control instructions. An xAcquire and xRelease instruction are provided to define a critical section. In one embodiment, the xAcquire instruction includes a lock instruction with an elision prefix and the xRelease instruction includes a lock release instruction with an elision prefix. As a result, a processor is able to elide locks and transactionally execute a critical section defined in software by xAcquire and xRelease. But by adding only prefix hints, legacy processor are able to execute the same code by just ignoring the hints and executing the critical section traditionally with locks to guarantee mutual exclusion. Moreover, xBegin and xEnd are similarly provided for in an Instruction Set Architecture (ISA) to define a transactional code region. In addition, other control speculation instructions, such as xAbort to enable explicit abort of a critical or transactional code section and xTest to test a state of speculative execution is also provided in the ISA.

    摘要翻译: 这里描述了一种用于提供猜测控制指令的装置和方法。 提供xAcquire和xRelease指令来定义关键部分。 在一个实施例中,xAcquire指令包括具有检验前缀的锁定指令,并且xRelease指令包括具有检验前缀的锁定释放指令。 因此,处理器能够通过xAcquire和xRelease来删除锁定和事务性地执行在软件中定义的关键部分。 但是通过仅添加前缀提示,传统处理器能够通过忽略提示并执行传统的锁定关键部分来保证互斥,从而执行相同的代码。 此外,xBegin和xEnd在指令集架构(ISA)中类似地提供以定义事务代码区域。 此外,还在ISA中提供了其他控制推测指令,例如xAbort,以实现关键或事务代码段的显示中止,以及xTest测试推测执行状态。

    HIDING INSTRUCTION CACHE MISS LATENCY BY RUNNING TAG LOOKUPS AHEAD OF THE INSTRUCTION ACCESSES
    9.
    发明申请
    HIDING INSTRUCTION CACHE MISS LATENCY BY RUNNING TAG LOOKUPS AHEAD OF THE INSTRUCTION ACCESSES 有权
    隐藏指令高速缓存通过运行TAG LOOKUPS之前的指令访问失败

    公开(公告)号:US20140229677A1

    公开(公告)日:2014-08-14

    申请号:US13992228

    申请日:2011-12-29

    IPC分类号: G06F12/08

    摘要: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of IC and ITLB misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making ITLB and IC tag lookups run ahead of the IC data accesses. This allows overlapping the ITLB and IC miss stall cycles with older instruction byte reads or older IC misses, resulting in fewer stalls than previous implementations and improved performance

    摘要翻译: 本公开提供了通过将ITLB和IC标签查找与IC数据(指令字节)访问分离并使ITLB和IC标签查找在IC数据之前运行来实现IC和ITLB未命中的早期,预先处理的技术和装置 访问 这允许ITLB和IC错过停顿周期与旧的指令字节读取或较旧的IC错误重叠,导致比以前的实现更少的停顿和改进的性能