Decoupling the number of logical threads from the number of simultaneous physical threads in a processor
    1.
    Granted invention patent
    Decoupling the number of logical threads from the number of simultaneous physical threads in a processor (status: in force)

    Publication No.: US07797683B2

    Publication date: 2010-09-14

    Application No.: US10745527

    Filing date: 2003-12-29

    IPC classification: G06F9/44

    CPC classification: G06F9/485 G06F9/3851

    Abstract: Systems and methods of managing threads provide for supporting a plurality of logical threads with a plurality of simultaneous physical threads in which the number of logical threads may be greater than or less than the number of physical threads. In one approach, each of the plurality of logical threads is maintained in one of a wait state, an active state, a drain state, and a stall state. A state machine and hardware sequencer can be used to transition the logical threads between states based on triggering events and whether or not an interruptible point has been encountered in the logical threads. The logical threads are scheduled on the physical threads to meet, for example, priority, performance or fairness goals. It is also possible to specify the resources that are available to each logical thread in order to meet these and other goals. In one example, a single logical thread can speculatively use more than one physical thread, pending a selection of which physical thread should be committed.

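The four states named in the abstract can be sketched as a small state machine. This is only an illustrative model, not the patented design: the class names, the method names, and the exact transition rules (wait to active when a physical thread is free, active to drain on a triggering event, drain to stall at an interruptible point, stall back to wait once in-flight work has retired) are assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    WAIT = auto()
    ACTIVE = auto()
    DRAIN = auto()
    STALL = auto()


class LogicalThread:
    def __init__(self, name):
        self.name = name
        self.state = State.WAIT


class Sequencer:
    """Toy model mapping many logical threads onto a fixed pool of physical threads."""

    def __init__(self, num_physical):
        self.free_slots = num_physical

    def schedule(self, thread):
        # Assumed rule: WAIT -> ACTIVE only when a physical thread is free.
        if thread.state is State.WAIT and self.free_slots > 0:
            self.free_slots -= 1
            thread.state = State.ACTIVE
            return True
        return False

    def trigger(self, thread):
        # A triggering event asks an active logical thread to give up its slot.
        if thread.state is State.ACTIVE:
            thread.state = State.DRAIN

    def interruptible_point(self, thread):
        # Draining ends when an interruptible point is encountered.
        if thread.state is State.DRAIN:
            thread.state = State.STALL

    def retire_all(self, thread):
        # Assumed rule: once in-flight work retires, the physical thread is released.
        if thread.state is State.STALL:
            thread.state = State.WAIT
            self.free_slots += 1
```

With one physical thread and two logical threads, the second logical thread can only be scheduled after the first has passed through drain and stall and returned to the wait state.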

    Forward-pass dead instruction identification and removal at run-time
    2.
    Granted invention patent
    Forward-pass dead instruction identification and removal at run-time (status: lapsed)

    Publication No.: US08291196B2

    Publication date: 2012-10-16

    Application No.: US11323037

    Filing date: 2005-12-29

    IPC classification: G06F9/00

    CPC classification: G06F9/3832 G06F9/3838

    Abstract: Apparatuses and methods for dead instruction identification are disclosed. In one embodiment, an apparatus includes an instruction buffer and a dead instruction identifier. The instruction buffer is to store an instruction stream having a single entry point and a single exit point. The dead instruction identifier is to identify dead instructions based on a forward pass through the instruction stream.

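One common way to find dead instructions in a single forward pass over a single-entry, single-exit stream is to remember, for each register, the most recent write that has not yet been read. The sketch below assumes that approach; it is not taken from the patent. The `(dest, srcs)` instruction encoding and the function name are hypothetical, values still unread at the exit point are conservatively treated as live, and instructions without a destination register (stores, branches) are never marked dead.

```python
def find_dead_instructions(stream):
    """Return indices of instructions whose result is overwritten before being read.

    stream: list of (dest, srcs) tuples in program order, where dest is a
    register name or None and srcs is a list of register names.
    """
    pending = {}  # register -> index of its latest writer that has not been read
    dead = set()
    for i, (dest, srcs) in enumerate(stream):
        # Reading a register consumes its pending write, so that writer is live.
        for reg in srcs:
            pending.pop(reg, None)
        if dest is not None:
            # Overwriting an unread value means the earlier writer was dead.
            if dest in pending:
                dead.add(pending[dest])
            pending[dest] = i
    # Writers still pending at the exit point are assumed live-out.
    return dead
```

A single pass like this does not catch transitive cases, where an instruction is only read by another instruction that is itself dead.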

    Multilevel scheme for dynamically and statically predicting instruction resource utilization to generate execution cluster partitions
    3.
    Granted invention patent
    Multilevel scheme for dynamically and statically predicting instruction resource utilization to generate execution cluster partitions (status: in force)

    Publication No.: US07562206B2

    Publication date: 2009-07-14

    Application No.: US11323043

    Filing date: 2005-12-30

    IPC classification: G06F9/30

    Abstract: Microarchitecture policies and structures to predict execution clusters and facilitate inter-cluster communication are disclosed. In disclosed embodiments, sequentially ordered instructions are decoded into micro-operations. Execution of one set of micro-operations is predicted to involve execution resources to perform memory access operations and inter-cluster communication, but not to perform branching operations. Execution of a second set of micro-operations is predicted to involve execution resources to perform branching operations but not to perform memory access operations. The micro-operations are partitioned for execution in accordance with these predictions, the first set of micro-operations to a first cluster of execution resources and the second set of micro-operations to a second cluster of execution resources. The first and second sets of micro-operations are executed out of sequential order and are retired to represent their sequential instruction ordering.

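The partition-then-retire flow described in the abstract can be illustrated with a toy model. Everything here is an assumption for illustration: the `MicroOp` type, the `"mem"`/`"branch"`/`"alu"` labels, and in particular the stand-in predictor, which simply looks at the operation kind instead of using the static and dynamic prediction levels the title refers to.

```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    seq: int   # position in sequential program order
    kind: str  # "mem", "branch", or "alu" (hypothetical labels)


def predict_cluster(uop):
    # Stand-in predictor: memory ops go to cluster 0, branches to cluster 1.
    if uop.kind == "mem":
        return 0
    if uop.kind == "branch":
        return 1
    # Arbitrary assumed policy for everything else: alternate by sequence number.
    return uop.seq % 2


def partition(uops):
    """Split micro-ops into two clusters according to the prediction."""
    clusters = {0: [], 1: []}
    for uop in uops:
        clusters[predict_cluster(uop)].append(uop)
    return clusters


def retire_in_order(completed):
    """Micro-ops may complete in any order but retire in sequential order."""
    return sorted(completed, key=lambda u: u.seq)
```

The sort in `retire_in_order` plays the role of a reorder structure: completion order across clusters is unconstrained, while retirement restores the original instruction sequence.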

    PROCESSOR WITH SECOND JUMP EXECUTION UNIT FOR BRANCH MISPREDICTION
    5.
    Invention patent application
    PROCESSOR WITH SECOND JUMP EXECUTION UNIT FOR BRANCH MISPREDICTION (status: under examination, published)

    Publication No.: US20140195790A1

    Publication date: 2014-07-10

    Application No.: US13994676

    Filing date: 2011-12-28

    IPC classification: G06F9/38

    Abstract: A secondary jump execution unit (JEU) is incorporated in a micro-processor to operate concurrently with a primary JEU, enabling the execution of simultaneous branch operations with possible detection of multiple branch mispredicts. When branch operations are executed on both JEUs in a same instruction cycle, mispredict processing for the secondary JEU is skidded into the primary JEU's dispatch pipeline such that the branch processing for the secondary JEU occurs after processing of the branch for the primary JEU and while the primary JEU is not processing a branch. Moreover, in cases when a nuke command is also received from a reorder buffer of the processor, the branch processing for the secondary JEU is further delayed to accommodate processing of the nuke on the primary JEU. Further embodiments support the promotion of the secondary JEU to have access to the mispredict mechanisms of the primary JEU in certain circumstances.

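The ordering constraint in the abstract can be written down as a small arbitration function. This is a deliberately simplified sketch: the function name and event labels are hypothetical, and the relative order of the primary branch and the nuke is an assumption, since the abstract only states that the secondary JEU's mispredict is handled after both.

```python
def primary_pipeline_order(primary_branch, secondary_mispredict, nuke):
    """Return the order in which events are handled on the primary JEU's pipeline."""
    order = []
    if primary_branch:
        order.append("primary_branch")
    if nuke:
        # Assumed position: the abstract does not order the nuke against the primary branch.
        order.append("nuke")
    if secondary_mispredict:
        # The secondary mispredict is skidded behind everything else.
        order.append("secondary_mispredict")
    return order
```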

    Multi-level tracking of in-use state of cache lines
    6.
    Granted invention patent
    Multi-level tracking of in-use state of cache lines (status: in force)

    Publication No.: US09348591B2

    Publication date: 2016-05-24

    Application No.: US13992729

    Filing date: 2011-12-29

    IPC classification: G06F9/30 G06F9/38

    Abstract: This disclosure includes tracking of in-use states of cache lines to improve throughput of pipelines and thus increase performance of processors. Access data for a number of sets of instructions stored in an instruction cache may be tracked using an in-use array in a first array until the data for one or more of those sets reach a threshold condition. A second array may then be used as the in-use array to track the sets of instructions after a micro-operation is inserted into the pipeline. When the micro-operation retires from the pipeline, the first array may be cleared. The process may repeat after the second array reaches the threshold condition. During the tracking, an in-use state for an instruction line may be detected by inspecting a corresponding bit in each of the arrays. Additional arrays may also be used to track the in-use state.

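The two-array scheme can be modelled in a few lines. The sketch below is an illustration under stated assumptions rather than the disclosed hardware: the class and method names are hypothetical, the threshold condition is taken to be a count of set bits, and the marker micro-operation is represented by a single flag.

```python
class InUseTracker:
    """Toy model of two alternating in-use bit arrays for cache lines."""

    def __init__(self, num_lines, threshold):
        self.arrays = [[False] * num_lines, [False] * num_lines]
        self.current = 0               # index of the array currently recording accesses
        self.threshold = threshold     # assumed condition: number of set bits
        self.marker_in_flight = False  # stands in for the inserted micro-operation

    def record_access(self, line):
        self.arrays[self.current][line] = True
        full = sum(self.arrays[self.current]) >= self.threshold
        if full and not self.marker_in_flight:
            # Insert the marker micro-op and start tracking in the other array.
            self.marker_in_flight = True
            self.current ^= 1

    def marker_retired(self):
        # When the marker retires, the previously active array can be cleared.
        if self.marker_in_flight:
            old = self.current ^ 1
            self.arrays[old] = [False] * len(self.arrays[old])
            self.marker_in_flight = False

    def in_use(self, line):
        # A line is in use if its bit is set in either array.
        return self.arrays[0][line] or self.arrays[1][line]
```

Checking both arrays in `in_use` is what keeps lines protected during the window between the switch and the retirement of the marker.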

    Hiding instruction cache miss latency by running tag lookups ahead of the instruction accesses
    7.
    Granted invention patent
    Hiding instruction cache miss latency by running tag lookups ahead of the instruction accesses (status: in force)

    Publication No.: US09158696B2

    Publication date: 2015-10-13

    Application No.: US13992228

    Filing date: 2011-12-29

    IPC classification: G06F12/08 G06F9/38

    Abstract: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of IC and ITLB misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making ITLB and IC tag lookups run ahead of the IC data accesses. This allows overlapping the ITLB and IC miss stall cycles with older instruction byte reads or older IC misses, resulting in fewer stalls than previous implementations and improved performance.

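The benefit of letting tag lookups run ahead can be seen in a rough cycle-count model. The model below is an assumption-laden sketch, not the disclosed design: it assumes one tag lookup per cycle starting at cycle 0, one cycle per instruction-byte read, an unbounded run-ahead distance, and any number of outstanding misses. Both function names are hypothetical.

```python
def coupled_cycles(misses, miss_latency):
    """Tag lookup and data read happen together, so each miss stalls the fetch."""
    t = 0
    for miss in misses:
        if miss:
            t += miss_latency  # stall until the line is filled
        t += 1                 # read the instruction bytes
    return t


def decoupled_cycles(misses, miss_latency):
    """Tag lookups run ahead, so a miss starts filling before the data read needs it."""
    t = 0
    for i, miss in enumerate(misses):
        # Block i is looked up at cycle i; a miss is filled miss_latency cycles later.
        ready = i + miss_latency if miss else i
        start = max(t, ready)  # wait only if the fill has not yet completed
        t = start + 1
    return t
```

For `misses = [True, False, False, True]` and a 10-cycle miss latency, the coupled model takes 24 cycles while the decoupled model takes 14, because the second miss is detected early and overlaps with the stall of the first.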

    HIDING INSTRUCTION CACHE MISS LATENCY BY RUNNING TAG LOOKUPS AHEAD OF THE INSTRUCTION ACCESSES
    8.
    Invention patent application
    HIDING INSTRUCTION CACHE MISS LATENCY BY RUNNING TAG LOOKUPS AHEAD OF THE INSTRUCTION ACCESSES (status: in force)

    Publication No.: US20140229677A1

    Publication date: 2014-08-14

    Application No.: US13992228

    Filing date: 2011-12-29

    IPC classification: G06F12/08

    Abstract: This disclosure provides techniques and apparatuses to enable early, run-ahead handling of IC and ITLB misses by decoupling the ITLB and IC tag lookups from the IC data (instruction bytes) accesses, and making ITLB and IC tag lookups run ahead of the IC data accesses. This allows overlapping the ITLB and IC miss stall cycles with older instruction byte reads or older IC misses, resulting in fewer stalls than previous implementations and improved performance.


    MULTI-LEVEL TRACKING OF IN-USE STATE OF CACHE LINES
    9.
    Invention patent application
    MULTI-LEVEL TRACKING OF IN-USE STATE OF CACHE LINES (status: in force)

    Publication No.: US20130275733A1

    Publication date: 2013-10-17

    Application No.: US13992729

    Filing date: 2011-12-29

    IPC classification: G06F9/30

    Abstract: This disclosure includes tracking of in-use states of cache lines to improve throughput of pipelines and thus increase performance of processors. Access data for a number of sets of instructions stored in an instruction cache may be tracked using an in-use array in a first array until the data for one or more of those sets reach a threshold condition. A second array may then be used as the in-use array to track the sets of instructions after a micro-operation is inserted into the pipeline. When the micro-operation retires from the pipeline, the first array may be cleared. The process may repeat after the second array reaches the threshold condition. During the tracking, an in-use state for an instruction line may be detected by inspecting a corresponding bit in each of the arrays. Additional arrays may also be used to track the in-use state.
