Multithreaded clustered microarchitecture with dynamic back-end assignment

    Publication No.: US08423716B2

    Publication date: 2013-04-16

    Application No.: US13184424

    Filing date: 2011-07-15

    IPC classification: G06F12/00 G06F13/00 G06F3/00

    Abstract: A multithreaded clustered microarchitecture with dynamic back-end assignment is presented. A processing system may include a plurality of instruction caches and front-end units each to process an individual thread from a corresponding one of the instruction caches, a plurality of back-end units, and an interconnect network to couple the front-end and back-end units. A method may include measuring a performance metric of a back-end unit, comparing the measurement to a first value, and reassigning, or not, the back-end unit according to the comparison. Computer systems according to embodiments of the invention may include: a random access memory; a system bus; and a processor having a plurality of instruction caches, a plurality of front-end units each to process an individual thread from a corresponding one of the instruction caches; a plurality of back-end units; and an interconnect network coupled to the plurality of front-end units and the plurality of back-end units.
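
The measure, compare, and reassign step named in the abstract can be sketched as follows. This is a minimal illustration only: the abstract does not say which performance metric is used, what the "first value" is, or how the new owner is chosen, so the `utilization` field, the `threshold` argument, and the `busiest_front_end` argument are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackEnd:
    ident: int          # index of this back-end unit
    owner: int          # index of the front-end (thread) it currently serves
    utilization: float  # hypothetical performance metric in [0, 1]


def reassign_if_underused(backend: BackEnd, threshold: float,
                          busiest_front_end: int) -> bool:
    """Compare the measured metric to a threshold (the "first value").

    Assumed policy: a back-end whose utilization falls below the threshold
    is handed to the front-end that needs it most; otherwise it stays put.
    Returns True when a reassignment happened.
    """
    if backend.utilization < threshold and backend.owner != busiest_front_end:
        backend.owner = busiest_front_end
        return True
    return False
```

Under these assumptions, a back-end at 20% utilization with a 50% threshold moves to the busier front-end, while one at 90% keeps its current owner.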

    32. Multithreaded clustered microarchitecture with dynamic back-end assignment
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US07996617B2

    Publication date: 2011-08-09

    Application No.: US12351780

    Filing date: 2009-01-09

    IPC classification: G06F12/00 G06F13/00 G06F3/00

    Abstract: A multithreaded clustered microarchitecture with dynamic back-end assignment is presented. A processing system may include a plurality of instruction caches and front-end units each to process an individual thread from a corresponding one of the instruction caches, a plurality of back-end units, and an interconnect network to couple the front-end and back-end units. A method may include measuring a performance metric of a back-end unit, comparing the measurement to a first value, and reassigning, or not, the back-end unit according to the comparison. Computer systems according to embodiments of the invention may include: a random access memory; a system bus; and a processor having a plurality of instruction caches, a plurality of front-end units each to process an individual thread from a corresponding one of the instruction caches; a plurality of back-end units; and an interconnect network coupled to the plurality of front-end units and the plurality of back-end units.

    33. Apparatus for an energy efficient clustered micro-architecture
    Type: Granted invention patent
    Legal status: Lapsed

    Publication No.: US07657766B2

    Publication date: 2010-02-02

    Application No.: US11698612

    Filing date: 2007-01-26

    IPC classification: G06F1/32

    CPC classification: G06F9/3885 G06F9/3891

    Abstract: In some embodiments, an apparatus for an energy efficient clustered micro-architecture is disclosed. In one embodiment, the micro-architecture computes an energy-delay² product for each active instruction scheduler and one or more associated function blocks of a current architecture configuration over a predetermined period. Once the energy-delay² product is computed, the computed product is compared against an energy-delay² product calculated for a prior architecture configuration to determine an effectiveness (energy efficiency) of the current architecture configuration. Based on the effectiveness of the current architecture configuration, a number of active instruction schedulers and one or more associated functional blocks within the current architecture configuration is adjusted. In one embodiment, the number of active instruction schedulers and one or more associated functional blocks may be increased or decreased to improve power efficiency of the cluster micro-architecture. Other embodiments are described and claimed.
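
The compare-and-adjust loop in this abstract can be illustrated with a short sketch. The energy-delay-squared product is energy multiplied by the square of the delay. The abstract only says the count of active schedulers is increased or decreased based on the comparison with the prior configuration; the hill-climbing rule below (keep moving in the same direction while the product improves, reverse otherwise) and the `min_active` / `max_active` bounds are assumptions made for illustration.

```python
def energy_delay2(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay-squared product: E * D^2 (lower is better)."""
    return energy_joules * delay_seconds ** 2


def adjust_active_clusters(active: int, prev_ed2p: float, curr_ed2p: float,
                           last_step: int, min_active: int = 1,
                           max_active: int = 4) -> tuple[int, int]:
    """Pick the next number of active schedulers (hypothetical policy).

    If the current configuration has a lower ED^2 product than the prior
    one, the last change helped, so repeat it; otherwise reverse direction.
    Returns (new_active_count, step_taken).
    """
    step = last_step if curr_ed2p < prev_ed2p else -last_step
    new_active = max(min_active, min(max_active, active + step))
    return new_active, step
```

For example, if enabling a third scheduler lowered the product from 100 to 80, the sketch tries a further increase; if the product then rises to 90, it steps back down.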

    34. METHOD AND APPARATUS FOR SELECTION AMONG MULTIPLE EXECUTION THREADS
    Type: Invention patent application
    Legal status: Under examination (published)

    Publication No.: US20080163230A1

    Publication date: 2008-07-03

    Application No.: US11618571

    Filing date: 2006-12-29

    IPC classification: G06F9/50

    CPC classification: G06F9/524

    Abstract: Methods and apparatus for selecting and prioritizing execution threads for consideration of resource allocation include eliminating threads from consideration among all the running execution threads: if they have no available entries in their associated reorder buffers, or if they have exceeded their threshold for entry allocations in the issue window, or if they have exceeded their threshold for register allocations in some register file and if that register file also has an insufficient number of available registers to satisfy the requirements of the other running execution threads. Issue window thresholds may be dynamically computed by dividing the current number of entries by the number of threads under consideration. Register thresholds may also be dynamically computed and associated with a thread and a register file. Execution threads remaining under consideration can be prioritized according to how many combined entries the thread occupies in the resource allocation stage and the issue window.
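
The three elimination rules and the final prioritization can be sketched as below. Several details are assumptions, not statements from the abstract: a single register file is modeled, "current number of entries" is taken to mean the issue-window entry count passed as `iq_entries_total`, and threads occupying fewer combined entries are ranked first (the abstract does not state the direction). All field and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    tid: int
    rob_free: int       # free entries in this thread's reorder buffer
    iq_entries: int     # entries the thread occupies in the issue window
    alloc_entries: int  # entries it occupies in the resource allocation stage
    regs_used: int      # registers it holds in the (single) register file


def select_threads(threads, iq_entries_total, reg_threshold,
                   regs_free, regs_needed_by_others):
    """Filter and rank threads for resource allocation (illustrative)."""
    if not threads:
        return []
    # Dynamic issue-window threshold: entries divided by threads considered.
    iq_threshold = iq_entries_total / len(threads)
    survivors = []
    for t in threads:
        if t.rob_free == 0:                 # rule 1: reorder buffer is full
            continue
        if t.iq_entries > iq_threshold:     # rule 2: over issue-window share
            continue
        if t.regs_used > reg_threshold and regs_free < regs_needed_by_others:
            continue                        # rule 3: over register share AND
                                            # file too full for other threads
        survivors.append(t)
    # Assumed ordering: fewest combined entries first.
    return sorted(survivors, key=lambda t: t.iq_entries + t.alloc_entries)
```

Note that rule 3 only fires when both conditions hold: a thread over its register threshold is still considered as long as the register file has enough free registers for the other threads.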

    35. Apparatus and method for an energy efficient clustered micro-architecture
    Type: Granted invention patent
    Legal status: Lapsed

    Publication No.: US07194643B2

    Publication date: 2007-03-20

    Application No.: US10673955

    Filing date: 2003-09-29

    IPC classification: G06F1/32

    CPC classification: G06F9/3885 G06F9/3891

    Abstract: In some embodiments, a method and apparatus for an energy efficient clustered micro-architecture are disclosed. In one embodiment, the method includes the computation of an energy-delay² product for each active instruction scheduler and one or more associated function blocks of a current architecture configuration over a predetermined period. Once the energy-delay² product is computed, the computed product is compared against an energy-delay² product calculated for a prior architecture configuration to determine an effectiveness of the current architecture configuration. Based on the effectiveness of the current architecture configuration, a number of active instruction schedulers and one or more associated functional blocks within the current architecture configuration is adjusted. In one embodiment, the number of active instruction schedulers and one or more associated functional blocks may be increased or decreased to improve power efficiency of the cluster micro-architecture. Other embodiments are described and claimed.

    36. Multithreaded clustered microarchitecture with dynamic back-end assignment
    Type: Invention patent application
    Legal status: Lapsed

    Publication No.: US20050262270A1

    Publication date: 2005-11-24

    Application No.: US10851246

    Filing date: 2004-05-24

    IPC classification: G06F1/32 G06F3/00 G06F9/38

    Abstract: A multithreaded clustered microarchitecture with dynamic back-end assignment is presented. A processing system may include a plurality of instruction caches and front-end units each to process an individual thread from a corresponding one of the instruction caches, a plurality of back-end units, and an interconnect network to couple the front-end and back-end units. A method may include measuring a performance metric of a back-end unit, comparing the measurement to a first value, and reassigning, or not, the back-end unit according to the comparison. Computer systems according to embodiments of the invention may include: a random access memory; a system bus; and a processor having a plurality of instruction caches, a plurality of front-end units each to process an individual thread from a corresponding one of the instruction caches; a plurality of back-end units; and an interconnect network coupled to the plurality of front-end units and the plurality of back-end units.

    38. Apparatus and method for an energy efficient clustered micro-architecture
    Type: Invention patent application
    Legal status: Lapsed

    Publication No.: US20050071694A1

    Publication date: 2005-03-31

    Application No.: US10673955

    Filing date: 2003-09-29

    IPC classification: G06F1/26 G06F9/38

    CPC classification: G06F9/3885 G06F9/3891

    Abstract: In some embodiments, a method and apparatus for an energy efficient clustered micro-architecture are disclosed. In one embodiment, the method includes the computation of an energy-delay² product for each active instruction scheduler and one or more associated function blocks of a current architecture configuration over a predetermined period. Once the energy-delay² product is computed, the computed product is compared against an energy-delay² product calculated for a prior architecture configuration to determine an effectiveness of the current architecture configuration. Based on the effectiveness of the current architecture configuration, a number of active instruction schedulers and one or more associated functional blocks within the current architecture configuration is adjusted. In one embodiment, the number of active instruction schedulers and one or more associated functional blocks may be increased or decreased to improve power efficiency of the cluster micro-architecture. Other embodiments are described and claimed.

    39. Apparatus for an energy efficient clustered micro-architecture
    Type: Invention patent application
    Legal status: Lapsed

    Publication No.: US20070124616A1

    Publication date: 2007-05-31

    Application No.: US11698612

    Filing date: 2007-01-26

    IPC classification: G06F1/00

    CPC classification: G06F9/3885 G06F9/3891

    Abstract: In some embodiments, an apparatus for an energy efficient clustered micro-architecture is disclosed. In one embodiment, the micro-architecture computes an energy-delay² product for each active instruction scheduler and one or more associated function blocks of a current architecture configuration over a predetermined period. Once the energy-delay² product is computed, the computed product is compared against an energy-delay² product calculated for a prior architecture configuration to determine an effectiveness (energy efficiency) of the current architecture configuration. Based on the effectiveness of the current architecture configuration, a number of active instruction schedulers and one or more associated functional blocks within the current architecture configuration is adjusted. In one embodiment, the number of active instruction schedulers and one or more associated functional blocks may be increased or decreased to improve power efficiency of the cluster micro-architecture. Other embodiments are described and claimed.

    40. Branch pruning in architectures with speculation support
    Type: Granted invention patent
    Legal status: In force

    Publication No.: US08813057B2

    Publication date: 2014-08-19

    Application No.: US11695006

    Filing date: 2007-03-31

    IPC classification: G06F9/45

    CPC classification: G06F8/4441

    Abstract: According to one example embodiment of the inventive subject matter, the method and apparatus described herein are used to generate an optimized speculative version of a static piece of code. The portion of code is optimized in the sense that the number of instructions executed will be smaller. However, since the applied optimization is speculative, the optimized version can be incorrect and some mechanism to recover from that situation is required. Thus, the quality of the produced code will be measured by taking into account both the final length of the code as well as the frequency of misspeculation.
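
The trade-off between shorter code and misspeculation frequency can be made concrete with a small cost model. The abstract does not give a formula, so the one below is purely an assumed illustration: on a misspeculation, the program is taken to pay a fixed `recovery_penalty` and then re-run the original, unpruned code. All names are hypothetical.

```python
def expected_cost(pruned_len: int, original_len: int,
                  misspec_rate: float, recovery_penalty: int) -> float:
    """Expected instructions executed per run of the speculative version.

    Assumed model: the pruned code always runs; with probability
    misspec_rate it turns out to be wrong, costing a recovery penalty
    plus a full run of the original code.
    """
    return pruned_len + misspec_rate * (recovery_penalty + original_len)


def pruning_pays_off(pruned_len: int, original_len: int,
                     misspec_rate: float, recovery_penalty: int) -> bool:
    """True when the speculative version is cheaper on average."""
    cost = expected_cost(pruned_len, original_len,
                         misspec_rate, recovery_penalty)
    return cost < original_len
```

Under this model, pruning a 100-instruction region down to 60 instructions with a 50-instruction recovery penalty still pays off at a 25% misspeculation rate (expected cost 97.5) but not at 50% (expected cost 135).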
