Target-frequency based indirect jump prediction for high-performance processors
    1.
    Invention Grant
    Target-frequency based indirect jump prediction for high-performance processors (Expired)

    Publication Number: US07870371B2

    Publication Date: 2011-01-11

    Application Number: US11957728

    Filing Date: 2007-12-17

    CPC classification number: G06F9/3806 G06F9/322 G06F9/3844

    Abstract: A frequency-based prediction of indirect jumps executing in a computing environment is provided. Illustratively, the computing environment comprises a prediction engine that processes data representative of indirect jumps performed by the computing environment according to a selected frequency-based prediction paradigm. Operatively, the prediction engine can keep track, in a table, of the targets taken for each indirect jump and program context (e.g., branch history and/or path information) of a computing program. Further, the prediction engine can store a frequency counter associated with each target in the table. Illustratively, the frequency counter records the number of times a target was taken in recent past executions of one or more observed indirect jumps. The prediction engine can then supply the target address of an indirect jump based on the values of the frequency counters of the stored target addresses.

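    The frequency-counting mechanism described in the abstract can be sketched as follows. The table organization, the 4-bit counter width, and the lookup key (jump PC plus branch history) are illustrative assumptions, not the patented implementation:

    ```python
    # Sketch of a frequency-based indirect jump predictor: for each
    # (jump PC, branch history) pair, keep the targets taken and a
    # saturating frequency counter per target; predict the target with
    # the highest counter.

    COUNTER_MAX = 15  # assumed 4-bit saturating frequency counters

    class FrequencyBasedPredictor:
        def __init__(self):
            # (jump PC, branch history) -> {target address: frequency counter}
            self.table = {}

        def predict(self, pc, history):
            """Return the stored target with the highest counter, if any."""
            targets = self.table.get((pc, history))
            if not targets:
                return None  # no prediction; fall back to another mechanism
            return max(targets, key=targets.get)

        def update(self, pc, history, actual_target):
            """On resolution, bump the taken target's counter (saturating)."""
            targets = self.table.setdefault((pc, history), {})
            count = targets.get(actual_target, 0)
            targets[actual_target] = min(count + 1, COUNTER_MAX)
    ```

    A real predictor would bound the table and the number of targets per entry and age the counters; the sketch keeps only the core predict/update loop.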

    Feedback mechanism for dynamic predication of indirect jumps
    2.
    Invention Grant
    Feedback mechanism for dynamic predication of indirect jumps (Expired)

    Publication Number: US07818551B2

    Publication Date: 2010-10-19

    Application Number: US11967336

    Filing Date: 2007-12-31

    CPC classification number: G06F9/3844 G06F9/3842

    Abstract: Systems and methods are provided to detect instances where dynamic predication of indirect jumps (DIP) is likely to be ineffective, using data collected on the recent effectiveness of dynamic predication for recently executed indirect jump instructions. Illustratively, a computing environment comprises a DIP monitoring engine cooperating with a DIP monitoring table that aggregates and processes data representative of the effectiveness of DIP on recently executed jump instructions. The DIP monitoring engine collects and processes historical data on DIP instances, where a monitored instance can be categorized according to one or more selected classifications. A comparison can be performed for currently monitored indirect jump instructions, using the collected historical data (and classifications), to determine whether DIP should be invoked by the computing environment or whether other indirect jump prediction paradigms should be invoked instead.

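    The feedback loop described in the abstract can be sketched as a per-jump confidence counter: each monitored DIP instance is classified as effective or not, and DIP is invoked again only while the recorded history looks favorable. The counter width and threshold below are illustrative assumptions:

    ```python
    # Sketch of a DIP monitoring table: one saturating confidence counter
    # per indirect jump PC, incremented when dynamic predication helped and
    # decremented when it did not.

    CONF_MAX = 7        # assumed 3-bit saturating confidence counter
    CONF_THRESHOLD = 4  # assumed threshold for invoking DIP

    class DipMonitor:
        def __init__(self):
            # jump PC -> confidence counter (higher = DIP has been effective)
            self.table = {}

        def should_use_dip(self, pc):
            """Invoke DIP only for jumps whose recent DIP history is good."""
            return self.table.get(pc, CONF_THRESHOLD) >= CONF_THRESHOLD

        def record_outcome(self, pc, dip_was_effective):
            """Classify the monitored DIP instance and update the feedback."""
            conf = self.table.get(pc, CONF_THRESHOLD)
            if dip_was_effective:
                conf = min(conf + 1, CONF_MAX)
            else:
                conf = max(conf - 1, 0)
            self.table[pc] = conf
    ```

    When `should_use_dip` returns false, the processor would fall back to another indirect jump prediction paradigm, as the abstract describes.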

    Reliable communications in on-chip networks
    3.
    Invention Grant
    Reliable communications in on-chip networks (In Force)

    Publication Number: US08473818B2

    Publication Date: 2013-06-25

    Application Number: US12577378

    Filing Date: 2009-10-12

    CPC classification number: G06F13/4022 G06F15/17325 H04L1/18

    Abstract: Techniques for reliable communication in an on-chip network of a multi-core processor are provided. Each packet is tagged with a tag that defines its reliability requirements, and packets are routed in accordance with those requirements. The reliability requirements, and routing based on them, can ensure reliable communication in the on-chip network.

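    The tag-then-route scheme in the abstract can be sketched as a small dispatch on the packet's reliability tag. The tag values and the three delivery strategies below are illustrative assumptions, not the patented scheme:

    ```python
    # Sketch of reliability-tagged routing in an on-chip network: the router
    # inspects each packet's tag and picks a delivery strategy that meets
    # that packet's reliability requirement.

    from enum import Enum

    class Reliability(Enum):
        BEST_EFFORT = 0   # may be dropped under contention
        RETRANSMIT = 1    # resend until acknowledged
        PROTECTED = 2     # route only over error-protected links

    class Packet:
        def __init__(self, payload, tag):
            self.payload = payload
            self.tag = tag  # reliability requirement travels with the packet

    def route(packet):
        """Choose a routing strategy based on the packet's reliability tag."""
        if packet.tag is Reliability.PROTECTED:
            return "ecc-protected path"
        if packet.tag is Reliability.RETRANSMIT:
            return "ack/retransmit path"
        return "fast unprotected path"
    ```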

    RELIABLE COMMUNICATIONS IN ON-CHIP NETWORKS
    5.
    Invention Application
    RELIABLE COMMUNICATIONS IN ON-CHIP NETWORKS (In Force)

    Publication Number: US20110087943A1

    Publication Date: 2011-04-14

    Application Number: US12577378

    Filing Date: 2009-10-12

    CPC classification number: G06F13/4022 G06F15/17325 H04L1/18

    Abstract: Techniques for reliable communication in an on-chip network of a multi-core processor are provided. Each packet is tagged with a tag that defines its reliability requirements, and packets are routed in accordance with those requirements. The reliability requirements, and routing based on them, can ensure reliable communication in the on-chip network.


    FEEDBACK MECHANISM FOR DYNAMIC PREDICATION OF INDIRECT JUMPS
    6.
    Invention Application
    FEEDBACK MECHANISM FOR DYNAMIC PREDICATION OF INDIRECT JUMPS (Expired)

    Publication Number: US20090172371A1

    Publication Date: 2009-07-02

    Application Number: US11967336

    Filing Date: 2007-12-31

    CPC classification number: G06F9/3844 G06F9/3842

    Abstract: Systems and methods are provided to detect instances where dynamic predication of indirect jumps (DIP) is likely to be ineffective, using data collected on the recent effectiveness of dynamic predication for recently executed indirect jump instructions. Illustratively, a computing environment comprises a DIP monitoring engine cooperating with a DIP monitoring table that aggregates and processes data representative of the effectiveness of DIP on recently executed jump instructions. The DIP monitoring engine collects and processes historical data on DIP instances, where a monitored instance can be categorized according to one or more selected classifications. A comparison can be performed for currently monitored indirect jump instructions, using the collected historical data (and classifications), to determine whether DIP should be invoked by the computing environment or whether other indirect jump prediction paradigms should be invoked instead.


    Bufferless routing in on-chip interconnection networks
    7.
    Invention Grant
    Bufferless routing in on-chip interconnection networks (In Force)

    Publication Number: US08509078B2

    Publication Date: 2013-08-13

    Application Number: US12370467

    Filing Date: 2009-02-12

    CPC classification number: H04L45/00 H04L45/40 H04L49/109 H04L49/251

    Abstract: As microprocessors incorporate more and more devices on a single chip, dedicated buses have given way to on-chip interconnection networks (“OCIN”). Routers in a bufferless OCIN as described herein rank and prioritize flits. Flits traverse a productive path towards their destination or undergo temporary deflection to other non-productive paths, without buffering. Eliminating the buffers of on-chip routers reduces power consumption and heat dissipation while freeing up chip surface area for other uses. Furthermore, bufferless design enables purely local flow control of data between devices in the on-chip network, reducing router complexity and enabling reductions in router latency. Router latency reductions are possible in the bufferless on-chip routing by using lookahead links to send data between on-chip routers contemporaneously with flit traversals.

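    The ranking-and-deflection step the abstract describes can be sketched as a single port-allocation pass: flits are ranked (oldest-first here), ranked flits claim their productive output ports first, and the rest are deflected to whatever free ports remain rather than being buffered. The port naming and the ranking rule are illustrative assumptions:

    ```python
    # Sketch of one routing cycle in a bufferless (deflection-routed)
    # on-chip router: every incoming flit is assigned some output port,
    # so no buffering is ever needed.

    def route_flits(flits, productive_port, all_ports):
        """Assign every incoming flit an output port; none are buffered.

        flits: list of (flit_id, age); productive_port: flit_id -> preferred
        port. Assumes len(flits) <= len(all_ports), which holds because a
        router has at least as many output links as input links.
        """
        free = set(all_ports)
        assignment = {}
        # Rank flits oldest-first so older flits keep making forward progress.
        for flit_id, _age in sorted(flits, key=lambda f: -f[1]):
            want = productive_port[flit_id]
            if want in free:
                assignment[flit_id] = want  # productive direction
                free.remove(want)
            else:
                assignment[flit_id] = free.pop()  # deflected, not buffered
        return assignment
    ```

    Deflected flits take a longer route but are never dropped or stored, which is what lets the router omit its buffers entirely.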

    Controlling interference in shared memory systems using parallelism-aware batch scheduling
    8.
    Invention Grant
    Controlling interference in shared memory systems using parallelism-aware batch scheduling (In Force)

    Publication Number: US08180975B2

    Publication Date: 2012-05-15

    Application Number: US12037102

    Filing Date: 2008-02-26

    CPC classification number: G06F9/5016 G06F9/4881 G06F2209/485 G06F2209/5021

    Abstract: A “request scheduler” provides techniques for batching and scheduling buffered thread requests for access to shared memory in a general-purpose computer system. Thread-fairness is provided while preventing short- and long-term thread starvation by using “request batching.” Batching periodically groups outstanding requests from a memory request buffer into larger units termed “batches” that have higher priority than all other buffered requests. Each “batch” may include some maximum number of requests for each bank of the shared memory and for some or all concurrent threads. Further, average thread stall times are reduced by using computed thread rankings in scheduling request servicing from the shared memory. In various embodiments, requests from higher ranked threads are prioritized over requests from lower ranked threads. In various embodiments, a parallelism-aware memory access scheduling policy improves intra-thread bank-level parallelism. Further, rank-based request scheduling may be performed with or without batching.

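    The batch-formation step the abstract describes can be sketched as follows: periodically, up to a per-(thread, bank) cap of the oldest outstanding requests are marked as the current batch, and batched requests are then serviced before all others. The cap value and the oldest-first selection rule are illustrative assumptions:

    ```python
    # Sketch of request batching for a shared-memory request scheduler:
    # capping each thread's contribution per bank bounds how long any
    # thread can be starved, since the whole batch is drained before new
    # requests are prioritized.

    MAX_PER_THREAD_BANK = 2  # assumed per-thread, per-bank batching cap

    def form_batch(requests):
        """Mark up to the cap of oldest requests per (thread, bank) as batched.

        requests: list of dicts with keys 'thread', 'bank', 'age'.
        Returns the subset belonging to the new batch.
        """
        counts = {}
        batch = []
        for req in sorted(requests, key=lambda r: -r["age"]):  # oldest first
            key = (req["thread"], req["bank"])
            if counts.get(key, 0) < MAX_PER_THREAD_BANK:
                counts[key] = counts.get(key, 0) + 1
                batch.append(req)
        return batch
    ```

    Within a batch, the abstract's thread-ranking step would then order service so that higher-ranked threads' requests complete first while preserving each thread's bank-level parallelism.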

    CONTROLLING INTERFERENCE IN SHARED MEMORY SYSTEMS USING PARALLELISM-AWARE BATCH SCHEDULING
    9.
    Invention Application
    CONTROLLING INTERFERENCE IN SHARED MEMORY SYSTEMS USING PARALLELISM-AWARE BATCH SCHEDULING (In Force)

    Publication Number: US20090217273A1

    Publication Date: 2009-08-27

    Application Number: US12037102

    Filing Date: 2008-02-26

    CPC classification number: G06F9/5016 G06F9/4881 G06F2209/485 G06F2209/5021

    Abstract: A “request scheduler” provides techniques for batching and scheduling buffered thread requests for access to shared memory in a general-purpose computer system. Thread-fairness is provided while preventing short- and long-term thread starvation by using “request batching.” Batching periodically groups outstanding requests from a memory request buffer into larger units termed “batches” that have higher priority than all other buffered requests. Each “batch” may include some maximum number of requests for each bank of the shared memory and for some or all concurrent threads. Further, average thread stall times are reduced by using computed thread rankings in scheduling request servicing from the shared memory. In various embodiments, requests from higher ranked threads are prioritized over requests from lower ranked threads. In various embodiments, a parallelism-aware memory access scheduling policy improves intra-thread bank-level parallelism. Further, rank-based request scheduling may be performed with or without batching.


    PRIORITIZATION OF MULTIPLE CONCURRENT THREADS FOR SCHEDULING REQUESTS TO SHARED MEMORY
    10.
    Invention Application
    PRIORITIZATION OF MULTIPLE CONCURRENT THREADS FOR SCHEDULING REQUESTS TO SHARED MEMORY (In Force)

    Publication Number: US20090216962A1

    Publication Date: 2009-08-27

    Application Number: US12265514

    Filing Date: 2008-11-05

    CPC classification number: G06F9/5016 G06F9/4881 G06F2209/485 G06F2209/5021

    Abstract: A “request scheduler” provides techniques for batching and scheduling buffered thread requests for access to shared memory in a general-purpose computer system. Thread-fairness is provided while preventing short- and long-term thread starvation by using “request batching.” Batching periodically groups outstanding requests from a memory request buffer into larger units termed “batches” that have higher priority than all other buffered requests. Each “batch” may include some maximum number of requests for each bank of the shared memory and for some or all concurrent threads. Further, average thread stall times are reduced by using computed thread rankings in scheduling request servicing from the shared memory. In various embodiments, requests from higher ranked threads are prioritized over requests from lower ranked threads. In various embodiments, a parallelism-aware memory access scheduling policy improves intra-thread bank-level parallelism. Further, rank-based request scheduling may be performed with or without batching.

