BRANCH PREDICTION PRELOADING
    1.
    Patent application
    Legal status: In force

    Publication No.: US20130339691A1

    Publication date: 2013-12-19

    Application No.: US13517779

    Filing date: 2012-06-14

    IPC classification: G06F9/38

    Abstract: Embodiments relate to branch prediction preloading. An aspect includes a system for branch prediction preloading. The system includes an instruction cache and branch target buffer (BTB) coupled to a processing circuit, the processing circuit configured to perform a method. The method includes fetching a plurality of instructions in an instruction stream from the instruction cache, and decoding a branch prediction preload instruction in the instruction stream. An address of a predicted branch instruction is determined based on the branch prediction preload instruction. A predicted target address is determined based on the branch prediction preload instruction. A mask field is identified in the branch prediction preload instruction, and a branch instruction length is determined based on the mask field. Based on executing the branch prediction preload instruction, the BTB is preloaded with the address of the predicted branch instruction, the branch instruction length, the branch type, and the predicted target address.
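
    The preload step described in this abstract can be pictured with a small model. This is a minimal sketch under stated assumptions, not the patented design: the names BTBEntry, BranchTargetBuffer, execute_preload and LENGTH_BY_MASK_BITS are hypothetical, the mask encoding (low two bits select the length, remaining bits carry a type code) is invented for illustration, and both addresses are assumed to be offsets from the preload instruction's own address. The abstract only states that the length is derived from the mask field.

```python
from dataclasses import dataclass

# Hypothetical encoding: the low 2 bits of the mask select the branch
# instruction length in bytes; the remaining bits carry a branch type code.
# The abstract does not specify the encoding, so this is only illustrative.
LENGTH_BY_MASK_BITS = {0b01: 2, 0b10: 4, 0b11: 6}


@dataclass(frozen=True)
class BTBEntry:
    branch_address: int
    branch_length: int
    branch_type: int
    target_address: int


class BranchTargetBuffer:
    """Toy BTB keyed by the address of the predicted branch instruction."""

    def __init__(self):
        self._entries = {}

    def preload(self, entry):
        self._entries[entry.branch_address] = entry

    def lookup(self, address):
        return self._entries.get(address)


def execute_preload(btb, preload_address, branch_offset, target_offset, mask):
    """Model executing a branch prediction preload instruction.

    Assumes both addresses are encoded as offsets from the preload
    instruction's address (an assumption, not taken from the abstract).
    """
    length_bits = mask & 0b11
    if length_bits not in LENGTH_BY_MASK_BITS:
        return None  # unknown encodings are treated as a no-op in this sketch
    entry = BTBEntry(
        branch_address=preload_address + branch_offset,
        branch_length=LENGTH_BY_MASK_BITS[length_bits],
        branch_type=mask >> 2,
        target_address=preload_address + target_offset,
    )
    btb.preload(entry)
    return entry
```

    A later fetch at the predicted branch address can then find the entry with btb.lookup(address) before the branch itself has ever executed, which is the point of preloading.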

    Branch prediction preloading
    2.
    Granted patent
    Legal status: In force

    Publication No.: US09146739B2

    Publication date: 2015-09-29

    Application No.: US13517779

    Filing date: 2012-06-14

    IPC classification: G06F9/30 G06F9/38

    Abstract: Embodiments relate to branch prediction preloading. An aspect includes a system for branch prediction preloading. The system includes an instruction cache and branch target buffer (BTB) coupled to a processing circuit, the processing circuit configured to perform a method. The method includes fetching a plurality of instructions in an instruction stream from the instruction cache, and decoding a branch prediction preload instruction in the instruction stream. An address of a predicted branch instruction is determined based on the branch prediction preload instruction. A predicted target address is determined based on the branch prediction preload instruction. A mask field is identified in the branch prediction preload instruction, and a branch instruction length is determined based on the mask field. Based on executing the branch prediction preload instruction, the BTB is preloaded with the address of the predicted branch instruction, the branch instruction length, the branch type, and the predicted target address.

    Post-register allocation profile directed instruction scheduling
    3.
    Patent application
    Legal status: In force

    Publication No.: US20070150880A1

    Publication date: 2007-06-28

    Application No.: US11320220

    Filing date: 2005-12-28

    IPC classification: G06F9/45

    CPC classification: G06F8/441 G06F8/445

    Abstract: A computer implemented method, system, and computer usable program code for selective instruction scheduling. A determination is made whether a region of code exceeds a modification threshold after performing register allocation on the region of code. The region of code is marked as a modified region of code in response to the determination that the region of code exceeds the modification threshold. A determination is made whether the region of code exceeds an execution threshold in response to the determination that the region of code is marked as a modified region of code. Post-register allocation instruction scheduling is performed on the region of code in response to the determination that the region of code is marked as a modified region of code and the determination that the region of code exceeds the execution threshold.
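
    The two-threshold decision in this abstract can be sketched as a short predicate. The sketch assumes that "modification" is measured as a count of instructions changed by register allocation and that "execution" is a profile execution count; the abstract does not say how either is measured. Region and should_reschedule are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    instructions_changed: int  # assumed measure, e.g. spill code added by register allocation
    execution_count: int       # assumed measure, taken from profile data
    modified: bool = False


def should_reschedule(region, modification_threshold, execution_threshold):
    """Decide whether to run post-register-allocation scheduling on a region."""
    # Step 1: mark the region if register allocation changed it enough.
    if region.instructions_changed > modification_threshold:
        region.modified = True
    # Step 2: only regions marked as modified are checked against the
    # execution threshold; unmodified regions are never rescheduled.
    if not region.modified:
        return False
    return region.execution_count > execution_threshold
```

    With this ordering, a heavily executed region that register allocation left untouched is skipped, and so is a heavily modified region that rarely runs, so scheduling effort goes only to regions that are both modified and hot.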

    METHODS, SYSTEMS, AND COMPUTER PRODUCTS FOR EVALUATING ROBUSTNESS OF A LIST SCHEDULING FRAMEWORK
    4.
    Patent application
    Legal status: Lapsed

    Publication No.: US20090064109A1

    Publication date: 2009-03-05

    Application No.: US11845133

    Filing date: 2007-08-27

    IPC classification: G06F9/44

    CPC classification: G06F11/3612

    Abstract: Systems, methods, and computer products for evaluating robustness of a list scheduling framework. Exemplary embodiments include a method for evaluating the robustness of a list scheduling framework, the method including identifying a set of compiler benchmarks known to be sensitive to an instruction scheduler; running the set of benchmarks against a heuristic under test, H, and collecting an execution time Exec(H[G]), where G is a directed acyclic graph; running the set of benchmarks against a plurality of random heuristics Hrand[G]i and collecting a plurality of respective execution times Exec(Hrand[G])i; computing a robustness of the list scheduling framework; and checking the robustness against a pre-determined threshold.
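
    The final two steps of this abstract (compute a robustness value, then check it against a threshold) can be illustrated as follows. The abstract does not define how robustness is computed, so the metric here is an assumption chosen only for illustration: the fraction of random heuristics whose execution time the heuristic under test matches or beats. The function names are hypothetical.

```python
def robustness(exec_h, exec_rand):
    """Fraction of random heuristics that the heuristic under test matches or beats.

    Assumed metric for illustration only; the abstract does not define how
    robustness is derived from Exec(H[G]) and Exec(Hrand[G])i.
    """
    if not exec_rand:
        raise ValueError("need at least one random-heuristic execution time")
    beaten = sum(1 for t in exec_rand if exec_h <= t)
    return beaten / len(exec_rand)


def passes_robustness_check(exec_h, exec_rand, threshold):
    """Compare the computed robustness against a pre-determined threshold."""
    return robustness(exec_h, exec_rand) >= threshold
```

    For example, with exec_h = 10.0 and random-heuristic times [12.0, 9.0, 11.0, 10.0], three of the four random heuristics are matched or beaten, so the assumed metric evaluates to 0.75.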

    Configuring a dependency graph for dynamic by-pass instruction scheduling
    5.
    Granted patent
    Legal status: Lapsed

    Publication No.: US08250557B2

    Publication date: 2012-08-21

    Application No.: US12116563

    Filing date: 2008-05-07

    IPC classification: G06F9/45

    CPC classification: G06F8/445 G06F8/4451

    Abstract: There is disclosed a method and system for configuring a data dependency graph (DDG) to handle instruction scheduling in computer architectures permitting dynamic by-pass execution, and for performing dynamic by-pass scheduling utilizing such a configured DDG. In accordance with an embodiment of the invention, a heuristic function is used to obtain a ranking of nodes in the DDG after setting delays at all identified by-pass pairs of nodes in the DDG to 0. From among a list of identified by-pass pairs of nodes, a node that is identified as being the least important to schedule early is marked as “bonded” to its successor, and the corresponding delay for that identified node is set to 0. Node rankings are re-computed and the bonded by-pass pair of nodes are scheduled in consecutive execution cycles with a delay of 0 to increase the likelihood that a by-pass can be successfully taken during run-time execution.
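
    One round of the bonding procedure in this abstract might look like the sketch below. It rests on several assumptions that the abstract does not confirm: the ranking heuristic is taken to be the longest delay-weighted path to a sink (a common list-scheduling priority), each instruction is assumed to occupy one issue cycle, "least important to schedule early" is read as "lowest rank", and only one pair is bonded per call. rank_nodes and bond_one_bypass_pair are hypothetical names.

```python
from functools import lru_cache


def rank_nodes(edges):
    """Rank = longest path (delay plus one issue cycle per edge) to any sink.

    edges maps (producer, consumer) pairs to a delay in cycles.
    """
    succs = {}
    nodes = set()
    for (u, v), delay in edges.items():
        succs.setdefault(u, []).append((v, delay))
        nodes.update((u, v))

    @lru_cache(maxsize=None)
    def height(n):
        return max((d + 1 + height(v) for v, d in succs.get(n, [])), default=0)

    return {n: height(n) for n in nodes}


def bond_one_bypass_pair(edges, bypass_pairs):
    """Pick the by-pass pair whose producer matters least and bond it."""
    # Rank with every candidate by-pass delay optimistically set to 0.
    optimistic = dict(edges)
    for pair in bypass_pairs:
        optimistic[pair] = 0
    ranks = rank_nodes(optimistic)

    # Lowest rank = least important to schedule early (assumed reading);
    # ties are broken by the pair itself to keep the result deterministic.
    bonded = min(bypass_pairs, key=lambda pair: (ranks[pair[0]], pair))

    # Only the bonded pair keeps delay 0; rankings are then recomputed.
    configured = dict(edges)
    configured[bonded] = 0
    return bonded, configured, rank_nodes(configured)
```

    A scheduler would then place the bonded producer and its successor in consecutive cycles, using the recomputed ranks to order the remaining nodes.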

    Methods, systems, and computer products for evaluating robustness of a list scheduling framework
    6.
    Granted patent
    Legal status: Lapsed

    Publication No.: US08042100B2

    Publication date: 2011-10-18

    Application No.: US11845133

    Filing date: 2007-08-27

    IPC classification: G06F9/44

    CPC classification: G06F11/3612

    Abstract: Systems, methods, and computer products for evaluating robustness of a list scheduling framework. Exemplary embodiments include a method for evaluating the robustness of a list scheduling framework, the method including identifying a set of compiler benchmarks known to be sensitive to an instruction scheduler; running the set of benchmarks against a heuristic under test, H, and collecting an execution time Exec(H[G]), where G is a directed acyclic graph; running the set of benchmarks against a plurality of random heuristics Hrand[G]i and collecting a plurality of respective execution times Exec(Hrand[G])i; computing a robustness of the list scheduling framework; and checking the robustness against a pre-determined threshold.

    METHODS AND COMPUTER PROGRAM PRODUCTS FOR REDUCING LOAD-HIT-STORE DELAYS BY ASSIGNING MEMORY FETCH UNITS TO CANDIDATE VARIABLES
    7.
    Patent application
    Legal status: Pending (published)

    Publication No.: US20090055628A1

    Publication date: 2009-02-26

    Application No.: US11842289

    Filing date: 2007-08-21

    IPC classification: G06F9/312

    Abstract: Assigning each of a plurality of memory fetch units to any of a plurality of candidate variables to reduce load-hit-store delays, wherein a total number of required memory fetch units is minimized. A plurality of store/load pairs are identified. A dependency graph is generated by creating a node Nx for each store to variable X and a node Ny for each load of variable Y and, unless X=Y, for each store/load pair, creating an edge between a respective node Nx and a corresponding node Ny; for each created edge, labeling the edge with a heuristic weight; labeling each node Nx with a node weight Wx that combines a plurality of respective edge weights of a plurality of corresponding nodes Nx such that Wx = Σ ωxj; and determining a color for each of the graph nodes using k distinct colors wherein k is minimized such that no adjacent nodes joined by an edge between a respective node Nx and a corresponding node Ny have an identical color; and assigning a memory fetch unit to each of the k distinct colors.
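
    The graph construction and coloring in this abstract can be sketched as below. Two simplifications are assumed and should be read as such: the sketch keeps one node per variable rather than separate store and load nodes, and it uses a greedy heaviest-first coloring, which keeps adjacent nodes on different colors but does not guarantee the minimal k that the abstract calls for. assign_fetch_units is a hypothetical name, and the edge weights are supplied by the caller.

```python
from collections import defaultdict


def assign_fetch_units(store_load_pairs):
    """Map each variable to a memory fetch unit index via greedy coloring.

    store_load_pairs: iterable of (stored_var, loaded_var, weight) tuples,
    where weight is a heuristic edge weight chosen by the caller.
    """
    adjacency = defaultdict(set)
    node_weight = defaultdict(float)
    for x, y, w in store_load_pairs:
        if x == y:
            continue  # no edge when the store and the load touch the same variable
        adjacency[x].add(y)
        adjacency[y].add(x)
        node_weight[x] += w  # node weight accumulates the incident edge weights
        node_weight[y] += w

    color = {}
    # Heaviest nodes first; ties are broken by name for determinism.
    for node in sorted(adjacency, key=lambda n: (-node_weight[n], n)):
        used = {color[m] for m in adjacency[node] if m in color}
        # A node has at most len(adjacency) - 1 neighbors, so a free color exists.
        color[node] = next(c for c in range(len(adjacency)) if c not in used)
    return color
```

    Each distinct color index then stands for one memory fetch unit. Variables that appear only in same-variable pairs get no edge and are left out of the result.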

    Method for Configuring a Dependency Graph for Dynamic By-Pass Instruction Scheduling
    8.
    Patent application
    Legal status: Lapsed

    Publication No.: US20080216062A1

    Publication date: 2008-09-04

    Application No.: US12116563

    Filing date: 2008-05-07

    IPC classification: G06F9/45

    CPC classification: G06F8/445 G06F8/4451

    Abstract: There is disclosed a method and system for configuring a data dependency graph (DDG) to handle instruction scheduling in computer architectures permitting dynamic by-pass execution, and for performing dynamic by-pass scheduling utilizing such a configured DDG. In accordance with an embodiment of the invention, a heuristic function is used to obtain a ranking of nodes in the DDG after setting delays at all identified by-pass pairs of nodes in the DDG to 0. From among a list of identified by-pass pairs of nodes, a node that is identified as being the least important to schedule early is marked as “bonded” to its successor, and the corresponding delay for that identified node is set to 0. Node rankings are re-computed and the bonded by-pass pair of nodes are scheduled in consecutive execution cycles with a delay of 0 to increase the likelihood that a by-pass can be successfully taken during run-time execution.

    Method and system for configuring a dependency graph for dynamic by-pass instruction scheduling
    9.
    Granted patent
    Legal status: In force

    Publication No.: US07392516B2

    Publication date: 2008-06-24

    Application No.: US10912482

    Filing date: 2004-08-05

    IPC classification: G06F9/45

    CPC classification: G06F8/445 G06F8/4451

    Abstract: There is disclosed a method and system for configuring a data dependency graph (DDG) to handle instruction scheduling in computer architectures permitting dynamic by-pass execution, and for performing dynamic by-pass scheduling utilizing such a configured DDG. In accordance with an embodiment of the invention, a heuristic function is used to obtain a ranking of nodes in the DDG after setting delays at all identified by-pass pairs of nodes in the DDG to 0. From among a list of identified by-pass pairs of nodes, a node that is identified as being the least important to schedule early is marked as “bonded” to its successor, and the corresponding delay for that identified node is set to 0. Node rankings are re-computed and the bonded by-pass pair of nodes are scheduled in consecutive execution cycles with a delay of 0 to increase the likelihood that a by-pass can be successfully taken during run-time execution.

    Automatic inspection of compiled code
    10.
    Granted patent
    Legal status: In force

    Publication No.: US07908596B2

    Publication date: 2011-03-15

    Application No.: US11620157

    Filing date: 2007-01-05

    IPC classification: G06F9/45

    CPC classification: G06F11/3688

    Abstract: Automatic inspection of compiled code. In response to revising a compiler, the functionality of that compiler is verified. Specific code is compiled using a first version of the compiler, as well as a second version of the compiler. Each compiled code is then applied to machine state to obtain multiple machine states. The machine states are then compared to determine if they are equal.
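
    The compile-twice-and-compare check in this abstract can be shown with a toy model. Everything concrete here is an assumption made for illustration: the "source" is a list of (op, register, value) tuples, the "machine state" is a dictionary of register values, and compile_v1, compile_v2, run and compilers_agree are hypothetical names. The second compiler merges consecutive adds to stand in for a revised, optimizing version.

```python
import copy


def compile_v1(source):
    """Toy first-version compiler: emit one operation per source statement."""
    return list(source)


def compile_v2(source):
    """Toy revised compiler: merge consecutive adds to the same register."""
    out = []
    for op, reg, val in source:
        if out and op == "add" and out[-1][0] == "add" and out[-1][1] == reg:
            out[-1] = ("add", reg, out[-1][2] + val)
        else:
            out.append((op, reg, val))
    return out


def run(compiled, machine_state):
    """Apply compiled code to a copy of the machine state and return the result."""
    state = copy.deepcopy(machine_state)
    for op, reg, val in compiled:
        if op == "set":
            state[reg] = val
        elif op == "add":
            state[reg] = state.get(reg, 0) + val
        else:
            raise ValueError(f"unknown op {op!r}")
    return state


def compilers_agree(source, initial_state, old_compiler, new_compiler):
    """Compile with both versions, run both, and compare the resulting states."""
    old_state = run(old_compiler(source), initial_state)
    new_state = run(new_compiler(source), initial_state)
    return old_state == new_state
```

    Because each run starts from its own copy of the initial state, any difference in the final states points at the compilers rather than at shared mutable data.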
