Determination of loop unrolling factor for software loops
    1.
    Patent application
    Status: under examination (published)

    Publication number: US20050283772A1

    Publication date: 2005-12-22

    Application number: US10874614

    Filing date: 2004-06-22

    IPC classification: G06F9/45

    CPC classification: G06F8/4452

    Abstract: Disclosed are embodiments of a method and system for calculating an unrolling factor for software loops. The unrolling factor may be calculated by applying a formula that takes into account issue constraints of a processor. The issue constraints may include the total issue width of the processor, and may also include individual issue constraints for each instruction type. The software loop may be unrolled by the calculated unrolling factor and may be software pipelined. Other embodiments are also described and claimed.
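
    The abstract does not state the formula itself. As an illustration only, the sketch below assumes one plausible heuristic: pick the smallest unrolling factor for which the unrolled body fills whole issue groups, both for the total issue width and for each per-type issue limit. The function name, the parameters, and the max_unroll cap are hypothetical and are not taken from the application.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def unroll_factor(type_counts, issue_width, type_limits, max_unroll=8):
    """Smallest factor that makes every issue constraint divide evenly.

    Hypothetical heuristic, not the formula claimed in the application.
    type_counts: instructions of each type in one loop iteration.
    issue_width: total instructions the processor can issue per cycle.
    type_limits: per-cycle issue limit for each instruction type.
    """
    total = sum(type_counts.values())
    # Factor needed so that total * factor is a multiple of issue_width.
    factor = issue_width // gcd(total, issue_width)
    for kind, count in type_counts.items():
        if count == 0:
            continue
        limit = type_limits[kind]
        # Factor needed so that count * factor is a multiple of this limit.
        factor = lcm(factor, limit // gcd(count, limit))
    # Cap the result to bound code growth (the cap is an assumption).
    return min(factor, max_unroll)
```

    For a loop body with 2 memory and 4 ALU instructions on a hypothetical 6-wide machine that issues at most 2 memory and 6 ALU instructions per cycle, this sketch returns 3: three iterations give 6 memory, 12 ALU, and 18 total instructions, each a whole number of issue groups.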

    System, method, and apparatus for spilling and filling rotating registers in software-pipelined loops
    5.
    Patent application
    Status: lapsed

    Publication number: US20050071607A1

    Publication date: 2005-03-31

    Application number: US10673741

    Filing date: 2003-09-29

    Applicant: Kalyan Muthukumar

    Inventor: Kalyan Muthukumar

    IPC classification: G06F9/30 G06F9/45 G06F15/00

    Abstract: An efficient method for software pipelining (SWP) of loops when translating programs from higher-level languages into equivalent object or machine-language code for execution on a computer. In one example embodiment, this is accomplished by spilling and filling multiple computed values in a register that are live across multiple stages of a software-pipelined loop, using multiple rotating stack memory locations to reduce the compile time of SWP and the complexity of the implemented SWP.

    ENERGY OPTIMIZATION TECHNIQUES IN A COMPUTING SYSTEM
    6.
    Patent application
    Status: in force

    Publication number: US20120089852A1

    Publication date: 2012-04-12

    Application number: US13018810

    Filing date: 2011-02-01

    IPC classification: G06F1/32

    Abstract: A computing platform may include components to determine performance loss values and energy savings values for each of a plurality of regions within an application, and/or the memory boundedness value of each of those regions. The computing platform may provide a user interface through which a user provides an input indicating an acceptable performance loss. For the provided performance loss value, frequency values may be determined, and the processing element may be operated at those frequency values while processing each of the plurality of regions.
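
    The abstract does not say how the frequency values are derived from the acceptable performance loss. A minimal sketch, assuming each region comes with a table of candidate frequencies and their estimated performance loss and energy savings, and assuming the bound is applied to each region separately. The function name, the table layout, and the fallback rule are hypothetical.

```python
def pick_frequencies(regions, acceptable_loss):
    """Choose one operating frequency per region.

    regions: {region_name: {freq_ghz: (perf_loss_pct, energy_savings_pct)}}
    acceptable_loss: user-supplied bound on performance loss, in percent.
    """
    chosen = {}
    for name, table in regions.items():
        # Keep only the frequencies whose loss is within the user's bound.
        allowed = {f: v for f, v in table.items() if v[0] <= acceptable_loss}
        if not allowed:
            # Nothing qualifies: fall back to the highest frequency.
            chosen[name] = max(table)
            continue
        # Maximize energy savings; break ties in favor of higher frequency.
        chosen[name] = max(allowed, key=lambda f: (allowed[f][1], f))
    return chosen
```

    With made-up numbers in which a memory-bound region loses little performance at low frequency and a compute-bound region loses a lot, a 10% bound keeps the compute-bound region at full frequency and drops the memory-bound region to the lowest one.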

    METHOD AND SYSTEM FOR PARALLEL EXECUTION OF MEMORY INSTRUCTIONS IN AN IN-ORDER PROCESSOR
    7.
    Patent application
    Status: under examination (published)

    Publication number: US20100077145A1

    Publication date: 2010-03-25

    Application number: US12238341

    Filing date: 2008-09-25

    IPC classification: G06F12/08 G06F9/30

    Abstract: A method of parallel execution of a first and a second instruction in an in-order processor. Embodiments of the invention enable parallel execution of memory instructions that are stalled by cache memory misses. The in-order processor processes cache memory misses of instructions in parallel by overlapping the first cache memory miss with cache memory misses that occur after the first cache memory miss. Memory-level parallelism in the in-order processor can be increased when more parallel and outstanding cache memory misses are generated.

    Method for predicate promotion in a software loop
    8.
    Patent application
    Status: lapsed

    Publication number: US20070079302A1

    Publication date: 2007-04-05

    Application number: US11241144

    Filing date: 2005-09-30

    IPC classification: G06F9/45

    CPC classification: G06F8/443 G06F8/433

    Abstract: A method and system for optimizing the execution of a software loop is provided. The method involves the determination of an edge in a critical recurrence cycle in the software loop. The edge is a dependency link between two instructions and contains a dependee and a dependent. The dependee is an instruction that produces a result, and the dependent is an instruction that uses the result. The method further involves performing predicate promotion of at least one of the dependee and the dependent if one or more pre-determined conditions are met.

    Method of, system for, and computer program product for providing efficient utilization of memory hierarchy through code restructuring
    9.
    Granted patent
    Status: in force

    Publication number: US06839895B1

    Publication date: 2005-01-04

    Application number: US09685481

    Filing date: 2000-10-10

    IPC classification: G06F9/45

    CPC classification: G06F8/4442

    Abstract: Code restructuring or reordering based on profiling information and memory hierarchy is provided by constructing a Program Execution Graph (PEG) corresponding to a level of the memory hierarchy, partitioning this PEG to reduce estimated memory overhead costs below an upper bound, and constructing a PEG for a next level of the memory hierarchy from the partitioned PEG. The PEG is constructed from control flow and frequency information from a profile of the program to be restructured. The PEG is a weighted undirected graph comprising nodes representing basic blocks and edges representing transfer of control between pairs of basic blocks. The weight of a node is the size of the basic block it represents, and the weight of an edge is the frequency of transition between the pair of basic blocks it connects.
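
    The PEG described in the abstract can be sketched as two weight tables. The code below covers only the construction step, not the partitioning. The input format (a size table plus profiled transition counts), the function name, and the choice to ignore self-loops are assumptions made for illustration.

```python
from collections import defaultdict


def build_peg(block_sizes, transitions):
    """Build a weighted undirected Program Execution Graph.

    block_sizes: {block_name: size_in_bytes}
    transitions: iterable of (src_block, dst_block, count) from a profile.
    Returns (node_weights, edge_weights); edges are keyed by frozenset pairs.
    """
    # Node weight is the size of the basic block.
    node_weights = dict(block_sizes)
    edge_weights = defaultdict(int)
    for src, dst, count in transitions:
        if src == dst:
            # Assumption: a self-loop does not connect a pair of blocks.
            continue
        # Undirected: A->B and B->A accumulate on the same edge.
        edge_weights[frozenset((src, dst))] += count
    return node_weights, dict(edge_weights)
```

    Keying edges by frozenset makes the graph undirected by construction, so transfers of control in both directions between two blocks add up to a single edge weight.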

    Early exit transformations for software pipelining
    10.
    Granted patent
    Status: in force

    Publication number: US06571385B1

    Publication date: 2003-05-27

    Application number: US09273947

    Filing date: 1999-03-22

    IPC classification: G06F9/44

    Abstract: The invention is directed to the transformation of software loops having early exit conditions, thereby allowing the loops to be more effectively converted to a single basic block for software pipelining. The invention assigns a predicate register for each early exit condition of the software loop. The predicate registers are set when the corresponding early exit condition is satisfied. In this manner, when the loop terminates, the predicate registers can be examined to indicate which early exit conditions were satisfied. The invention produces loops having a lower recurrence initiation interval (II) and resource II than conventional techniques.
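
    The idea in the abstract can be modeled at source level, with booleans standing in for predicate registers. This is a sketch of the general shape only, not the patented transformation: the function, the two exit conditions, and the return values are hypothetical.

```python
def scan(values, target, limit):
    """Single-exit form of a loop with two early exit conditions.

    Original form (hypothetical):
        for v in values:
            if v == target: break      # early exit 1
            if v > limit: break        # early exit 2
    """
    p_found = False      # stands in for the predicate of early exit 1
    p_overflow = False   # stands in for the predicate of early exit 2
    i = 0
    # One combined loop test replaces the two separate break statements.
    while i < len(values) and not (p_found or p_overflow):
        v = values[i]
        p_found = v == target
        # Exit 2 is only reached in the original if exit 1 was not taken.
        p_overflow = (not p_found) and v > limit
        if not (p_found or p_overflow):
            i += 1
    # After the loop, the predicates tell which exit condition fired.
    if p_found:
        return ("found", i)
    if p_overflow:
        return ("overflow", i)
    return ("exhausted", i)
```

    Recording each exit condition in its own flag leaves the loop with one exit test, and the code after the loop recovers which condition ended it by examining the flags.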