Mechanism to restrict parallelization of loops
    1.
    Patent application
    Status: Lapsed

    Publication No.: US20070169057A1

    Publication date: 2007-07-19

    Application No.: US11314456

    Filing date: 2005-12-21

    IPC classification: G06F9/45

    CPC classification: G06F8/4452

    Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified to limit parallelization of the loop. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
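
The thread selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_thread_count` and its parameters are hypothetical, and the adjustment of the parameter by a parallel performance factor is not modeled.

```python
def select_thread_count(trip_count: int,
                        available_threads: int,
                        min_iters_per_thread: int) -> int:
    """Choose a thread count before the first loop iteration runs.

    Hypothetical helper: caps the thread count so that each thread
    receives at least `min_iters_per_thread` iterations.
    """
    if min_iters_per_thread < 1 or available_threads < 1:
        raise ValueError("thread count and minimum iterations must be >= 1")
    if trip_count <= 0:
        return 1  # nothing to share out; run serially
    # Number of threads that can each be given the minimum amount of work.
    affordable = trip_count // min_iters_per_thread
    return max(1, min(available_threads, affordable))
```

With a minimum of 100 iterations per thread, a 250-iteration loop would use 2 of 8 available threads, while a 50-iteration loop would stay on a single thread.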

    Mechanism to restrict parallelization of loops
    2.
    Granted patent
    Status: Lapsed

    Publication No.: US08104030B2

    Publication date: 2012-01-24

    Application No.: US11314456

    Filing date: 2005-12-21

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/4452

    Abstract: Identical to the abstract of entry 1 (US20070169057A1) above.

    Method and system for auto parallelization of zero-trip loops through induction variable substitution

    Publication No.: US20060048119A1

    Publication date: 2006-03-02

    Application No.: US10926594

    Filing date: 2004-08-26

    IPC classification: G06F9/45

    CPC classification: G06F8/443 G06F8/452

    Abstract: A method and system of auto parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. Provided is a use of a max{0,N} variable for loop iterations in case of no information is known about the value of N, for a typical loop iterating from 1 to N, in which N is the loop invariant. For the nested basic induction variables, an induction variable substitution process is applied to the nested loops starting from the innermost loop to the outermost one. Then a removal of the max operator afterwards through a copy propagation pass of the IBM compiler is provided. In doing so, the loop dependency on the induction variable is eliminated and an opportunity for a parallelizing compiler to parallel the outermost loop is provided.
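
The role of max{0,N} can be illustrated with a small sketch. It is an assumption-laden example, not the compiler transformation itself: the functions `original` and `substituted` and the unit step of `k` are hypothetical. The first version carries `k` from one iteration to the next; the second replaces it with a closed form that stays correct when the inner loop runs zero times (N <= 0).

```python
def original(m: int, n: int, k0: int = 0):
    """Nested loop with an induction variable carried across iterations."""
    k = k0
    visited = []
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            k += 1                      # loop-carried dependence on k
            visited.append((i, j, k))
    return visited, k


def substituted(m: int, n: int, k0: int = 0):
    """Same loop after substituting k with a closed-form expression."""
    inner_trips = max(0, n)             # zero-trip safe trip count
    visited = []
    for i in range(1, m + 1):           # iterations no longer depend on k
        for j in range(1, n + 1):
            k = k0 + (i - 1) * inner_trips + j
            visited.append((i, j, k))
    final_k = k0 + max(0, m) * inner_trips
    return visited, final_k
```

Without the `max`, the final value for `m = 3, n = -1` would be computed as `k0 - 3` instead of `k0`, which is why the trip count is guarded when nothing is known about N.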

    Distributed counter and centralized sensor in barrier wait synchronization

    Publication No.: US20060048147A1

    Publication date: 2006-03-02

    Application No.: US10929165

    Filing date: 2004-08-30

    IPC classification: G06F9/46

    CPC classification: G06F9/52 G06F9/522

    Abstract: A method, system and apparatus for barrier synchronization using distributed counters and a centralized sensor. The system can include multiple distributed counters coupled to corresponding application processes in a computing application. The barrier synchronization system further can include a centralized sensor coupled for observation by the application processes. Preferably, the application processes can be separate threads of execution in the computing application. The barrier synchronization centralized sensor yet further can be managed by a designated master one of the application processes. Moreover, preferably the system further can include a backup sensor coupled for observation by the application processes and managed by the designated master one of the application processes.
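
One way to picture the arrangement in this abstract is the sketch below. It is an illustration under stated assumptions, not the patented design: the class `CounterSensorBarrier` is hypothetical, thread 0 is assumed to be the designated master, counters only ever increase, the backup sensor is not modeled, and waiting is done by polling with `time.sleep(0)`.

```python
import threading
import time


class CounterSensorBarrier:
    """Barrier with one counter per thread and a single shared sensor."""

    def __init__(self, num_threads: int):
        self.counters = [0] * num_threads  # slot i is written only by thread i
        self.sensor = 0                    # written only by the master (thread 0)

    def wait(self, tid: int) -> None:
        self.counters[tid] += 1            # announce arrival at this phase
        phase = self.counters[tid]
        if tid == 0:
            # Master: poll the distributed counters, then release everyone.
            while any(c < phase for c in self.counters):
                time.sleep(0)
            self.sensor = phase
        else:
            # Workers: observe only the centralized sensor.
            while self.sensor < phase:
                time.sleep(0)


def run_demo(num_threads: int = 4, rounds: int = 3):
    barrier = CounterSensorBarrier(num_threads)
    log = []

    def worker(tid: int) -> None:
        for r in range(rounds):
            log.append(r)                  # work belonging to round r
            barrier.wait(tid)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because each thread writes only its own counter and only the master writes the sensor, no lock is needed in this sketch; the log returned by `run_demo` never shows work from a later round before all work from the earlier round.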

    Promotion of a child procedure in heterogeneous architecture software
    5.
    Granted patent
    Status: In force

    Publication No.: US08527962B2

    Publication date: 2013-09-03

    Application No.: US12400840

    Filing date: 2009-03-10

    IPC classification: G06F9/45

    CPC classification: G06F8/52

    Abstract: A method for promotion of a child procedure in a software application for a heterogeneous architecture, wherein the heterogeneous architecture comprises a first architecture type and a second architecture type, comprises inserting a parameter representing a parallel frame pointer to a parent procedure of the child procedure into the child procedure; and modifying a reference in the child procedure to a stack variable of the parent procedure to include an indirect access to the parent procedure via the parallel frame pointer.
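
The general idea of promoting a nested procedure can be sketched as follows. This is a loose analogy only: `ParentFrame`, `child_promoted`, and `parent` are hypothetical names, the frame object stands in for the parallel frame pointer, and the heterogeneous-architecture aspect is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ParentFrame:
    """Stand-in for the parent's stack frame (hypothetical)."""
    total: int
    scale: int


def child_promoted(frame: ParentFrame, x: int) -> None:
    """Former nested procedure, now top-level.

    References to the parent's variables go through the extra
    `frame` parameter instead of the enclosing scope.
    """
    frame.total += frame.scale * x


def parent(values, scale: int) -> int:
    frame = ParentFrame(total=0, scale=scale)
    for v in values:
        child_promoted(frame, v)   # pass the frame explicitly
    return frame.total
```

For example, `parent([1, 2, 3], 10)` accumulates 10 + 20 + 30 through the frame object.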

    Structure and method for managing workshares in a parallel region
    6.
    Patent application
    Status: Under examination (published)

    Publication No.: US20050080981A1

    Publication date: 2005-04-14

    Application No.: US10845553

    Filing date: 2004-05-13

    CPC classification: G06F9/5066

    Abstract: A data processing system is adapted to execute at least one workshare construct in a parallel region. The data processing system uses at least one thread for executing a corresponding subsection of the workshare construct and provides control blocks for managing corresponding workshare constructs in the parallel region. A method of managing the control blocks comprises: adding an array of control blocks to a control block queue; assigning control blocks in the initialized array to corresponding workshare constructs in the parallel region until a barrier is reached; and waiting at the barrier for all threads in the parallel region to complete their corresponding subsections and then resetting the control block to the beginning of the control block queue. Also provided are a computer program product and a data processing system for implementing the method.
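
The control block bookkeeping in this abstract can be sketched in a single-threaded form. Everything here is an assumption made for illustration: the class `ControlBlockQueue`, the array size of 4, and the dictionary used as a control block are hypothetical, and the actual waiting of threads at the barrier is left out.

```python
class ControlBlockQueue:
    """Queue of reusable control blocks for workshare constructs."""

    def __init__(self, array_size: int = 4):
        self.array_size = array_size
        self.blocks = []      # the control block queue
        self.next_index = 0   # next block to hand out

    def assign(self, workshare_id):
        """Give the next control block to a workshare construct."""
        if self.next_index == len(self.blocks):
            # Queue exhausted: add another array of control blocks.
            self.blocks.extend({"workshare": None}
                               for _ in range(self.array_size))
        block = self.blocks[self.next_index]
        block["workshare"] = workshare_id
        self.next_index += 1
        return block

    def barrier_reset(self) -> None:
        """Called once all threads have passed the barrier."""
        self.next_index = 0   # reuse blocks from the start of the queue
```

After a reset, the first workshare construct following the barrier receives the same block object that the first construct before the barrier used, so the queue does not grow with every barrier.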

    Promotion of a Child Procedure in Heterogeneous Architecture Software
    7.
    Patent application
    Status: In force

    Publication No.: US20100235811A1

    Publication date: 2010-09-16

    Application No.: US12400840

    Filing date: 2009-03-10

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/52

    Abstract: Identical to the abstract of entry 5 (US08527962B2) above.

    Single instruction multiple data (SIMD) code generation for parallel loops using versioning and scheduling
    8.
    Granted patent
    Status: Lapsed

    Publication No.: US08341615B2

    Publication date: 2012-12-25

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/45 G06F9/46

    CPC classification: G06F8/456

    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and also on a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition.
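
Loop versioning of the kind this abstract describes can be sketched as a simple selection function. The details are assumptions for illustration only: `select_loop_version` is a hypothetical name, the trip count condition is taken to be the thread count times the minimal simdizable threshold (as one might choose for a static schedule), and the three version labels are invented.

```python
def select_loop_version(trip_count: int,
                        num_threads: int,
                        min_simd_iters: int,
                        aligned: bool) -> str:
    """Pick which generated version of a parallel loop to run."""
    # Assumed trip count condition: every thread should receive at
    # least `min_simd_iters` iterations for SIMD code to pay off.
    trip_count_condition = num_threads * min_simd_iters
    if trip_count < trip_count_condition:
        return "parallel_scalar"
    if aligned:
        return "parallel_simd_aligned"
    return "parallel_simd_unaligned"
```

With 4 threads and a threshold of 8 iterations, a 16-iteration loop would fall back to the scalar version, while a 64-iteration loop would take one of the SIMD versions depending on alignment.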

    VIRTUAL MEMORY PROTOCOL SEGMENTATION OFFLOADING
    9.
    Patent application
    Status: In force

    Publication No.: US20090304029A1

    Publication date: 2009-12-10

    Application No.: US12254931

    Filing date: 2008-10-21

    IPC classification: H04J3/24 G06F12/00

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing segmentation offloading, such as TCP segmentation offloading (TSO). An interface performs direct virtual memory addressing of a user memory space of a system memory on behalf of a network processor to fetch payload data originated by a user process running on a host processor. Then, the network processor segments the payload data across one or more packets.
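
The segmentation step alone can be sketched in software as below. This is a simplified illustration: `segment_payload` is a hypothetical function, headers are reduced to a starting sequence number, and the direct virtual memory addressing performed by the interface is not modeled.

```python
def segment_payload(payload: bytes, mss: int, start_seq: int = 0):
    """Split a payload into (sequence_number, data) segments.

    `mss` is the maximum number of payload bytes per packet.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [(start_seq + offset, payload[offset:offset + mss])
            for offset in range(0, len(payload), mss)]
```

A 10-byte payload with an `mss` of 4 is split into segments of 4, 4, and 2 bytes, each tagged with the sequence number of its first byte.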

    SINGLE INSTRUCTION MULTIPLE DATA (SIMD) CODE GENERATION FOR PARALLEL LOOPS USING VERSIONING AND SCHEDULING
    10.
    Patent application
    Status: Lapsed

    Publication No.: US20100011339A1

    Publication date: 2010-01-14

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/44

    CPC classification: G06F8/456

    Abstract: Identical to the abstract of entry 8 (US08341615B2) above.