Single instruction multiple data (SIMD) code generation for parallel loops using versioning and scheduling
    3.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08341615B2

    Publication date: 2012-12-25

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/45 G06F9/46

    CPC classification: G06F8/456

    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and also on a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition.
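
The versioning decision described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the two scheduling policies, and the rule used to derive the trip count condition are all assumptions made for illustration, and a real compiler would emit both loop versions and evaluate the check at run time.

```python
import math
from enum import Enum


class LoopVersion(Enum):
    PARALLEL_SIMD = "parallel loop with SIMD body"
    PARALLEL_SCALAR = "parallel loop with scalar body"


def trip_count_condition(schedule, num_threads, chunk_size, min_simd_trip):
    """Return the smallest total trip count worth simdizing (assumed rule)."""
    if schedule == "static":
        # Static scheduling: iterations are split evenly, so every thread
        # should receive at least min_simd_trip iterations.
        return num_threads * min_simd_trip
    if schedule == "dynamic":
        # Dynamic scheduling: work is handed out in chunks, so a chunk must
        # be able to hold min_simd_trip iterations; otherwise never simdize.
        return min_simd_trip if chunk_size >= min_simd_trip else math.inf
    raise ValueError(f"unknown scheduling policy: {schedule!r}")


def select_loop_version(trip_count, aligned, schedule, num_threads,
                        chunk_size=1, min_simd_trip=8):
    """Pick a loop version from the alignment and trip count checks."""
    threshold = trip_count_condition(
        schedule, num_threads, chunk_size, min_simd_trip
    )
    if aligned and trip_count >= threshold:
        return LoopVersion.PARALLEL_SIMD
    return LoopVersion.PARALLEL_SCALAR
```

With these assumed defaults, a 1024-iteration aligned loop on 4 threads under static scheduling selects the SIMD version, while a 16-iteration loop falls back to the scalar version because 16 is below the threshold of 4 × 8 = 32.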

    VIRTUAL MEMORY PROTOCOL SEGMENTATION OFFLOADING
    4.
    Invention application
    Status: In force

    Publication No.: US20090304029A1

    Publication date: 2009-12-10

    Application No.: US12254931

    Filing date: 2008-10-21

    IPC classification: H04J3/24 G06F12/00

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing segmentation offloading, such as TCP segmentation offloading (TSO). An interface performs direct virtual memory addressing of a user memory space of a system memory on behalf of a network processor to fetch payload data originated by a user process running on a host processor. Then, the network processor segments the payload data across one or more packets.
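
The segmentation step itself can be sketched in a few lines. This is a simplified illustration, not the patented design: `segment_payload`, the `mss` parameter, and the sequence-number bookkeeping are assumptions, and `memoryview` slices merely stand in for a device reading the payload in place rather than from a copy.

```python
def segment_payload(payload, mss, initial_seq=0):
    """Split payload into (sequence_number, chunk) pairs of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    view = memoryview(payload)  # slices of a memoryview do not copy the data
    segments = []
    for offset in range(0, len(view), mss):
        # Each segment starts where the previous one ended.
        segments.append((initial_seq + offset, view[offset:offset + mss]))
    return segments
```

For a 3000-byte payload and an assumed maximum segment size of 1460 bytes, the sketch yields three segments of 1460, 1460 and 80 bytes.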

    Method and system for auto parallelization of zero-trip loops through induction variable substitution

    Publication No.: US20060048119A1

    Publication date: 2006-03-02

    Application No.: US10926594

    Filing date: 2004-08-26

    IPC classification: G06F9/45

    CPC classification: G06F8/443 G06F8/452

    Abstract: A method and system of auto parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. A max{0,N} variable is used for the loop iteration count when no information is known about the value of N, for a typical loop iterating from 1 to N, in which N is loop invariant. For the nested basic induction variables, an induction variable substitution process is applied to the nested loops, starting from the innermost loop and proceeding to the outermost one. The max operator is then removed through a copy propagation pass of the IBM compiler. In doing so, the loop dependency on the induction variable is eliminated, giving a parallelizing compiler the opportunity to parallelize the outermost loop.
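
The role of max{0,N} is easiest to see on a small nested loop. The sketch below is an illustration under assumptions (the loop shape, the function names, and the unit step of `k` are not taken from the patent): `original` carries `k` from one iteration to the next, while `substituted` computes `k` in closed form so that the outer iterations no longer depend on each other.

```python
def original(n, m, k0=0):
    """Nested loop with a basic induction variable k carried across iterations."""
    k = k0
    writes = {}
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k += 1
            writes[k] = (i, j)
    return k, writes


def substituted(n, m, k0=0):
    """Same loop after induction variable substitution, innermost loop first."""
    inner_trips = max(0, m)  # a zero-trip inner loop contributes 0, not m
    outer_trips = max(0, n)
    writes = {}
    for i in range(1, n + 1):  # iterations are now independent of each other
        for j in range(1, m + 1):
            k = k0 + (i - 1) * inner_trips + j
            writes[k] = (i, j)
    # Value of k after the loop nest, valid even when n or m is <= 0.
    return k0 + outer_trips * inner_trips, writes
```

Without the max guard, a call such as `substituted(3, -2)` would report a final `k` of -6 even though neither loop body ever runs; with the guard both versions agree on 0.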

    Mechanism to restrict parallelization of loops
    6.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08104030B2

    Publication date: 2012-01-24

    Application No.: US11314456

    Filing date: 2005-12-21

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/4452

    Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop are provided. A parameter that will be used to limit parallelization of the loop is identified. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
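
A minimal sketch of how such a parameter could cap the thread count before the loop starts is given below. The function name and the integer-division rule are assumptions for illustration; the abstract does not state the exact formula, and the adjustment by a parallel performance factor is left out.

```python
def select_thread_count(trip_count, available_threads, min_iters_per_thread):
    """Choose how many threads run the loop, decided before the first iteration."""
    if available_threads < 1 or min_iters_per_thread < 1:
        raise ValueError("thread count and minimum iterations must be >= 1")
    # Number of threads that can each receive the minimum amount of work.
    usable = trip_count // min_iters_per_thread
    # Never exceed the available threads; always keep at least one.
    return max(1, min(available_threads, usable))
```

With 8 available threads and an assumed minimum of 100 iterations per thread, a 250-iteration loop runs on 2 threads and a 50-iteration loop runs on a single thread.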

    SINGLE INSTRUCTION MULTIPLE DATA (SIMD) CODE GENERATION FOR PARALLEL LOOPS USING VERSIONING AND SCHEDULING
    7.
    Invention application
    Status: Lapsed

    Publication No.: US20100011339A1

    Publication date: 2010-01-14

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/44

    CPC classification: G06F8/456

    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and also on a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition.

    Packet classification apparatus and method using field level tries
    8.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07415020B2

    Publication date: 2008-08-19

    Application No.: US10787298

    Filing date: 2004-02-27

    IPC classification: H04L12/28

    Abstract: A packet classification apparatus and method using field level tries includes a main processing part for generating and maintaining the field level tries, which organize a multi-field packet by field in a hierarchical structure for classifications; and classification engines, each of which is provided with a first classification part for performing queries and updates and processing a prefix lookup represented by an IP source/destination address lookup, and a second classification part for proceeding with classifications by corresponding field based on a result of the first classification part in order to process a range lookup belonging to the result. Accordingly, tries in the unit of a field are developed so that packet classifications for high-speed networking with excellent query performance are secured, and approximately half a million classifier rules can be processed.
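
The two-stage lookup described here (a prefix lookup followed by a range lookup tied to its result) can be sketched as follows. The structure is only loosely modeled on the abstract and is an assumption throughout: class and method names are hypothetical, addresses are given as bit strings for brevity, and each trie node simply holds a list of port ranges that is scanned linearly.

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # "0" or "1" -> TrieNode
        self.range_rules = []  # (port_low, port_high, rule_name)


class FieldLevelClassifier:
    def __init__(self):
        self.root = TrieNode()

    def add_rule(self, prefix_bits, port_low, port_high, rule):
        """Store a rule under its address prefix, keyed by a port range."""
        node = self.root
        for bit in prefix_bits:
            node = node.children.setdefault(bit, TrieNode())
        node.range_rules.append((port_low, port_high, rule))

    def classify(self, addr_bits, port):
        """Walk the prefix trie; at each node, run the range lookup on port."""
        node, depth, best = self.root, 0, None
        while True:
            for low, high, rule in node.range_rules:
                if low <= port <= high:
                    best = rule  # deeper matches override: longest prefix wins
                    break
            if depth == len(addr_bits):
                break
            node = node.children.get(addr_bits[depth])
            if node is None:
                break
            depth += 1
        return best
```

In this sketch the longest matching prefix whose port range also matches wins, so a packet falls back to a shorter prefix when the more specific rule does not cover its port.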

    Virtual memory protocol segmentation offloading
    9.
    Granted invention patent
    Status: In force

    Publication No.: US08411702B2

    Publication date: 2013-04-02

    Application No.: US13096973

    Filing date: 2011-04-28

    IPC classification: H04L12/56

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.
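
The descriptor-driven variant can be sketched by walking a list of buffer descriptors and cutting the gathered payload into segments. Everything named here is an assumption for illustration: `BufferDescriptor`, the flat `bytearray` standing in for user memory, and the `mss` limit. `memoryview` slices are used so that no payload bytes are copied while the segments are assembled.

```python
from typing import NamedTuple


class BufferDescriptor(NamedTuple):
    start: int   # starting address of a payload buffer in user memory
    length: int  # number of payload bytes in that buffer


def segment_from_descriptors(user_memory, descriptors, mss):
    """Return segments, each a list of zero-copy views totalling <= mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    mem = memoryview(user_memory)
    segments, current, room = [], [], mss
    for desc in descriptors:
        offset, remaining = desc.start, desc.length
        while remaining > 0:
            take = min(room, remaining)
            current.append(mem[offset:offset + take])  # a view, not a copy
            offset += take
            remaining -= take
            room -= take
            if room == 0:  # segment is full, start a new one
                segments.append(current)
                current, room = [], mss
    if current:
        segments.append(current)
    return segments
```

A segment may span two payload buffers, which is why each segment is returned as a list of views rather than a single contiguous block.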

    Virtual Memory Protocol Segmentation Offloading
    10.
    Invention application
    Status: In force

    Publication No.: US20110200057A1

    Publication date: 2011-08-18

    Application No.: US13096973

    Filing date: 2011-04-28

    IPC classification: H04L12/56

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.
