Packet classification apparatus and method using field level tries
    1.
    Granted patent (status: lapsed)

    Publication No.: US07415020B2

    Publication date: 2008-08-19

    Application No.: US10787298

    Filing date: 2004-02-27

    IPC classification: H04L12/28

    Abstract: A packet classification apparatus and method using field level tries includes a main processing part for generating and maintaining the field level tries, which organize a multi-field packet by field in a hierarchical structure for classification; and classification engines, each of which is provided with a first classification part for performing queries and updates and processing a prefix lookup, represented by an IP source/destination address lookup, and a second classification part for proceeding with classification field by field, based on a result of the first classification part, in order to process a range lookup belonging to that result. Accordingly, tries are built per field, so that packet classification with excellent query performance is secured for high-speed networking, and approximately half a million classifier rules can be processed.
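
    The two-stage lookup described in this abstract can be illustrated with a minimal sketch. It is not the patented design: it assumes a single binary trie over source-address bits for the prefix stage and a linear scan over inclusive port ranges for the range stage. The names Rule, TrieNode, and FieldLevelClassifier are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Rule:
    name: str
    prefix: str                  # address prefix as a bit string, e.g. "1010" (assumed encoding)
    port_range: Tuple[int, int]  # inclusive port range for the second field
    priority: int                # lower number wins (assumed tie-break policy)


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)


class FieldLevelClassifier:
    """Hypothetical two-stage classifier: prefix trie, then range check."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, rule: Rule) -> None:
        # Walk (and create) one trie level per prefix bit.
        node = self.root
        for bit in rule.prefix:
            node = node.children.setdefault(bit, TrieNode())
        node.rules.append(rule)

    def classify(self, addr_bits: str, port: int) -> Optional[Rule]:
        # Stage 1: prefix lookup. Collect rules stored on every prefix
        # of the address, from the root down to the longest match.
        node = self.root
        candidates = list(node.rules)
        for bit in addr_bits:
            node = node.children.get(bit)
            if node is None:
                break
            candidates.extend(node.rules)
        # Stage 2: range lookup restricted to the stage-1 result.
        matches = [r for r in candidates
                   if r.port_range[0] <= port <= r.port_range[1]]
        return min(matches, key=lambda r: r.priority, default=None)


classifier = FieldLevelClassifier()
classifier.insert(Rule("web", "1010", (80, 80), priority=1))
classifier.insert(Rule("any", "10", (0, 65535), priority=5))
classifier.insert(Rule("ssh", "1011", (22, 22), priority=2))
```

    A production classifier would replace the linear range scan with a dedicated range structure; the sketch only shows how the second stage is confined to the rules reached by the first.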

    Single instruction multiple data (SIMD) code generation for parallel loops using versioning and scheduling
    2.
    Granted patent (status: lapsed)

    Publication No.: US08341615B2

    Publication date: 2012-12-25

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/45 G06F9/46

    CPC classification: G06F8/456

    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system, a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and to a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition.
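
    As a rough illustration of loop versioning driven by a trip count condition and an alignment constraint, the sketch below picks one of three loop versions at run time. Everything in it is an assumption made for illustration: the function name select_loop_version, the policy names "static" and "dynamic", the default threshold of 8 iterations, and the 16-byte vector width are not taken from the patent.

```python
def select_loop_version(trip_count: int,
                        start_address: int,
                        policy: str,
                        num_threads: int,
                        chunk_size: int = 1,
                        vector_bytes: int = 16,
                        min_simd_iters: int = 8) -> str:
    """Return the name of the loop version to run (hypothetical names)."""
    if num_threads < 1 or chunk_size < 1:
        raise ValueError("num_threads and chunk_size must be positive")

    # Trip count condition: how many consecutive iterations one thread
    # receives depends on the scheduling policy (assumed model).
    if policy == "static":
        iters_per_thread = trip_count // num_threads
    elif policy == "dynamic":
        iters_per_thread = min(chunk_size, trip_count)
    else:
        raise ValueError(f"unknown scheduling policy: {policy}")

    # Below the minimal simdizable threshold, SIMD setup does not pay off.
    if iters_per_thread < min_simd_iters:
        return "parallel_scalar"

    # Alignment constraint: an unaligned start needs a peeled prologue.
    if start_address % vector_bytes == 0:
        return "parallel_simd_aligned"
    return "parallel_simd_peeled"
```

    A compiler would emit all versions ahead of time and guard them with these tests; here the guard logic alone is shown.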

    VIRTUAL MEMORY PROTOCOL SEGMENTATION OFFLOADING
    3.
    Patent application (status: active)

    Publication No.: US20090304029A1

    Publication date: 2009-12-10

    Application No.: US12254931

    Filing date: 2008-10-21

    IPC classification: H04J3/24 G06F12/00

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing segmentation offloading, such as TCP segmentation offloading (TSO). An interface performs direct virtual memory addressing of a user memory space of a system memory on behalf of a network processor to fetch payload data originated by a user process running on a host processor. Then, the network processor segments the payload data across one or more packets.
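
    The segmentation step itself can be sketched in a few lines. This is a simplified software model, not the hardware path the abstract describes: the function name segment_payload, the dictionary layout of a packet, and the byte-offset sequence numbering are assumptions.

```python
from typing import Dict, List


def segment_payload(payload: bytes, mss: int, initial_seq: int = 0) -> List[Dict]:
    """Split one large payload into packets of at most `mss` bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    packets = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        packets.append({
            # TCP sequence numbers count bytes and wrap at 2**32.
            "seq": (initial_seq + offset) % 2**32,
            "len": len(chunk),
            "data": chunk,
        })
    return packets


payload = bytes(range(10)) * 300          # 3000 bytes of sample data
packets = segment_payload(payload, mss=1460)
sizes = [p["len"] for p in packets]       # [1460, 1460, 80]
```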

    Method and system for auto parallelization of zero-trip loops through induction variable substitution
    4.

    Publication No.: US20060048119A1

    Publication date: 2006-03-02

    Application No.: US10926594

    Filing date: 2004-08-26

    IPC classification: G06F9/45

    CPC classification: G06F8/443 G06F8/452

    Abstract: A method and system of auto parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. Provided is the use of a max{0,N} variable for the number of loop iterations when no information is known about the value of N, for a typical loop iterating from 1 to N, in which N is loop invariant. For the nested basic induction variables, an induction variable substitution process is applied to the nested loops, starting from the innermost loop and proceeding to the outermost one. The max operator is then removed through a copy propagation pass of the IBM compiler. In doing so, the loop dependency on the induction variable is eliminated, giving a parallelizing compiler an opportunity to parallelize the outermost loop.
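
    To see why the max{0,N} form matters, consider a two-level loop nest that increments an induction variable k once per inner iteration. The sketch below compares the original loop-carried version with a closed-form version; the function names and the recorded (i, j, k) tuples are illustrative assumptions, and Python stands in for the compiler's intermediate representation.

```python
def sequential(n: int, m: int, k0: int = 0):
    """Original form: k carries a dependency across all iterations."""
    k = k0
    writes = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k += 1
            writes.append((i, j, k))
    return writes, k


def substituted(n: int, m: int, k0: int = 0):
    """After induction variable substitution: k is a closed form of (i, j)."""
    inner_trips = max(0, m)  # a zero-trip inner loop contributes nothing
    writes = []
    for i in range(1, n + 1):          # iterations no longer depend on each other
        for j in range(1, m + 1):
            k = k0 + (i - 1) * inner_trips + j
            writes.append((i, j, k))
    final_k = k0 + max(0, n) * inner_trips
    return writes, final_k
```

    With n = 3 and m = -1 neither loop body runs, so k must stay at k0. The closed form without max would give k0 + 3 * (-1), which is wrong; the max{0,N} guard is what keeps the substitution valid for zero-trip loops.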

    Mechanism to restrict parallelization of loops
    5.
    Granted patent (status: lapsed)

    Publication No.: US08104030B2

    Publication date: 2012-01-24

    Application No.: US11314456

    Filing date: 2005-12-21

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/4452

    Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor, that is, a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
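
    A minimal sketch of the thread-count selection follows. It assumes the simplest reading of the abstract: the thread count is capped so that each thread gets at least the minimum number of iterations. The function names and the multiplicative adjustment rule are hypothetical.

```python
def select_thread_count(trip_count: int,
                        available_threads: int,
                        min_iters_per_thread: int) -> int:
    """Pick the number of threads before the first iteration runs."""
    if available_threads < 1 or min_iters_per_thread < 1:
        raise ValueError("thread count and minimum iterations must be positive")
    if trip_count <= 0:
        return 1
    # Never use more threads than can each receive the minimum share.
    return max(1, min(available_threads, trip_count // min_iters_per_thread))


def adjust_min_iters(min_iters_per_thread: int, performance_factor: float) -> int:
    """Scale the parameter by a performance factor (assumed to be multiplicative)."""
    return max(1, round(min_iters_per_thread * performance_factor))
```

    For example, with 1000 iterations, 8 available threads, and a minimum of 100 iterations per thread, all 8 threads are used; doubling the minimum to 200 reduces the selection to 5 threads.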

    SINGLE INSTRUCTION MULTIPLE DATA (SIMD) CODE GENERATION FOR PARALLEL LOOPS USING VERSIONING AND SCHEDULING
    6.
    Patent application (status: lapsed)

    Publication No.: US20100011339A1

    Publication date: 2010-01-14

    Application No.: US12172199

    Filing date: 2008-07-11

    IPC classification: G06F9/44

    CPC classification: G06F8/456

    Abstract: Embodiments of the present invention address deficiencies of the art in respect to loop parallelization for a target architecture implementing a shared memory model and provide a novel and non-obvious method, system and computer program product for SIMD code generation for parallel loops using versioning and scheduling. In an embodiment of the invention, within a code compilation data processing system, a parallel SIMD loop code generation method can include identifying a loop in a representation of source code as a parallel loop candidate, either through a user directive or through auto-parallelization. The method also can include selecting a trip count condition responsive to a scheduling policy set for the code compilation data processing system and to a minimal simdizable threshold, determining a trip count and an alignment constraint for the selected loop, and generating a version of a parallel loop in the source code according to the alignment constraint and a comparison of the trip count to the trip count condition.

    Virtual memory protocol segmentation offloading
    7.
    Granted patent (status: active)

    Publication No.: US08411702B2

    Publication date: 2013-04-02

    Application No.: US13096973

    Filing date: 2011-04-28

    IPC classification: H04L12/56

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.
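
    The descriptor-driven variant can be modelled in software as follows. This is only an analogy: a bytearray stands in for user memory, memoryview slices stand in for direct addressing without an intermediate copy, and the names BufferDescriptor and segment_from_descriptors are hypothetical.

```python
from typing import List, NamedTuple


class BufferDescriptor(NamedTuple):
    start: int   # starting offset of a payload buffer in user memory
    length: int  # number of payload bytes in that buffer


def segment_from_descriptors(user_memory: bytearray,
                             descriptors: List[BufferDescriptor],
                             mss: int) -> List[List[memoryview]]:
    """Build packets as lists of zero-copy slices of at most `mss` bytes in total."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    view = memoryview(user_memory)   # slicing a memoryview does not copy
    packets, current, room = [], [], mss
    for desc in descriptors:
        offset, remaining = desc.start, desc.length
        while remaining > 0:
            take = min(room, remaining)
            current.append(view[offset:offset + take])
            offset += take
            remaining -= take
            room -= take
            if room == 0:            # packet is full, start the next one
                packets.append(current)
                current, room = [], mss
    if current:
        packets.append(current)
    return packets


# Two payload buffers separated by 24 bytes of unrelated data.
user_memory = bytearray(b"A" * 1000 + b"-" * 24 + b"B" * 1000)
descriptors = [BufferDescriptor(0, 1000), BufferDescriptor(1024, 1000)]
packets = segment_from_descriptors(user_memory, descriptors, mss=1460)
packet_sizes = [sum(len(part) for part in pkt) for pkt in packets]
```

    The first packet spans both buffers (1000 bytes from the first, 460 from the second), which is the case a single contiguous copy would otherwise be needed for.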

    Virtual Memory Protocol Segmentation Offloading
    8.
    Patent application (status: active)

    Publication No.: US20110200057A1

    Publication date: 2011-08-18

    Application No.: US13096973

    Filing date: 2011-04-28

    IPC classification: H04L12/56

    Abstract: Methods and systems for a more efficient transmission of network traffic are provided. According to one embodiment, a method is provided for performing transport layer protocol segmentation offloading. Multiple buffer descriptors are stored in a system memory of a network device. The buffer descriptors contain information indicative of a starting address of a payload buffer stored in a user memory space of the system memory. The payload buffers contain payload data originated by a user process running on a host processor of the network device. The payload data is retrieved from the payload buffers on behalf of a network processor of the network device without copying the payload data from the user memory space to a kernel memory space of the system memory by performing direct virtual memory addressing of the user memory space. Finally, the payload data is segmented across one or more transport layer protocol packets.

    Method and System for Auto Parallelization of Zero-Trip Loops Through the Induction Variable Substitution
    9.
    Patent application (status: lapsed)

    Publication No.: US20090158018A1

    Publication date: 2009-06-18

    Application No.: US12356978

    Filing date: 2009-01-21

    IPC classification: G06F9/44

    CPC classification: G06F8/443 G06F8/452

    Abstract: A method and system of auto parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. Provided is the use of a max{0,N} variable for the number of loop iterations when no information is known about the value of N, for a typical loop iterating from 1 to N, in which N is loop invariant. For the nested basic induction variables, an induction variable substitution process is applied to the nested loops, starting from the innermost loop and proceeding to the outermost one. The max operator is then removed through a copy propagation pass of the IBM compiler. In doing so, the loop dependency on the induction variable is eliminated, giving a parallelizing compiler an opportunity to parallelize the outermost loop.

    COMPILER METHOD OF EXPLOITING DATA VALUE LOCALITY FOR COMPUTATION REUSE
    10.
    Patent application (status: active)

    Publication No.: US20080235674A1

    Publication date: 2008-09-25

    Application No.: US11688090

    Filing date: 2007-03-19

    IPC classification: G06F9/45

    Abstract: A compiler method for exploiting data value locality for computation reuse. When a code region having single entry and exit points and in which a potential computation reuse opportunity exists is identified during runtime, a helper thread is created separate from the master thread. One of the helper thread and master thread performs a computation specified in the code region, and the other of the helper thread and master thread looks up a value of the computation previously executed and stored in a lookup table. If the value of the computation previously executed is located in the lookup table, the other thread retrieves the value from the table and ignores the computation performed by the first thread. If the value is not located, the other thread obtains the result of the computation performed by the first thread and stores it in the lookup table for future computation reuse.
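
    The division of work between the two threads can be sketched with Python's standard thread pool. This is an illustration under assumptions, not the compiler transformation itself: the class name ReuseRegion is hypothetical, the helper thread is the one that computes, the calling (master) thread does the lookup, and arguments are assumed to be hashable so that they can serve as table keys.

```python
from concurrent.futures import ThreadPoolExecutor


class ReuseRegion:
    """Hypothetical wrapper around one single-entry, single-exit code region."""

    def __init__(self, compute):
        self._compute = compute
        self._table = {}                                  # lookup table of past results
        self._pool = ThreadPoolExecutor(max_workers=1)    # the helper thread
        self.hits = 0

    def run(self, *args):
        # Helper thread starts the computation speculatively.
        future = self._pool.submit(self._compute, *args)
        # Master thread looks up a previously stored value in the meantime.
        if args in self._table:
            self.hits += 1
            future.cancel()            # best effort; the result is ignored either way
            return self._table[args]
        # Miss: take the helper's result and store it for future reuse.
        result = future.result()
        self._table[args] = result
        return result

    def close(self):
        self._pool.shutdown(wait=True)


def square(x):
    return x * x


region = ReuseRegion(square)
first = region.run(12)     # computed by the helper thread
second = region.run(12)    # served from the lookup table
region.close()
```

    Only the calling thread touches the table in this sketch, so no lock is needed; a version in which both threads access the table would need synchronization.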
