Code generation for complex arithmetic reduction for architectures lacking cross data-path support
    1.
    Patent application (status: in force)

    Publication No.: US20080092124A1

    Publication date: 2008-04-17

    Application No.: US11548851

    Filing date: 2006-10-12

    IPC classification: G06F9/45

    CPC classification: G06F8/445 G06F8/45

    Abstract: A computer-implemented method, apparatus, and computer-usable program code are provided for compiling source code that performs a complex operation followed by a complex reduction operation. A method is determined for generating executable code for performing the complex operation and the complex reduction operation. In response to a determination that a reduced single instruction, multiple data (SIMD) method is appropriate, executable code is generated for computing sub-products, reducing the sub-products to intermediate results, and summing the intermediate results to generate a final result.
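
The three steps named in the abstract (sub-products, intermediate results, final sum) can be sketched for a complex dot product. This is a minimal Python illustration, not the patented code generator: the name `complex_dot_reduced`, the `lanes` parameter, and the use of four lane-wise accumulators are assumptions made for the example. It shows that each lane only combines values in its own position, so no cross-lane data movement is needed until the final scalar sums.

```python
def complex_dot_reduced(a, b, lanes=4):
    """Sum of a[i] * b[i] for complex inputs, simulating `lanes` SIMD lanes."""
    # Intermediate results: one lane-wise accumulator per kind of sub-product.
    acc_rr = [0.0] * lanes  # re(a) * re(b)
    acc_ii = [0.0] * lanes  # im(a) * im(b)
    acc_ri = [0.0] * lanes  # re(a) * im(b)
    acc_ir = [0.0] * lanes  # im(a) * re(b)
    for base in range(0, len(a), lanes):
        for lane in range(lanes):
            i = base + lane
            if i >= len(a):
                break
            x, y = a[i], b[i]
            # Sub-products stay in their own lane (assumed layout).
            acc_rr[lane] += x.real * y.real
            acc_ii[lane] += x.imag * y.imag
            acc_ri[lane] += x.real * y.imag
            acc_ir[lane] += x.imag * y.real
    # Final result: sum the intermediate results once, after the loop.
    real = sum(acc_rr) - sum(acc_ii)
    imag = sum(acc_ri) + sum(acc_ir)
    return complex(real, imag)
```

The design choice illustrated here is to defer the subtraction and addition that mix real and imaginary parts until after the loop, which is one plausible reading of "reduced" in the abstract.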

    Analyze and reduce number of data reordering operations in SIMD code
    2.
    Granted patent (status: in force)

    Publication No.: US08954943B2

    Publication date: 2015-02-10

    Application No.: US11340452

    Filing date: 2006-01-26

    IPC classification: G06F9/45 G06F15/00 G06F15/76

    CPC classification: G06F8/443

    Abstract: A method for analyzing data reordering operations in Single Instruction Multiple Data (SIMD) source code and generating executable code therefrom is provided. Input is received. One or more data reordering operations in the input are identified, and each data reordering operation in the input is abstracted into a corresponding virtual shuffle operation so that each virtual shuffle operation forms part of an expression tree. One or more virtual shuffle trees are collapsed by combining virtual shuffle operations within at least one of the one or more virtual shuffle trees to form one or more combined virtual shuffle operations, wherein each virtual shuffle tree is a subtree of the expression tree that contains only virtual shuffle operations. Code is then generated for the one or more combined virtual shuffle operations.
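
Collapsing a chain of shuffles amounts to composing their index patterns. The sketch below is a simplified illustration under stated assumptions: each virtual shuffle takes a single input vector and is written as a list of source indices, and the tuple encoding `("shuffle", pattern, child)` / `("leaf", name)` is invented for the example. Real SIMD shuffles often take two inputs, which this sketch does not cover.

```python
def apply_shuffle(vec, pattern):
    """Reorder vec so that result[k] = vec[pattern[k]]."""
    return [vec[i] for i in pattern]


def combine(inner, outer):
    """Pattern equivalent to applying `inner` first, then `outer`."""
    # outer picks element outer[k] of the inner result, which is vec[inner[outer[k]]].
    return [inner[i] for i in outer]


def collapse(node):
    """Merge every chain of nested shuffle nodes into a single shuffle."""
    if node[0] != "shuffle":
        return node
    _, outer, child = node
    child = collapse(child)
    if child[0] == "shuffle":
        _, inner, grandchild = child
        return ("shuffle", combine(inner, outer), grandchild)
    return ("shuffle", outer, child)
```

For example, a shuffle `[2, 3, 0, 1]` applied to the result of a shuffle `[1, 0, 3, 2]` collapses to the single pattern `[3, 2, 1, 0]`, so one reordering instruction could replace two.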

    Efficient generation of SIMD code in presence of multi-threading and other false sharing conditions and in machines having memory protection support
    3.
    Granted patent (status: in force)

    Publication No.: US07730463B2

    Publication date: 2010-06-01

    Application No.: US11358372

    Filing date: 2006-02-21

    IPC classification: G06F9/45

    CPC classification: G06F9/3851 G06F8/44

    Abstract: A computer-implemented method, system, and computer program product are provided for automatically generating SIMD code. The method begins by analyzing data to be accessed by a targeted loop including at least one statement, where each statement has at least one memory reference, to determine if memory accesses are safe. If memory accesses are safe, the targeted loop is simdized. If not safe, it is determined if a scheme can be applied in which safety need not be guaranteed. If such a scheme can be applied, the targeted loop is simdized according to the scheme. If such a scheme cannot be applied, it is determined if padding is appropriate. If padding is appropriate, the data is padded and the targeted loop is simdized. If padding is not appropriate, non-simdized code is generated based on the targeted loop for handling boundary conditions, and the targeted loop is simdized and combined with the non-simdized code.
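
The decision sequence in the abstract reads as a cascade of checks. A minimal sketch follows; the function name, the three boolean inputs, and the returned strategy labels are hypothetical and stand in for analyses the abstract does not detail.

```python
def choose_simdization_strategy(accesses_safe, unsafe_scheme_available, padding_appropriate):
    """Return a label for the code-generation path described in the abstract."""
    if accesses_safe:
        return "simdize"
    if unsafe_scheme_available:
        # A scheme exists under which safety need not be guaranteed.
        return "simdize_with_scheme"
    if padding_appropriate:
        return "pad_data_then_simdize"
    # Fall back: scalar code for the boundaries, SIMD code for the rest.
    return "simdize_with_scalar_boundary_code"
```

Note that every path ends with the loop being simdized in some form; the checks only decide how the unsafe boundary accesses are handled.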

    Method to exploit superword-level parallelism using semi-isomorphic packing
    4.
    Patent application (status: lapsed)

    Publication No.: US20080127144A1

    Publication date: 2008-05-29

    Application No.: US11536990

    Filing date: 2006-09-29

    IPC classification: G06F9/45

    CPC classification: G06F8/456

    Abstract: A computer program product is provided for extracting SIMD parallelism. The computer program product includes instructions for providing a stream of input code comprising basic blocks; identifying pairs of statements that are semi-isomorphic with respect to each other within a basic block; iteratively combining into packs, pairs of statements that are semi-isomorphic with respect to each other, and combining packs into combined packs; collecting packs whose statements can be scheduled together for processing; and generating SIMD instructions for each pack to provide for extracting the SIMD parallelism.
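
The pairing step can be illustrated with a toy matcher. The abstract does not define "semi-isomorphic"; the sketch assumes it means two expression trees with the same shape whose operators may differ at corresponding nodes (for example an add paired with a subtract). The tuple encoding of expressions and both function names are hypothetical, and dependence and scheduling checks are left out.

```python
def same_shape(a, b):
    """True if two expression trees match node for node, ignoring operators."""
    a_leaf = not isinstance(a, tuple)
    b_leaf = not isinstance(b, tuple)
    if a_leaf or b_leaf:
        return a_leaf and b_leaf
    # Inner nodes are (operator, child, child, ...); operators may differ.
    return len(a) == len(b) and all(
        same_shape(x, y) for x, y in zip(a[1:], b[1:])
    )


def pair_statements(block):
    """Greedily pair statements of one basic block whose trees share a shape."""
    pairs, used = [], set()
    for i in range(len(block)):
        if i in used:
            continue
        for j in range(i + 1, len(block)):
            if j not in used and same_shape(block[i], block[j]):
                pairs.append((i, j))
                used.update((i, j))
                break
    return pairs
```

With a block holding `a + b`, `(c + d) * e`, and `f - g`, the first and third statements pair up even though one adds and the other subtracts, while the second has a different shape and stays unpaired.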

    Method and system for versioning codes based on relative alignment for single instruction multiple data units
    5.
    Granted patent (status: lapsed)

    Publication No.: US07673284B2

    Publication date: 2010-03-02

    Application No.: US11333614

    Filing date: 2006-01-17

    IPC classification: G06F9/44 G06F9/45

    CPC classification: G06F8/49

    Abstract: A method and system are provided for generating efficient versioned code for single instruction, multiple data (SIMD) units whose memory systems have alignment constraints. The system creates multiple versions of the code based on relative alignments of the data streams involved in the computation. The system also analyzes characteristics of the relative alignments (e.g., compile-time or runtime) to determine, based on a cost model, whether code versioning is beneficial.
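
A rough sketch of the idea, under loudly stated assumptions: a 16-byte vector width, two data streams identified by their start addresses, and a deliberately simple cost model. None of the names or formulas below come from the patent; they only show what "relative alignment" and a version-selection check could look like.

```python
VECTOR_BYTES = 16  # assumed vector register width in bytes


def relative_alignment(addr_a, addr_b):
    """Offset of stream a relative to stream b, modulo the vector width."""
    return (addr_a - addr_b) % VECTOR_BYTES


def select_version(addr_a, addr_b):
    """Runtime check choosing between two precompiled loop versions."""
    # Streams with the same offset can be processed without realigning data.
    return "no_shift" if relative_alignment(addr_a, addr_b) == 0 else "shift"


def versioning_is_beneficial(trip_count, shift_cost_per_iter, check_cost):
    """Hypothetical cost model: saved realignment work must exceed the check."""
    return trip_count * shift_cost_per_iter > check_cost
```

When the relative alignment is known at compile time no runtime check is needed, which is presumably why the abstract distinguishes the compile-time and runtime cases.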

    Generating optimized SIMD code in the presence of data dependences
    6.
    Patent application (status: in force)

    Publication No.: US20080127059A1

    Publication date: 2008-05-29

    Application No.: US11535181

    Filing date: 2006-09-26

    IPC classification: G06F9/44

    CPC classification: G06F8/447 G06F8/43

    Abstract: A method for generating code includes: identifying at least one portion of source code that is simdizable and has a dependence; analyzing the dependence for characteristics; based upon the characteristics, selecting a transformation from a predefined group of transformations; and applying the transformation to the at least one portion to generate SIMD code for the at least one portion.
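
One dependence characteristic that commonly matters is the dependence distance. The sketch below is an illustration only: the abstract does not list the characteristics or the predefined transformations, so the loop `a[i] = a[i - d] + 1`, the `lanes` parameter, and the two labels returned by `select_transformation` are assumptions. It simulates a vector loop that loads a whole vector before storing and shows when that matches the scalar loop.

```python
def scalar_loop(a, d):
    """Reference semantics: a[i] = a[i - d] + 1, one element at a time."""
    a = list(a)
    for i in range(d, len(a)):
        a[i] = a[i - d] + 1
    return a


def simd_loop(a, d, lanes):
    """Same loop, processed `lanes` elements at a time (load all, then store all)."""
    a = list(a)
    for i in range(d, len(a), lanes):
        hi = min(i + lanes, len(a))
        loaded = a[i - d:hi - d]             # whole vector load happens first
        a[i:hi] = [x + 1 for x in loaded]    # then the whole vector store
    return a


def select_transformation(d, lanes):
    """Hypothetical selector keyed on dependence distance."""
    return "simdize_directly" if d >= lanes else "needs_other_transformation"
```

With 4 lanes, a distance of 4 gives the same result as the scalar loop, while a distance of 2 does not, because the vector load reads elements that the scalar loop would already have updated.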

    Method to exploit superword-level parallelism using semi-isomorphic packing
    7.
    Granted patent (status: lapsed)

    Publication No.: US08136105B2

    Publication date: 2012-03-13

    Application No.: US11536990

    Filing date: 2006-09-29

    IPC classification: G06F9/45

    CPC classification: G06F8/456

    Abstract: A computer program product is provided for extracting SIMD parallelism. The computer program product includes instructions for providing a stream of input code comprising basic blocks; identifying pairs of statements that are semi-isomorphic with respect to each other within a basic block; iteratively combining into packs, pairs of statements that are semi-isomorphic with respect to each other, and combining packs into combined packs; collecting packs whose statements can be scheduled together for processing; and generating SIMD instructions for each pack to provide for extracting the SIMD parallelism.

    Generating optimized SIMD code in the presence of data dependences
    8.
    Granted patent (status: in force)

    Publication No.: US08037464B2

    Publication date: 2011-10-11

    Application No.: US11535181

    Filing date: 2006-09-26

    IPC classification: G06F9/45

    CPC classification: G06F8/447 G06F8/43

    Abstract: A method for generating code includes: identifying at least one portion of source code that is simdizable and has a dependence; analyzing the dependence for characteristics; based upon the characteristics, selecting a transformation from a predefined group of transformations; and applying the transformation to the at least one portion to generate SIMD code for the at least one portion.

    Sparse vectorization without hardware gather/scatter
    9.
    Granted patent (status: lapsed)

    Publication No.: US08191056B2

    Publication date: 2012-05-29

    Application No.: US11549172

    Filing date: 2006-10-13

    IPC classification: G06F9/45

    CPC classification: G06F8/447

    Abstract: A target operation in a normalized target loop is identified in source code. The target operation is susceptible of vectorization and may, after compilation into a vectorized form, seek to operate on data in nonconsecutive physical memory. Hardware instructions are inserted into executable code generated from the source code, directing a system that will run the executable code to create a representation of the data in consecutive physical memory. A vector loop containing the target operation is replaced, in the executable code, with a function call to a vector library to call a vector function that will operate on the representation to generate a result identical to output expected from executing the vector loop containing the target operation. On execution, a representation of data residing in nonconsecutive physical memory is created in consecutive physical memory, and the vectorized target operation is applied to the representation to process the data.
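
The runtime behaviour described in the last sentence can be sketched as pack, compute, and write back. This is an illustration under assumptions: `vector_sqrt` stands in for a vector library routine, the index list models the nonconsecutive locations, and the write-back step is an assumption, since the abstract only says the result must match the original loop's output.

```python
import math


def vector_sqrt(buf):
    """Stand-in for a vector library function that needs contiguous input."""
    return [math.sqrt(x) for x in buf]


def sparse_apply(data, indices, vector_fn):
    """Apply vector_fn to data[indices] via a contiguous temporary buffer."""
    packed = [data[i] for i in indices]   # contiguous representation of the data
    result = vector_fn(packed)            # one call replaces the vector loop
    out = list(data)
    for i, r in zip(indices, result):     # assumed write-back to original slots
        out[i] = r
    return out
```

Calling `sparse_apply([1.0, 4.0, 9.0, 16.0, 25.0], [0, 2, 4], vector_sqrt)` touches only elements 0, 2, and 4 and leaves the others unchanged.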

    Procedure control descriptor-based code specialization for context sensitive memory disambiguation
    10.
    Granted patent (status: in force)

    Publication No.: US08332833B2

    Publication date: 2012-12-11

    Application No.: US11757941

    Filing date: 2007-06-04

    IPC classification: G06F9/45

    CPC classification: G06F8/4441

    Abstract: A computer-implemented method for facilitating debugging of source code. The source code is scanned to identify a candidate region. A procedure control descriptor is generated, wherein the procedure control descriptor corresponds to the candidate region. The procedure control descriptor identifies, for the candidate region, a condition that, if true at runtime, means that the candidate region can be specialized. Responsive to a determination during compile time that satisfaction of at least one condition will be known only at runtime, the procedure control descriptor is used to specialize the candidate region at compile time to create a first version of the candidate region for execution in a case where the condition is true and a second version of the candidate region for execution in a case where the condition is false, and further to generate code to correctly select one of the first version and the second version at runtime.
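
The two-version scheme can be pictured with a small aliasing example, in line with the title's mention of memory disambiguation. Everything here is hypothetical: the `descriptor` dictionary merely stands in for a procedure control descriptor, the condition "the two buffers are different objects" is an assumed example of a fact known only at runtime, and both buffers are assumed to have equal length.

```python
def shift_double_general(dst, src):
    """Second version: safe for any inputs, element by element in program order."""
    for i in range(len(src) - 1):
        dst[i + 1] = 2 * src[i]


def shift_double_no_alias(dst, src):
    """First version: bulk update, valid only if dst and src do not alias."""
    dst[1:] = [2 * x for x in src[:-1]]


# Hypothetical stand-in for a procedure control descriptor.
descriptor = {
    "region": "shift_double",
    "condition": lambda dst, src: dst is not src,
}


def shift_double(dst, src):
    """Runtime selection code generated alongside the two versions."""
    if descriptor["condition"](dst, src):
        shift_double_no_alias(dst, src)
    else:
        shift_double_general(dst, src)
```

When the same list is passed as both arguments, the general version runs and each element sees the update made to its predecessor; with distinct lists the bulk version gives the same answer the general one would.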
