Building Approximate Data Dependences with a Moving Window
    1.
    Invention patent application
    Status: Lapsed

    Publication No.: US20110219222A1

    Publication date: 2011-09-08

    Application No.: US12717985

    Filing date: 2010-03-05

    IPC classification: G06F9/32

    CPC classification: G06F9/32

    Abstract: Mechanisms for building approximate data dependences using a moving look-back window are provided. The mechanisms track dependence information for memory accesses over iterations of execution of a portion of code. The mechanisms receive a memory access of an iteration of the portion of code, the memory access having an address for accessing the memory and an access type indicating at least one of a read or a write access type. An entry in a moving look-back window data structure is generated corresponding to a memory location accessed by the memory access. The entry comprises at least an identification of the address, the access type, and an iteration number corresponding to the iteration of the memory access. The moving look-back window data structure is utilized to determine dependence information for memory accesses over a plurality of iterations of the portion of code.
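
The look-back window described in this abstract can be illustrated with a small sketch. This is not the patented implementation: the class name, the fixed capacity, the oldest-first eviction policy, and the choice to keep only the latest access per address are all assumptions made for illustration. Because old entries fall out of the window and earlier accesses to an address are overwritten, the reported dependences are approximate.

```python
from collections import OrderedDict

# Hypothetical names throughout; an illustrative sketch only.
# Maps (previous access type + current access type) to a dependence kind.
DEPENDENCE_KIND = {"WR": "flow", "RW": "anti", "WW": "output"}


class LookBackWindow:
    """Bounded table holding the most recent access to each tracked address."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> (access_type, iteration)
        self.dependences = set()      # (kind, source_iter, sink_iter, address)

    def record(self, address, access_type, iteration):
        previous = self.entries.pop(address, None)
        if previous is not None:
            prev_type, prev_iter = previous
            # Only cross-iteration pairs involving at least one write count.
            if prev_iter != iteration and "W" in (prev_type, access_type):
                kind = DEPENDENCE_KIND[prev_type + access_type]
                self.dependences.add((kind, prev_iter, iteration, address))
        # Re-insert so this address becomes the newest entry in the window.
        self.entries[address] = (access_type, iteration)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # slide the window: drop the oldest


# Trace of the loop `a[i] = a[i - 1] + 1` for i = 1..4.
window = LookBackWindow(capacity=2)
for i in range(1, 5):
    window.record(i - 1, "R", i)  # read a[i - 1]
    window.record(i, "W", i)      # write a[i]

for dependence in sorted(window.dependences):
    print(dependence)
```

With a capacity of 2 the window still catches the distance-1 flow dependences of this loop; a dependence whose source has already been evicted would be missed, which is the trade-off a bounded window makes.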


    System and Method for Selective Code Generation Optimization for an Advanced Dual-Representation Polyhedral Loop Transformation Framework
    3.
    Invention patent application
    Status: Lapsed

    Publication No.: US20090083702A1

    Publication date: 2009-03-26

    Application No.: US11861493

    Filing date: 2007-09-26

    IPC classification: G06F9/44

    CPC classification: G06F8/4452

    Abstract: A system and method for selective code generation optimization for an advanced dual-representation polyhedral loop transformation framework are provided. The mechanisms of the illustrative embodiments address the weaknesses of the known polyhedral loop transformation based approaches by providing mechanisms for performing code generation transformations on individual statement instances in an intermediate representation generated by the polyhedral loop transformation optimization of the source code. These code generation transformations have the important property that they do not change program order of the statements in the intermediate representation. This property allows the result of the code generation transformations to be provided back to the polyhedral loop transformation mechanisms in a program statement view, via a new re-entrance path of the illustrative embodiments, for additional optimization.


    Efficient Enqueuing of Values in SIMD Engines with Permute Unit
    4.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130151822A1

    Publication date: 2013-06-13

    Application No.: US13315596

    Filing date: 2011-12-09

    IPC classification: G06F9/38

    Abstract: Mechanisms, in a data processing system having a processor, for generating enqueued data for performing computations of a conditional branch of code are provided. Mask generation logic of the processor operates to generate a mask representing a subset of iterations of a loop of the code that results in a condition of the conditional branch being satisfied. The mask is used to select data elements from an input data element vector register corresponding to the subset of iterations of the loop of the code that result in the condition of the conditional branch being satisfied. Furthermore, the selected data elements are used to perform computations of the conditional branch of code. Iterations of the loop of the code that do not result in the condition of the conditional branch being satisfied are not used as a basis for performing computations of the conditional branch of code.
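
The mask-and-select idea in this abstract can be modeled in scalar code. The sketch below is only a software analogy: real hardware would use a permute unit on vector registers, and the function names, the vector width of 4, and the Python list standing in for the queue are assumptions made for illustration.

```python
# Scalar model of mask generation and enqueuing; all names are hypothetical.

def make_mask(vector, condition):
    """One boolean per lane: True where the branch condition holds."""
    return [bool(condition(x)) for x in vector]


def compress(vector, mask):
    """Keep only the lanes selected by the mask, preserving their order."""
    return [x for x, selected in zip(vector, mask) if selected]


def enqueue(queue, vector, mask, width):
    """Append selected lanes to the queue and return any full vectors."""
    queue.extend(compress(vector, mask))
    full_vectors = []
    while len(queue) >= width:
        full_vectors.append(queue[:width])
        del queue[:width]
    return full_vectors


WIDTH = 4
data = list(range(12))
queue, ready = [], []
for start in range(0, len(data), WIDTH):
    vector = data[start:start + WIDTH]
    mask = make_mask(vector, lambda x: x % 2 == 1)  # condition: element is odd
    ready.extend(enqueue(queue, vector, mask, WIDTH))

print(ready)  # full vectors of elements that satisfied the condition
print(queue)  # leftover elements waiting for a full vector
```

Only elements whose iterations satisfy the condition reach `ready`, so the branch body can run on densely packed vectors rather than on partially masked ones.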


    Thread Specific Compiler Generated Customization of Runtime Support for Application Programming Interfaces
    5.
    Invention patent application
    Status: Under examination (published)

    Publication No.: US20130283250A1

    Publication date: 2013-10-24

    Application No.: US13453411

    Filing date: 2012-04-23

    IPC classification: G06F9/45

    CPC classification: G06F8/43

    Abstract: Mechanisms are provided for generating a customized runtime library for source code. Source code is analyzed to identify a region of code implementing an application programming interface or programming standard of interest. An invocation tree data structure is generated based on results of analysis of functions of the application programming interface or programming standard of interest that the region of code invokes. A custom runtime library is generated based on the invocation tree data structure. The custom runtime library comprises only a subset of runtime library functions, less than a total number of runtime library functions for the application programming interface or programming standard of interest, actually invoked by the region of code and does not include all runtime library functions in the total number of runtime library functions for the application programming interface or programming standard of interest.
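
A minimal sketch of the subsetting step follows, assuming the callee relationships of the runtime library are already known. The `rt_*` function names, the dictionary-based representation of the invocation structure, and the string placeholders for function bodies are all hypothetical; the abstract does not say how the tree is stored.

```python
# Illustrative only: derive a reduced runtime library from reachable functions.

def build_invocation_tree(entry_points, callees):
    """Map every runtime function reachable from the region to its callees."""
    tree = {}
    pending = list(entry_points)
    while pending:
        function = pending.pop()
        if function in tree:
            continue  # already visited
        tree[function] = list(callees.get(function, []))
        pending.extend(tree[function])
    return tree


def build_custom_library(tree, full_library):
    """Keep only the library functions that appear in the invocation tree."""
    return {name: body for name, body in full_library.items() if name in tree}


# Hypothetical runtime library: function name -> functions it calls.
callees = {
    "rt_parallel_start": ["rt_alloc_team", "rt_barrier"],
    "rt_alloc_team": ["rt_lock"],
    "rt_barrier": [],
    "rt_lock": [],
    "rt_task_spawn": ["rt_lock"],
    "rt_reduce": ["rt_barrier"],
}
full_library = {name: f"<code of {name}>" for name in callees}

# The analyzed region is assumed to invoke only rt_parallel_start directly.
tree = build_invocation_tree(["rt_parallel_start"], callees)
custom_library = build_custom_library(tree, full_library)
print(sorted(custom_library))
```

Here `rt_task_spawn` and `rt_reduce` are never reached from the region, so they are left out of the custom library.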


    Constant Time Worker Thread Allocation Via Configuration Caching
    6.
    Invention patent application
    Status: Granted

    Publication No.: US20120246654A1

    Publication date: 2012-09-27

    Application No.: US13070811

    Filing date: 2011-03-24

    IPC classification: G06F9/46

    CPC classification: G06F9/5066

    Abstract: Mechanisms are provided for allocating threads for execution of a parallel region of code. A request for allocation of worker threads to execute the parallel region of code is received from a master thread. Cached thread allocation information identifying prior thread allocations that have been performed for the master thread is accessed. Worker threads are allocated to the master thread based on the cached thread allocation information. The parallel region of code is executed using the allocated worker threads.
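
The caching idea can be sketched as a lookup keyed by master thread and requested team size. Everything here is an assumption for illustration: the class and method names, the integer worker ids, and the fallback search are not taken from the patent, and this sketch does not reproduce the constant-time bound named in the title (the availability check is linear in the team size).

```python
# Illustrative sketch of reusing a cached worker allocation; names are hypothetical.

class ThreadAllocator:
    def __init__(self, worker_ids):
        self.free = set(worker_ids)
        self.cache = {}  # (master, count) -> tuple of worker ids used last time
        self.hits = 0

    def allocate(self, master, count):
        key = (master, count)
        cached = self.cache.get(key)
        if cached is not None and self.free.issuperset(cached):
            # Fast path: reuse the same workers as the previous allocation.
            workers = cached
            self.hits += 1
        else:
            # Slow path: search the free set and remember the result.
            if count > len(self.free):
                raise RuntimeError("not enough free worker threads")
            workers = tuple(sorted(self.free)[:count])
            self.cache[key] = workers
        self.free.difference_update(workers)
        return workers

    def release(self, workers):
        self.free.update(workers)


allocator = ThreadAllocator(range(8))
first = allocator.allocate("master-0", 3)   # slow path, result is cached
allocator.release(first)
second = allocator.allocate("master-0", 3)  # fast path, same workers reused
print(first, second, allocator.hits)
```

Reusing the same workers for a repeated parallel region also tends to preserve whatever per-thread state those workers already hold, which is one plausible motivation for caching the configuration.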


    Arranging Binary Code Based on Call Graph Partitioning
    7.
    Invention patent application
    Status: Granted

    Publication No.: US20110321021A1

    Publication date: 2011-12-29

    Application No.: US12823244

    Filing date: 2010-06-25

    IPC classification: G06F9/45

    CPC classification: G06F8/4442

    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph is combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
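
The abstract does not name a partitioning algorithm, so the sketch below substitutes a simple greedy heuristic: merge the two groups joined by the heaviest remaining edge whenever the merged code still fits in one cache line. The function names, code sizes, edge weights, and the 128-byte line size are invented for illustration, and node weights are omitted for brevity.

```python
# Greedy, illustrative call-graph partitioning; not the method claimed in the patent.

def partition_call_graph(sizes, edge_weights, line_size):
    """Group functions so each group's total code size fits in one cache line."""
    group_of = {fn: fn for fn in sizes}      # function -> group representative
    members = {fn: [fn] for fn in sizes}     # representative -> functions
    group_size = dict(sizes)                 # representative -> total bytes
    # Visit edges from heaviest (most frequent calls) to lightest.
    for (a, b), _weight in sorted(edge_weights.items(), key=lambda kv: -kv[1]):
        ga, gb = group_of[a], group_of[b]
        if ga == gb or group_size[ga] + group_size[gb] > line_size:
            continue  # already together, or the merge would overflow the line
        members[ga].extend(members[gb])
        group_size[ga] += group_size[gb]
        for fn in members[gb]:
            group_of[fn] = ga
        del members[gb], group_size[gb]
    return list(members.values())


sizes = {"main": 40, "parse": 50, "emit": 60, "log": 20}          # bytes of code
edge_weights = {("main", "parse"): 100, ("parse", "emit"): 80, ("main", "log"): 5}
partitions = partition_call_graph(sizes, edge_weights, line_size=128)
print(partitions)
```

In this example `emit` stays in its own group because adding it to the `main`/`parse` group would exceed the assumed 128-byte line.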


    Rewriting Branch Instructions Using Branch Stubs
    8.
    Invention patent application
    Status: Granted

    Publication No.: US20110321002A1

    Publication date: 2011-12-29

    Application No.: US12823204

    Filing date: 2010-06-25

    IPC classification: G06F9/44, G06F9/45

    Abstract: Mechanisms are provided for rewriting branch instructions in a portion of code. The mechanisms receive a portion of source code having an original branch instruction. The mechanisms generate a branch stub for the original branch instruction. The branch stub stores information about the original branch instruction including an original target address of the original branch instruction. Moreover, the mechanisms rewrite the original branch instruction so that a target of the rewritten branch instruction references the branch stub. In addition, the mechanisms output compiled code including the rewritten branch instruction and the branch stub for execution by a computing device. The branch stub is utilized by the computing device at runtime to determine if execution of the rewritten branch instruction can be redirected directly to a target instruction corresponding to the original target address in an instruction cache of the computing device without intervention by an instruction cache runtime system.
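
The relationship between a rewritten branch and its stub can be modeled with two small objects. This is a toy model under stated assumptions: the class names, the set standing in for the instruction cache, and the `runtime_load` callback standing in for the instruction cache runtime system are all hypothetical.

```python
# Toy model of a branch rewritten to go through a stub; names are hypothetical.

class BranchStub:
    """Holds what the compiler knew about the original branch."""

    def __init__(self, original_target):
        self.original_target = original_target


class Branch:
    def __init__(self, target):
        self.target = target  # an address, or a BranchStub after rewriting


def rewrite_with_stub(branch):
    """Compile-time step: point the branch at a stub that saves its target."""
    stub = BranchStub(branch.target)
    branch.target = stub
    return stub


def resolve(branch, icache, runtime_load):
    """Runtime step: return (address, whether the runtime had to intervene)."""
    address = branch.target.original_target
    if address in icache:
        return address, False   # target already cached: jump directly
    runtime_load(address)       # miss: the runtime system brings the code in
    return address, True


icache = set()
loads = []

def runtime_load(address):
    loads.append(address)
    icache.add(address)

branch = Branch(0x400)
stub = rewrite_with_stub(branch)
first = resolve(branch, icache, runtime_load)   # target not yet cached
second = resolve(branch, icache, runtime_load)  # target now cached
print(first, second, loads)
```

The second call resolves without touching `runtime_load`, which corresponds to the abstract's case of redirecting execution without intervention by the runtime system.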


    Rewriting Branch Instructions Using Branch Stubs
    9.
    Invention patent application
    Status: Granted

    Publication No.: US20120204016A1

    Publication date: 2012-08-09

    Application No.: US13443188

    Filing date: 2012-04-10

    IPC classification: G06F9/318

    Abstract: Mechanisms are provided for rewriting branch instructions in a portion of code. The mechanisms receive a portion of source code having an original branch instruction. The mechanisms generate a branch stub for the original branch instruction. The branch stub stores information about the original branch instruction including an original target address of the original branch instruction. Moreover, the mechanisms rewrite the original branch instruction so that a target of the rewritten branch instruction references the branch stub. In addition, the mechanisms output compiled code including the rewritten branch instruction and the branch stub for execution by a computing device. The branch stub is utilized by the computing device at runtime to determine if execution of the rewritten branch instruction can be redirected directly to a target instruction corresponding to the original target address in an instruction cache of the computing device without intervention by an instruction cache runtime system.


    Dynamically Rewriting Branch Instructions in Response to Cache Line Eviction
    10.
    Invention patent application
    Status: Granted

    Publication No.: US20120198170A1

    Publication date: 2012-08-02

    Application No.: US13444890

    Filing date: 2012-04-12

    IPC classification: G06F12/08, G06F9/38

    Abstract: Mechanisms are provided for evicting cache lines from an instruction cache of the data processing system. The mechanisms store, for a portion of code in a current cache line, a linked list of call sites that directly or indirectly target the portion of code in the current cache line. A determination is made as to whether the current cache line is to be evicted from the instruction cache. The linked list of call sites is processed to identify one or more rewritten branch instructions having associated branch stubs that either directly or indirectly target the portion of code in the current cache line. In addition, the one or more rewritten branch instructions are rewritten to restore the one or more rewritten branch instructions to an original state based on information in the associated branch stubs.
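
The eviction step can be sketched as a walk over a per-line linked list of call sites. The model below is an assumption-heavy illustration: the `CallSite` fields, the tuple encoding of instructions, and the dictionary used as a stub are invented, and only the outline (walk the list, restore each rewritten branch from its stub) comes from the abstract.

```python
# Illustrative model of restoring rewritten branches on eviction; names are hypothetical.

class CallSite:
    def __init__(self, name, instruction, next_site=None):
        self.name = name
        self.instruction = instruction  # current branch instruction
        self.stub = None                # filled in when the branch is rewritten
        self.rewritten = False
        self.next_site = next_site      # next node in the cache line's list


def rewrite_direct(site, direct_target):
    """Point the branch straight at cached code, saving the original in a stub."""
    site.stub = {"original_instruction": site.instruction}
    site.instruction = ("branch", direct_target)
    site.rewritten = True


def evict_line(head):
    """Walk the line's call-site list and undo every rewritten branch."""
    restored = []
    site = head
    while site is not None:
        if site.rewritten:
            site.instruction = site.stub["original_instruction"]
            site.rewritten = False
            restored.append(site.name)
        site = site.next_site
    return restored


# Three call sites that target code in the same cache line: a -> b -> c.
site_c = CallSite("c", ("branch", "stub_c"))
site_b = CallSite("b", ("branch", "stub_b"), site_c)
site_a = CallSite("a", ("branch", "stub_a"), site_b)
rewrite_direct(site_a, 0x1000)
rewrite_direct(site_c, 0x1000)

restored = evict_line(site_a)
print(restored, site_a.instruction, site_c.instruction)
```

After eviction no branch still points at the address that used to hold the evicted code, so a later call goes back through its stub.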
