METHOD AND APPARATUS FOR EXPLOITING THREAD-LEVEL PARALLELISM
    1.
    Patent application
    METHOD AND APPARATUS FOR EXPLOITING THREAD-LEVEL PARALLELISM (status: in force)

    Publication number: US20080244549A1

    Publication date: 2008-10-02

    Application number: US11695012

    Filing date: 2007-03-31

    IPC classification: G06F9/45

    CPC classification: G06F8/456

    Abstract: According to one example embodiment, disclosed herein is the use of partial recurrence relaxation for parallelizing DOACROSS loops on multi-core computer architectures. By one example definition, a DOACROSS loop may be a loop whose successive iterations are allowed to overlap in execution; that is, the iterations are subject to a partial execution order. According to one embodiment, the inventive subject matter may be used to transform the dependence structure of a given loop with recurrences to obtain a maximal degree of thread-level parallelism (TLP), where the threads can be mapped either onto different logical processors (in a hyperthreaded processor) or onto different physical cores (or processors) in a multi-core processor.
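
To make the DOACROSS notion in this abstract concrete, here is a minimal Python sketch of a loop with a loop-carried dependence whose iterations overlap under post/wait synchronization. It illustrates only the general DOACROSS execution model, not the partial recurrence relaxation transformation claimed in the patent. The loop body `a[i] = a[i - dist] + i`, the dependence distance `dist`, and the function names `sequential` and `doacross` are assumptions chosen for the example.

```python
import threading


def sequential(n, dist=2):
    """Reference version: a[i] = a[i - dist] + i, run in program order."""
    a = list(range(n))
    for i in range(dist, n):
        a[i] = a[i - dist] + i
    return a


def doacross(n, dist=2, workers=4):
    """Run the same recurrence as a DOACROSS loop.

    Iteration i waits only for iteration i - dist, so the iterations
    follow a partial order and independent ones may overlap.
    """
    a = list(range(n))                          # a[0..dist-1] are the seeds
    done = [threading.Event() for _ in range(n)]
    for i in range(min(dist, n)):
        done[i].set()                           # seed elements are already available

    def worker(tid):
        # Cyclic distribution: thread tid runs iterations
        # dist + tid, dist + tid + workers, ...
        for i in range(dist + tid, n, workers):
            done[i - dist].wait()               # wait: source of the dependence
            a[i] = a[i - dist] + i
            done[i].set()                       # post: iteration i is complete

    threads = [threading.Thread(target=worker, args=(tid,))
               for tid in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a
```

Calling `doacross(50)` should produce the same list as `sequential(50)`. With a dependence distance of 2, the even and odd iteration chains are independent, which is the kind of overlap a DOACROSS schedule exploits. In CPython the global interpreter lock prevents a real speedup here, so the sketch demonstrates the ordering constraints only.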

    Method and apparatus for exploiting thread-level parallelism
    2.
    Granted patent
    Method and apparatus for exploiting thread-level parallelism (status: in force)

    Publication number: US07984431B2

    Publication date: 2011-07-19

    Application number: US11695012

    Filing date: 2007-03-31

    IPC classification: G06F9/45

    CPC classification: G06F8/456

    Abstract: According to one example embodiment, disclosed herein is the use of partial recurrence relaxation for parallelizing DOACROSS loops on multi-core computer architectures. By one example definition, a DOACROSS loop may be a loop whose successive iterations are allowed to overlap in execution; that is, the iterations are subject to a partial execution order. According to one embodiment, the inventive subject matter may be used to transform the dependence structure of a given loop with recurrences to obtain a maximal degree of thread-level parallelism (TLP), where the threads can be mapped either onto different logical processors (in a hyperthreaded processor) or onto different physical cores (or processors) in a multi-core processor.

    METHODS AND APPARATUS TO PROVIDE PARAMETERIZED OFFLOADING ON MULTIPROCESSOR ARCHITECTURES
    6.
    Patent application
    METHODS AND APPARATUS TO PROVIDE PARAMETERIZED OFFLOADING ON MULTIPROCESSOR ARCHITECTURES (status: under examination, published)

    Publication number: US20080163183A1

    Publication date: 2008-07-03

    Application number: US11618143

    Filing date: 2006-12-29

    IPC classification: G06F9/45

    CPC classification: G06F8/456, G06F2209/509

    Abstract: Methods and apparatus to provide parameterized offloading in multiprocessor systems are disclosed. An example method includes partitioning source code into a first task and a second task, and compiling object code from the source code such that the first task is compiled to execute on a first processor core and the second task is compiled to execute on a second processor core, with the assignment of the first task to the first core depending on an input parameter.
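
The idea of a parameter-dependent task assignment can be sketched at run time as follows. This is a hedged illustration only: the abstract describes a compile-time scheme, whereas this Python sketch makes the decision when the program runs. The two single-worker pools stand in for two processor cores, and the names `first_task`, `second_task`, `run_partitioned`, and `offload_threshold` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def first_task(data):
    """Hypothetical first partition of the program."""
    return sum(x * x for x in data)


def second_task(data):
    """Hypothetical second partition of the program."""
    return max(data)


def run_partitioned(data, offload_threshold):
    """Place first_task on 'core0' only if the input parameter allows it.

    The second task always runs on 'core1'. When the input is smaller than
    offload_threshold, the first task is not offloaded and shares 'core1'.
    """
    offload = len(data) >= offload_threshold
    # Two single-worker pools model two processor cores (an assumption).
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="core0") as core0, \
         ThreadPoolExecutor(max_workers=1, thread_name_prefix="core1") as core1:
        target = core0 if offload else core1
        first = target.submit(first_task, data)
        second = core1.submit(second_task, data)
        return {
            "first_task_core": "core0" if offload else "core1",
            "first": first.result(),
            "second": second.result(),
        }
```

For example, `run_partitioned([1, 2, 3], offload_threshold=2)` places the first task on `core0`, while a threshold of 10 keeps both tasks on `core1`. The computed results are the same in both cases; only the placement changes with the parameter.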

    Thread-data affinity optimization using compiler
    7.
    Patent application
    Thread-data affinity optimization using compiler (status: in force)

    Publication number: US20070079298A1

    Publication date: 2007-04-05

    Application number: US11242489

    Filing date: 2005-09-30

    IPC classification: G06F9/45

    CPC classification: G06F8/45

    Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache-coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.
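
The two-pass flow in this abstract can be sketched as a small profile-then-place routine. This is an assumption-laden illustration, not the patented method: the access trace format `(thread_id, page)`, the "most frequent accessor wins" policy, and the names `profile`, `place_pages`, and `thread_to_node` are all hypothetical.

```python
from collections import Counter, defaultdict


def profile(access_trace):
    """Pass 1: count how often each thread touches each page.

    access_trace is an iterable of (thread_id, page) pairs, standing in
    for the data a profiling tool would collect during execution.
    """
    counts = defaultdict(Counter)
    for thread_id, page in access_trace:
        counts[page][thread_id] += 1
    return counts


def place_pages(counts, thread_to_node):
    """Pass 2: put each page on the NUMA node of its most frequent accessor."""
    placement = {}
    for page, per_thread in counts.items():
        hot_thread, _ = per_thread.most_common(1)[0]
        placement[page] = thread_to_node[hot_thread]
    return placement
```

A usage sketch: given a trace in which thread 0 dominates accesses to page "A" and thread 1 dominates page "B", `place_pages(profile(trace), {0: 0, 1: 1})` maps "A" to node 0 and "B" to node 1, so each thread finds its hottest data in local memory.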

    Method, system, and program of a compiler to parallelize source code
    9.
    Granted patent
    Method, system, and program of a compiler to parallelize source code (status: in force)

    Publication number: US07882498B2

    Publication date: 2011-02-01

    Application number: US11278329

    Filing date: 2006-03-31

    IPC classification: G06F9/45

    CPC classification: G06F8/456, G06F8/314

    Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine the dependencies among the statements. Multiple groups of statements are determined from these dependencies, wherein the statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. The resulting threaded code is generated and includes the at least one inserted directive. The group of statements to which a directive in the resulting threaded code applies is processed as a separate task. Each group of statements designated by a directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.
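
The steps in this abstract (determine dependencies, group dependent statements, insert one directive per group) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: each statement is given as a hypothetical `(text, writes, reads)` triple with its variable sets already known, two statements count as dependent when one writes a variable the other reads or writes, and the directive string defaults to OpenMP's `#pragma omp task`, which the abstract does not name.

```python
def group_statements(stmts):
    """Group statements that depend on one another (union-find).

    stmts is a list of (text, writes, reads) triples, where writes and
    reads are sets of variable names. Returns sorted lists of indices.
    """
    parent = list(range(len(stmts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, (_, w_i, r_i) in enumerate(stmts):
        for j in range(i):
            _, w_j, r_j = stmts[j]
            # Flow, anti, or output dependence between statements j and i.
            if w_i & (w_j | r_j) or r_i & w_j:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(stmts)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def emit_threaded(stmts, directive="#pragma omp task"):
    """Emit source text with one directive in front of each group."""
    lines = []
    for group in group_statements(stmts):
        lines.append(directive)
        lines.append("{")
        lines.extend("    " + stmts[i][0] for i in group)
        lines.append("}")
    return "\n".join(lines)
```

For four statements such as `a = 1;`, `b = 2;`, `c = a + 1;`, and `d = b * 2;`, the grouping pairs the first with the third and the second with the fourth, and `emit_threaded` wraps each pair in its own directive block so the two groups could run as separate tasks.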

    Thread-data affinity optimization using compiler
    10.
    Granted patent
    Thread-data affinity optimization using compiler (status: in force)

    Publication number: US08037465B2

    Publication date: 2011-10-11

    Application number: US11242489

    Filing date: 2005-09-30

    IPC classification: G06F9/44, G06F9/45

    CPC classification: G06F8/45

    Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache-coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.
