COMPILER OPTIMIZATION FOR MANY INTEGRATED CORE PROCESSORS
    1.
    Patent application
    COMPILER OPTIMIZATION FOR MANY INTEGRATED CORE PROCESSORS (legal status: in force)

    Publication number: US20150277877A1

    Publication date: 2015-10-01

    Application number: US14667819

    Filing date: 2015-03-25

    CPC classification number: G06F8/443 G06F8/433 G06F8/51

    Abstract: Systems and methods for source-to-source transformation for compiler optimization for many integrated core (MIC) coprocessors, including identifying data dependencies in candidate loops and data elements used in each iteration for arrays, profiling candidate loops to find a proper number m, wherein data transfer and computation for m iterations take an equal amount of time, and creating an outer loop outside the candidate loop, with each iteration of the outer loop executing m iterations of the candidate loop. Data streaming is performed by determining optimum buffer size for one or more arrays and inserting code before the outer loop to create optimum sized buffers, overlapping data transfer between central processing units (CPUs) and MICs with the computation; reusing buffers to reduce memory employed on the MICs, and reusing threads on MICs to repeatedly launch kernels on the MICs for asynchronous data transfer.
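
    The transformation described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the linear cost model (a fixed transfer latency plus per-iteration transfer and compute costs), the function names `pick_block_size` and `streamed_map`, and the use of plain Python lists in place of coprocessor buffers are all assumptions made for illustration. The sketch picks a block size m at which computing m iterations takes at least as long as transferring their data, then wraps the candidate loop in an outer loop that processes m iterations at a time while alternating between two reusable buffers.

```python
def pick_block_size(n, latency, transfer_per_iter, compute_per_iter):
    """Return the smallest m in [1, n] whose compute time covers its transfer time.

    Hypothetical cost model (an assumption, not taken from the patent):
        transfer(m) = latency + m * transfer_per_iter
        compute(m)  = m * compute_per_iter
    """
    for m in range(1, n + 1):
        if m * compute_per_iter >= latency + m * transfer_per_iter:
            return m
    return n  # transfer always dominates, so fall back to one block


def streamed_map(data, m, kernel):
    """Apply kernel to every element, processing m iterations per outer iteration."""
    n = len(data)
    # Two fixed-size buffers, allocated once before the outer loop and reused.
    buffers = [[None] * m, [None] * m]
    counts = [0, 0]
    out = []

    def transfer(block, buf):
        # Stand-in for a CPU-to-coprocessor copy; a real system would issue
        # this asynchronously so that it overlaps with the computation.
        start = block * m
        count = min(m, n - start)
        buf[:count] = data[start:start + count]
        return count

    num_blocks = (n + m - 1) // m
    if num_blocks:
        counts[0] = transfer(0, buffers[0])  # preload the first block

    for block in range(num_blocks):  # outer loop created around the candidate loop
        cur = block % 2
        nxt = 1 - cur
        if block + 1 < num_blocks:
            # Fetch the next block into the idle buffer.
            counts[nxt] = transfer(block + 1, buffers[nxt])
        for i in range(counts[cur]):  # m iterations of the candidate loop
            out.append(kernel(buffers[cur][i]))
    return out
```

    In this sketch the transfer and the computation run one after the other, so only the buffer rotation is demonstrated, not an actual speedup. The overlap described in the abstract requires an asynchronous transfer mechanism on the target platform.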


    Compiler optimization for many integrated core processors
    2.
    Granted patent
    Compiler optimization for many integrated core processors (legal status: in force)

    Publication number: US09471289B2

    Publication date: 2016-10-18

    Application number: US14667819

    Filing date: 2015-03-25

    CPC classification number: G06F8/443 G06F8/433 G06F8/51

    Abstract: Systems and methods for source-to-source transformation for compiler optimization for many integrated core (MIC) coprocessors, including identifying data dependencies in candidate loops and data elements used in each iteration for arrays, profiling candidate loops to find a proper number m, wherein data transfer and computation for m iterations take an equal amount of time, and creating an outer loop outside the candidate loop, with each iteration of the outer loop executing m iterations of the candidate loop. Data streaming is performed by determining optimum buffer size for one or more arrays and inserting code before the outer loop to create optimum sized buffers, overlapping data transfer between central processing units (CPUs) and MICs with the computation; reusing buffers to reduce memory employed on the MICs, and reusing threads on MICs to repeatedly launch kernels on the MICs for asynchronous data transfer.

