Patent search ap:("NEC Laboratories America Page Inc.") AND inv:"Linhai Song"

1.

Invention application
COMPILER OPTIMIZATION FOR MANY INTEGRATED CORE PROCESSORS (in force)

Publication number: US20150277877A1

Publication date: 2015-10-01

Application number: US14667819

Application date: 2015-03-25

Applicant: NEC Laboratories America, Inc.

Inventor: Min Feng, Srimat Chakradhar, Linhai Song

IPC: G06F9/45

CPC classification number: G06F8/443, G06F8/433, G06F8/51

Abstract: Systems and methods for source-to-source transformation for compiler optimization for many integrated core (MIC) coprocessors, including identifying data dependencies in candidate loops and data elements used in each iteration for arrays, profiling candidate loops to find a proper number m, wherein data transfer and computation for m iterations take an equal amount of time, and creating an outer loop outside the candidate loop, with each iteration of the outer loop executing m iterations of the candidate loop. Data streaming is performed by determining optimum buffer size for one or more arrays and inserting code before the outer loop to create optimum sized buffers, overlapping data transfer between central processing units (CPUs) and MICs with the computation; reusing buffers to reduce memory employed on the MICs, and reusing threads on MICs to repeatedly launch kernels on the MICs for asynchronous data transfer.
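The blocking and streaming steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The linear cost model in `pick_block_size`, the function names, and the use of a Python worker thread as a stand-in for an asynchronous CPU-to-MIC copy are all assumptions made for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def pick_block_size(latency, transfer_per_iter, compute_per_iter, n):
    """Pick m so that transferring and computing m iterations take about equal time.

    Assumed cost model (hypothetical, not from the patent):
      transfer(m) = latency + m * transfer_per_iter
      compute(m)  = m * compute_per_iter
    """
    if compute_per_iter <= transfer_per_iter:
        return n  # under this model the transfer is never hidden; use one block
    m = math.ceil(latency / (compute_per_iter - transfer_per_iter))
    return max(1, min(m, n))


def streamed_loop(data, m, body):
    """Apply body to each element in blocks of m, prefetching the next block."""
    n = len(data)
    buffers = [[None] * m, [None] * m]  # two fixed-size buffers, reused for all blocks
    results = []

    def transfer(start, buf):
        # Stand-in for an asynchronous host-to-coprocessor copy.
        count = min(m, n - start)
        buf[:count] = data[start:start + count]
        return count

    # A single worker thread is created once and reused for every transfer.
    with ThreadPoolExecutor(max_workers=1) as copier:
        pending = copier.submit(transfer, 0, buffers[0]) if n else None
        # Outer loop: each iteration executes up to m iterations of the candidate loop.
        for block, start in enumerate(range(0, n, m)):
            count = pending.result()  # wait until the current block has arrived
            current = buffers[block % 2]
            next_start = start + m
            if next_start < n:
                # Start copying the next block while this one is being computed.
                pending = copier.submit(transfer, next_start, buffers[(block + 1) % 2])
            results.extend(body(x) for x in current[:count])  # inner candidate loop
    return results
```

For example, `streamed_loop(list(range(10)), 4, lambda x: x * x)` processes the input in blocks of 4, 4, and 2 elements and returns the squares in their original order. Two buffers suffice because a buffer is only overwritten after the block it held has been fully computed.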

2.

Granted invention patent
Compiler optimization for many integrated core processors (in force)

Publication number: US09471289B2

Publication date: 2016-10-18

Application number: US14667819

Application date: 2015-03-25

Applicant: NEC Laboratories America, Inc.

Inventor: Min Feng, Srimat Chakradhar, Linhai Song

IPC: G06F9/45

CPC classification number: G06F8/443, G06F8/433, G06F8/51

Abstract: Systems and methods for source-to-source transformation for compiler optimization for many integrated core (MIC) coprocessors, including identifying data dependencies in candidate loops and data elements used in each iteration for arrays, profiling candidate loops to find a proper number m, wherein data transfer and computation for m iterations take an equal amount of time, and creating an outer loop outside the candidate loop, with each iteration of the outer loop executing m iterations of the candidate loop. Data streaming is performed by determining optimum buffer size for one or more arrays and inserting code before the outer loop to create optimum sized buffers, overlapping data transfer between central processing units (CPUs) and MICs with the computation; reusing buffers to reduce memory employed on the MICs, and reusing threads on MICs to repeatedly launch kernels on the MICs for asynchronous data transfer.
