Patent search: ap:("Xinmin Tian" OR "Milind Girkar" OR "David C. Sehr" OR "Richard Grove" OR "Wei Li" OR "Hong Wang" OR "Chris Newburn" OR "Perry Wang" OR "John Shen") AND inv:"Wei Li" (page 1)

1.

Granted patent
Thread-data affinity optimization using compiler (status: in force)

Publication number: US08037465B2

Publication date: 2011-10-11

Application number: US11242489

Filing date: 2005-09-30

Applicants: Xinmin Tian, Milind Girkar, David C. Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen

Inventors: Xinmin Tian, Milind Girkar, David C. Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen

IPC classification: G06F9/44, G06F9/45

CPC classification: G06F8/45

Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.

2.

Patent application
Thread-data affinity optimization using compiler (status: in force)

Publication number: US20070079298A1

Publication date: 2007-04-05

Application number: US11242489

Filing date: 2005-09-30

Applicants: Xinmin Tian, Milind Girkar, David Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen

Inventors: Xinmin Tian, Milind Girkar, David Sehr, Richard Grove, Wei Li, Hong Wang, Chris Newburn, Perry Wang, John Shen

IPC classification: G06F9/45

CPC classification: G06F8/45

Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.

3.

Patent application
System, method and apparatus for dependency chain processing (status: in force)

Publication number: US20060070047A1

Publication date: 2006-03-30

Application number: US10950693

Filing date: 2004-09-28

Applicants: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

Inventors: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

IPC classification: G06F9/45

CPC classification: G06F8/443, G06F8/433, G06F8/451

Abstract: Embodiments of the present invention provide a method, apparatus and system which may include splitting a dependency chain into a set of reduced-width dependency chains; mapping one or more dependency chains onto one or more clustered dependency chain processors, wherein an issue-width of one or more of the clusters is adapted to accommodate a size of the dependency chains; and/or processing in parallel a plurality of dependency chains of a trace. Other embodiments are described and claimed.

4.

Granted patent
System, method and apparatus for dependency chain processing (status: in force)

Publication number: US07603546B2

Publication date: 2009-10-13

Application number: US10950693

Filing date: 2004-09-28

Applicants: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

Inventors: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

IPC classification: G06F9/00, G06F9/24, G06F15/177

CPC classification: G06F8/443, G06F8/433, G06F8/451

Abstract: Embodiments of the present invention provide a method, apparatus and system which may include splitting a dependency chain into a set of reduced-width dependency chains; mapping one or more dependency chains onto one or more clustered dependency chain processors, wherein an issue-width of one or more of the clusters is adapted to accommodate a size of the dependency chains; and/or processing in parallel a plurality of dependency chains of a trace. Other embodiments are described and claimed.

5.

Granted patent
Fast lock-free post-wait synchronization for exploiting parallelism on multi-core processors (status: lapsed)

Publication number: US07571301B2

Publication date: 2009-08-04

Application number: US11395841

Filing date: 2006-03-31

Applicants: Arun Kejariwal, Hideki Saito, Xinmin Tian, Milind Girkar, Sanjiv Shah, Wei Li, Utpal Banerjee

Inventors: Arun Kejariwal, Hideki Saito, Xinmin Tian, Milind Girkar, Sanjiv Shah, Wei Li, Utpal Banerjee

IPC classification: G06F9/45, G06F9/52

CPC classification: G06F9/3009, G06F8/458, G06F9/30087, G06F9/3836, G06F9/3838, G06F9/3851, G06F9/3855, G06F9/3857, G06F9/3891

Abstract: A method for improving parallel processing of computer programs. DOACROSS loops and similar code are identified and parallelized using a post-wait control structure. The post-wait control structure may be implemented to include any one of a single counter to enforce an order of execution, an array to track code completion that is indexed by a modulus of a positive integer number, and/or a set of arrays to track a last code completed by a thread and a current code being executed by a thread.
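
Illustration (not part of the patent record): the single-counter variant mentioned in the abstract can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the patented method: the names run_doacross, done, and worker are hypothetical, the loop body a[i] = a[i-1] + i is an invented example of a cross-iteration dependence, and the sketch relies on CPython's global interpreter lock rather than on real lock-free hardware primitives or memory fences.

```python
import threading
import time


def run_doacross(n_iters, n_threads):
    """Run a loop with a cross-iteration dependence a[i] = a[i-1] + i.

    Hypothetical illustration of post/wait with one shared counter.
    """
    a = [0] * (n_iters + 1)
    done = [0]  # shared counter: index of the last iteration that has posted

    def worker(tid):
        # Iterations are dealt round-robin to the threads.
        for i in range(1 + tid, n_iters + 1, n_threads):
            # wait: spin until iteration i-1 has posted its completion
            while done[0] < i - 1:
                time.sleep(0)  # yield so other threads can make progress
            a[i] = a[i - 1] + i  # the dependent part of the iteration
            done[0] = i  # post: signal that iteration i is complete

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a
```

Because an iteration only proceeds once its predecessor has posted, at most one thread writes the counter at a time, which is why no lock is taken around it in this sketch.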

6.

Patent application
Fast lock-free post-wait synchronization for exploiting parallelism on multi-core processors (status: lapsed)

Publication number: US20070234326A1

Publication date: 2007-10-04

Application number: US11395841

Filing date: 2006-03-31

Applicants: Arun Kejariwal, Hideki Saito, Xinmin Tian, Milind Girkar, Sanjiv Shah, Wei Li, Utpal Banerjee

Inventors: Arun Kejariwal, Hideki Saito, Xinmin Tian, Milind Girkar, Sanjiv Shah, Wei Li, Utpal Banerjee

IPC classification: G06F9/45

CPC classification: G06F9/3009, G06F8/458, G06F9/30087, G06F9/3836, G06F9/3838, G06F9/3851, G06F9/3855, G06F9/3857, G06F9/3891

Abstract: A method for improving parallel processing of computer programs. DOACROSS loops and similar code are identified and parallelized using a post-wait control structure. The post-wait control structure may be implemented to include any one of a single counter to enforce an order of execution, an array to track code completion that is indexed by a modulus of a positive integer number, and/or a set of arrays to track a last code completed by a thread and a current code being executed by a thread.

7.

Granted patent
Method, system, and program of a compiler to parallelize source code (status: in force)

Publication number: US07882498B2

Publication date: 2011-02-01

Application number: US11278329

Filing date: 2006-03-31

Applicants: Guilherme D. Ottoni, Xinmin Tian, Hong Wang, Richard A. Hankins, Wei Li, John Shen

Inventors: Guilherme D. Ottoni, Xinmin Tian, Hong Wang, Richard A. Hankins, Wei Li, John Shen

IPC classification: G06F9/45

CPC classification: G06F8/456, G06F8/314

Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine a dependency of the statements. Multiple groups of statements are determined from the determined dependency of the statements, wherein statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. Resulting threaded code is generated including the inserted at least one directive. The group of statements to which the directive in the resulting threaded code applies are processed as a separate task. Each group of statements designated by the directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.
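
Illustration (not part of the patent record): one plausible reading of the grouping step in the abstract is that statements linked by dependencies end up in the same group, and each group can then be marked as a separate task. The Python sketch below makes that assumption explicit by treating groups as connected components of the dependency graph, computed with a union-find structure. The function name group_statements and the representation of dependencies as index pairs are hypothetical; the patent may define groups differently, and directive insertion is not modeled here.

```python
def group_statements(n_statements, dependencies):
    """Group statement indices so that dependent statements share a group.

    dependencies is a list of (a, b) pairs meaning statements a and b
    depend on one another in some direction (hypothetical encoding).
    """
    parent = list(range(n_statements))

    def find(x):
        # Follow parent links to the group representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in dependencies:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Merge the two groups, keeping the smaller index as representative.
            parent[max(ra, rb)] = min(ra, rb)

    groups = {}
    for s in range(n_statements):
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())
```

For example, with five statements and dependencies (0, 1), (1, 3), and (2, 4), the sketch yields the groups [0, 1, 3] and [2, 4], which could then run as two concurrent tasks.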

8.

Patent application
METHOD, SYSTEM, AND PROGRAM OF A COMPILER TO PARALLELIZE SOURCE CODE (status: in force)

Publication number: US20070234276A1

Publication date: 2007-10-04

Application number: US11278329

Filing date: 2006-03-31

Applicants: Guilherme Ottoni, Xinmin Tian, Hong Wang, Richard Hankins, Wei Li, John Shen

Inventors: Guilherme Ottoni, Xinmin Tian, Hong Wang, Richard Hankins, Wei Li, John Shen

IPC classification: G06F9/44

CPC classification: G06F8/456, G06F8/314

Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine a dependency of the statements. Multiple groups of statements are determined from the determined dependency of the statements, wherein statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. Resulting threaded code is generated including the inserted at least one directive. The group of statements to which the directive in the resulting threaded code applies are processed as a separate task. Each group of statements designated by the directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.

9.

Granted patent
Apparatus to implement mesocode (status: in force)

Publication number: US07260705B2

Publication date: 2007-08-21

Application number: US10608316

Filing date: 2003-06-26

Applicants: Hong Wang, John Shen, Perry Wang, Marsha Eng, Gerolf F. Hoflehner, Dan Lavery, Wei Li, Alejandro Ramirez, Ed Grochowski

Inventors: Hong Wang, John Shen, Perry Wang, Marsha Eng, Gerolf F. Hoflehner, Dan Lavery, Wei Li, Alejandro Ramirez, Ed Grochowski

IPC classification: G06F9/30

CPC classification: G06F9/3853, G06F8/447, G06F9/30181, G06F9/30196, G06F9/3808, G06F9/3822, G06F9/3836, G06F9/3844

Abstract: In one embodiment, the invention provides a method for examining information about branch instructions. A method, comprising: examining information about branch instructions that reach a write-back stage of processing within a processor, defining a plurality of streams based on the examining, wherein each stream comprises a sequence of basic blocks in which only a last block in the sequence ends in a branch instruction, the execution of which causes program flow to branch, the remaining basic blocks in the stream each ending in a branch instruction, the execution of which does not cause program flow to branch.
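
Illustration (not part of the patent record): the stream definition in the abstract, a run of basic blocks in which only the last block ends in a taken branch, can be sketched in software as a single pass over an execution trace. The Python sketch below assumes a hypothetical trace format of (block_id, branch_taken) pairs; the patent describes hardware that examines branches at the write-back stage, which this does not model.

```python
def split_into_streams(trace):
    """Split a trace of (block_id, branch_taken) pairs into streams.

    A stream is closed by the first block whose ending branch is taken.
    Returns (streams, leftover), where leftover holds trailing blocks
    that have not yet been closed by a taken branch.
    """
    streams = []
    current = []
    for block_id, branch_taken in trace:
        current.append(block_id)
        if branch_taken:
            # A taken branch redirects program flow, so the stream ends here.
            streams.append(current)
            current = []
    return streams, current
```

For a trace A (not taken), B (taken), C (taken), D (not taken), the sketch yields the streams [A, B] and [C], with D left over as the start of an unfinished stream.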

10.

Patent application
METHODS AND APPARATUS TO PROVIDE PARAMETERIZED OFFLOADING ON MULTIPROCESSOR ARCHITECTURES (status: under examination, published)

Publication number: US20080163183A1

Publication date: 2008-07-03

Application number: US11618143

Filing date: 2006-12-29

Applicants: Zhiyuan Li, Xinmin Tian, Wei Li, Hong Wang

Inventors: Zhiyuan Li, Xinmin Tian, Wei Li, Hong Wang

IPC classification: G06F9/45

CPC classification: G06F8/456, G06F2209/509

Abstract: Methods and apparatus to provide parameterized offloading in multiprocessor systems are disclosed. An example method includes partitioning source code into a first task and a second task, and compiling object code from the source code, such that the first task is compiled to execute on a first processor core and the second task is compiled to execute on a second processor core, the assignment of the first task to the first core being dependent on an input parameter.
