Patent search: ap:("Xinmin Tian" OR "Shih-Wei Liao" OR "Hong Wang" OR "Milind Girkar" OR "John Shen" OR "Perry Wang" OR "Grant Haab" OR "Gerolf Hoflehner" OR "Daniel Lavery" OR "Hideki Saito" OR "Sanjiv Shah" OR "Dongkeun Kim") AND inv:"Gerolf Hoflehner" (page 1)

1.

Granted patent
Methods and apparatus for reducing memory latency in a software application
Legal status: In force

Publication number: US07328433B2

Publication date: 2008-02-05

Application number: US10677414

Filing date: 2003-10-02

Applicants: Xinmin Tian, Shih-wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim

Inventors: Xinmin Tian, Shih-wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim

IPC classification: G06F9/44

CPC classification: G06F9/3851, G06F8/4442, G06F9/383, G06F9/4843, G06F9/52

Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and identifies areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.
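The counter-based coordination described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the function name `run_with_helper`, the parameters `max_lead` and `cache_size`, and the use of an `OrderedDict` as a stand-in for a hardware cache are all hypothetical. The helper thread and the main thread each keep a counter, and the helper stalls whenever it is `max_lead` items ahead, so that "prefetched" entries are less likely to be evicted before the main thread reads them.

```python
import threading
from collections import OrderedDict


def run_with_helper(data, indices, max_lead=4, cache_size=8):
    """Sum data[i] for i in indices while a helper thread 'prefetches' ahead.

    Hypothetical simulation: `cache` stands in for a small hardware cache,
    and the two counters throttle how far the helper may run ahead.
    """
    cache = OrderedDict()
    cond = threading.Condition()
    counters = {"main": 0, "helper": 0}
    stats = {"hits": 0, "max_lead_seen": 0}

    def helper():
        for i in indices:
            with cond:
                # Stall while the helper is already max_lead items ahead.
                cond.wait_for(
                    lambda: counters["helper"] - counters["main"] < max_lead
                )
                cache[i] = data[i]              # simulated prefetch
                cache.move_to_end(i)
                while len(cache) > cache_size:
                    cache.popitem(last=False)   # evict the oldest entry
                counters["helper"] += 1
                lead = counters["helper"] - counters["main"]
                stats["max_lead_seen"] = max(stats["max_lead_seen"], lead)
                cond.notify_all()

    worker = threading.Thread(target=helper)
    worker.start()

    total = 0
    for i in indices:
        with cond:
            if i in cache:
                stats["hits"] += 1
                value = cache[i]
            else:
                value = data[i]                 # simulated cache miss
            total += value
            counters["main"] += 1
            cond.notify_all()                   # let a stalled helper resume

    worker.join()
    return total, stats


if __name__ == "__main__":
    values = list(range(100))
    order = [(7 * k) % 100 for k in range(50)]
    print(run_with_helper(values, order))
```

The hit count varies from run to run because thread interleaving is not deterministic. The sum and the bound on the helper's lead do not vary. A helper that falls behind the main thread keeps prefetching stale items in this sketch; a real scheme would presumably skip ahead.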

2.

Patent application
Methods and apparatus for reducing memory latency in a software application
Legal status: In force

Publication number: US20050086652A1

Publication date: 2005-04-21

Application number: US10677414

Filing date: 2003-10-02

Applicants: Xinmin Tian, Shih-Wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim

Inventors: Xinmin Tian, Shih-Wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim

IPC classification: G06F9/38, G06F9/45, G06F9/46, G06F9/48

CPC classification: G06F9/3851, G06F8/4442, G06F9/383, G06F9/4843, G06F9/52

Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and identifies areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.

3.

Patent application
Methods and apparatuses for compiler-creating helper threads for multi-threading
Legal status: Under examination (published)

Publication number: US20050071438A1

Publication date: 2005-03-31

Application number: US10676889

Filing date: 2003-09-30

Applicants: Shih-Wei Liao, Xinmin Tian, Gerolf Hoflehner, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

Inventors: Shih-Wei Liao, Xinmin Tian, Gerolf Hoflehner, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

IPC classification: G06F9/38, G06F9/45, G06F15/167

CPC classification: G06F9/3842, G06F8/4442, G06F9/383, G06F9/3851

Abstract: Methods and apparatuses for compiler-created helper thread for multi-threading are described herein. In one embodiment, exemplary process includes identifying a region of a main thread that likely has one or more delinquent loads, the one or more delinquent loads representing loads which likely suffer cache misses during an execution of the main thread, analyzing the region for one or more helper threads with respect to the main thread, and generating code for the one or more helper threads, the one or more helper threads being speculatively executed in parallel with the main thread to perform one or more tasks for the region of the main thread. Other methods and apparatuses are also described.

4.

Patent application
Methods and apparatuses for thread management of mult-threading
Legal status: Under examination (published)

Publication number: US20050071841A1

Publication date: 2005-03-31

Application number: US10676581

Filing date: 2003-09-30

Applicants: Gerolf Hoflehner, Shih-Wei Liao, Xinmin Tian, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

Inventors: Gerolf Hoflehner, Shih-Wei Liao, Xinmin Tian, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

IPC classification: G06F9/45, G06F9/46

CPC classification: G06F8/441

Abstract: Methods and apparatuses for thread management for multi-threading are described herein. In one embodiment, exemplary process includes selecting, during a compilation of code having one or more threads executable in a data processing system, a current thread having a most bottom order, determining resources allocated to one or more child threads spawned from the current thread, and allocating resources for the current thread in consideration of the resources allocated to the current thread's one or more child threads to avoid resource conflicts between the current thread and its one or more child threads. Other methods and apparatuses are also described.
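The bottom-up order described in this abstract can be sketched as a post-order walk over a thread tree. The sketch below rests on assumptions not found in the listing: the resources are modelled as numbered registers, and the names `allocate_registers`, `children`, and `demand` are hypothetical. Each thread is handled only after its child threads, and it picks registers that none of its descendants use. Sibling threads may share registers in this sketch, since the abstract only mentions conflicts between a thread and its children.

```python
def allocate_registers(children, demand, root, num_registers):
    """Assign registers bottom-up so no thread conflicts with its descendants.

    children:      {thread: [child threads it spawns]}   (hypothetical model)
    demand:        {thread: number of registers it needs}
    num_registers: size of the register file
    Returns {thread: set of register numbers}.
    """
    allocation = {}
    in_use = {}  # thread -> registers used by the thread and its descendants

    def visit(thread):
        taken = set()
        for child in children.get(thread, []):
            visit(child)                 # bottom-most threads are handled first
            taken |= in_use[child]
        free = [r for r in range(num_registers) if r not in taken]
        if len(free) < demand[thread]:
            raise ValueError(f"not enough registers for {thread}")
        allocation[thread] = set(free[:demand[thread]])
        in_use[thread] = taken | allocation[thread]

    visit(root)
    return allocation


if __name__ == "__main__":
    tree = {"main": ["h1", "h2"], "h1": ["h1a"]}
    need = {"main": 3, "h1": 2, "h2": 2, "h1a": 1}
    print(allocate_registers(tree, need, "main", num_registers=8))
```

In the usage example, `h1a` is allocated first, then `h1` around it, then `h2`, and `main` last, so `main` receives registers that no spawned thread touches.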

5.

Patent application
Methods and apparatuses for thread management of multi-threading
Legal status: Lapsed

Publication number: US20050081207A1

Publication date: 2005-04-14

Application number: US10779193

Filing date: 2004-02-13

Applicants: Gerolf Hoflehner, Shih-wei Liao, Xinmin Tian, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

Inventors: Gerolf Hoflehner, Shih-wei Liao, Xinmin Tian, Hong Wang, Daniel Lavery, Perry Wang, Dongkeun Kim, Milind Girkar, John Shen

IPC classification: G06F9/45, G06F9/46

CPC classification: G06F8/441

Abstract: Methods and apparatuses for thread management for multi-threading are described herein. In one embodiment, exemplary process includes selecting, during a compilation of code having one or more threads executable in a data processing system, a current thread having a most bottom order, determining resources allocated to one or more child threads spawned from the current thread, and allocating resources for the current thread in consideration of the resources allocated to the current thread's one or more child threads to avoid resource conflicts between the current thread and its one or more child threads. Other methods and apparatuses are also described.

6.

Patent application
System, method and apparatus for dependency chain processing
Legal status: In force

Publication number: US20060070047A1

Publication date: 2006-03-30

Application number: US10950693

Filing date: 2004-09-28

Applicants: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

Inventors: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

IPC classification: G06F9/45

CPC classification: G06F8/443, G06F8/433, G06F8/451

Abstract: Embodiments of the present invention provide a method, apparatus and system which may include splitting a dependency chain into a set of reduced-width dependency chains; mapping one or more dependency chains onto one or more clustered dependency chain processors, wherein an issue-width of one or more of the clusters is adapted to accommodate a size of the dependency chains; and/or processing in parallel a plurality of dependency chains of a trace. Other embodiments are described and claimed.

7.

Granted patent
System, method and apparatus for dependency chain processing
Legal status: In force

Publication number: US07603546B2

Publication date: 2009-10-13

Application number: US10950693

Filing date: 2004-09-28

Applicants: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

Inventors: Satish Narayanasamy, Hong Wang, John Shen, Roni Rosner, Yoav Almog, Naftali Schwartz, Gerolf Hoflehner, Daniel LaVery, Wei Li, Xinmin Tian, Milind Girkar, Perry Wang

IPC classification: G06F9/00, G06F9/24, G06F15/177

CPC classification: G06F8/443, G06F8/433, G06F8/451

Abstract: Embodiments of the present invention provide a method, apparatus and system which may include splitting a dependency chain into a set of reduced-width dependency chains; mapping one or more dependency chains onto one or more clustered dependency chain processors, wherein an issue-width of one or more of the clusters is adapted to accommodate a size of the dependency chains; and/or processing in parallel a plurality of dependency chains of a trace. Other embodiments are described and claimed.

8.

Patent application
Resource-aware scheduling for compilers
Legal status: Lapsed

Publication number: US20050216899A1

Publication date: 2005-09-29

Application number: US10809716

Filing date: 2004-03-24

Applicants: Kalyan Muthukumar, Daniel Lavery, Gerolf Hoflehner, Chu-Cheow Lim, Jean-Francois Collard

Inventors: Kalyan Muthukumar, Daniel Lavery, Gerolf Hoflehner, Chu-Cheow Lim, Jean-Francois Collard

IPC classification: G06F9/45

CPC classification: G06F8/445

Abstract: Disclosed are embodiments of a compiler, methods, and system for resource-aware scheduling of instructions. A list scheduling approach is augmented to take into account resource constraints when determining priority for scheduling of instructions. Other embodiments are also described and claimed.
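A list scheduler whose priority takes resources into account might look like the sketch below. The priority function here is an assumption made for illustration, not the one claimed in the application: instructions are ranked first by critical-path height, then by a hypothetical "pressure" value (the number of instructions competing for each issue slot of their functional unit). The names `list_schedule`, `units`, and `width` are also hypothetical. The sketch assumes an acyclic dependence graph, latencies of at least one cycle, and at least one slot per unit.

```python
def list_schedule(instrs, deps, units, width):
    """Greedy cycle-by-cycle list scheduling with a resource-aware priority.

    instrs: {name: (unit, latency)}
    deps:   {name: [predecessor names]}
    units:  {unit: issue slots per cycle}
    width:  total instructions that may issue per cycle
    Returns {name: issue cycle}.
    """
    succs = {n: [] for n in instrs}
    for n, preds in deps.items():
        for p in preds:
            succs[p].append(n)

    height = {}

    def crit(n):
        # Longest latency path from n to the end of the block.
        if n not in height:
            height[n] = instrs[n][1] + max((crit(s) for s in succs[n]), default=0)
        return height[n]

    # Resource pressure: instructions competing for each slot of a unit.
    demand = {}
    for unit, _latency in instrs.values():
        demand[unit] = demand.get(unit, 0) + 1
    pressure = {u: demand.get(u, 0) / units[u] for u in units}

    def priority(n):
        return (crit(n), pressure[instrs[n][0]], n)

    issue, cycle = {}, 0
    while len(issue) < len(instrs):
        ready = [
            n for n in instrs
            if n not in issue
            and all(p in issue and issue[p] + instrs[p][1] <= cycle
                    for p in deps.get(n, []))
        ]
        free, slots = dict(units), width
        for n in sorted(ready, key=priority, reverse=True):
            unit = instrs[n][0]
            if slots > 0 and free[unit] > 0:
                free[unit] -= 1
                slots -= 1
                issue[n] = cycle
        cycle += 1
    return issue


if __name__ == "__main__":
    block = {"ld1": ("mem", 2), "ld2": ("mem", 2), "add": ("alu", 1),
             "mul": ("alu", 1), "st": ("mem", 1)}
    edges = {"add": ["ld1", "ld2"], "mul": ["add"], "st": ["mul"]}
    print(list_schedule(block, edges, {"mem": 1, "alu": 1}, width=2))
```

With a single memory slot per cycle, the two loads in the usage example cannot issue together, so the dependent `add` waits until the later load completes. The pressure term only changes the order when `width` is the limiting factor and ready instructions of equal height target different units.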

9.

Patent application
Methods and apparatus for dynamic register scratching
Legal status: In force

Publication number: US20070234012A1

Publication date: 2007-10-04

Application number: US11395373

Filing date: 2006-03-31

Applicants: Gerolf Hoflehner, Mark Davis

Inventors: Gerolf Hoflehner, Mark Davis

IPC classification: G06F9/30

CPC classification: G06F9/3004, G06F8/441, G06F9/30134, G06F9/30145, G06F9/384

Abstract: Apparatus and methods of reducing dynamic memory stack by a register stack engine are disclosed. An example apparatus and method identifies a local parameter of a caller function. A scratch register corresponding to the local parameter is moved to the top of a register stack, and a local parameter of a callee function is assigned to the scratch register.

10.

Granted patent
Methods and apparatus for dynamic register scratching
Legal status: In force

Publication number: US07647482B2

Publication date: 2010-01-12

Application number: US11395373

Filing date: 2006-03-31

Applicants: Gerolf Hoflehner, Mark Davis

Inventors: Gerolf Hoflehner, Mark Davis

IPC classification: G06F12/00

CPC classification: G06F9/3004, G06F8/441, G06F9/30134, G06F9/30145, G06F9/384

Abstract: Apparatus and methods of reducing dynamic memory stack by a register stack engine are disclosed. An example apparatus and method identifies a local parameter of a caller function. A scratch register corresponding to the local parameter is moved to the top of a register stack, and a local parameter of a callee function is assigned to the scratch register.
