Methods and apparatus for reducing memory latency in a software application
    31.
    Type: granted invention patent
    Status: in force

    Publication No.: US07328433B2

    Publication date: 2008-02-05

    Application No.: US10677414

    Filing date: 2003-10-02

    IPC classification: G06F9/44

    Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and to identify areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.

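The counter-based coordination described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: a Python dict stands in for the hardware cache, and the names `sum_with_helper`, `max_lead`, `count`, and `stats` are hypothetical. The helper thread stops "prefetching" whenever its counter runs more than `max_lead` items ahead of the main thread's counter, so prefetched entries are not left waiting long enough to be evicted.

```python
import threading

def sum_with_helper(data, max_lead=4):
    """Sum `data` in the main thread while a helper thread prefetches ahead."""
    cache = {}                            # assumption: stands in for the CPU cache
    cond = threading.Condition()
    count = {"main": 0, "helper": 0}      # one counter per thread
    stats = {"hits": 0, "misses": 0, "max_lead": 0}

    def helper():
        for i, value in enumerate(data):
            with cond:
                # Throttle: pause while the helper is max_lead items ahead.
                cond.wait_for(lambda: count["helper"] - count["main"] < max_lead)
                if i >= count["main"]:    # skip items the main thread already passed
                    cache[i] = value      # simulated prefetch
                count["helper"] = i + 1
                lead = count["helper"] - count["main"]
                stats["max_lead"] = max(stats["max_lead"], lead)

    thread = threading.Thread(target=helper)
    thread.start()
    total = 0
    for i, value in enumerate(data):
        with cond:
            if i in cache:                # prefetched in time
                stats["hits"] += 1
                total += cache.pop(i)
            else:                         # simulated cache miss: load directly
                stats["misses"] += 1
                total += value
            count["main"] = i + 1
            cond.notify_all()             # let a throttled helper continue
    thread.join()
    return total, stats
```

The main thread never blocks on the helper; it only publishes its progress. The hit and miss counts vary from run to run with thread scheduling, but the helper's lead never exceeds `max_lead`.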

    Thread-data affinity optimization using compiler
    35.
    Type: invention application
    Status: in force

    Publication No.: US20070079298A1

    Publication date: 2007-04-05

    Application No.: US11242489

    Filing date: 2005-09-30

    IPC classification: G06F9/45

    CPC classification: G06F8/45

    Abstract: Thread-data affinity optimization can be performed by a compiler during the compiling of a computer program to be executed on a cache-coherent non-uniform memory access (cc-NUMA) platform. In one embodiment, the present invention includes receiving a program to be compiled. The received program is then compiled in a first pass and executed. During execution, the compiler collects profiling data using a profiling tool. Then, in a second pass, the compiler performs thread-data affinity optimization on the program using the collected profiling data.

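The abstract does not say how the second pass uses the profile, so the following is a minimal sketch under one assumed policy: each memory page is placed on the NUMA node whose threads touched it most often during the profiled run. The function `place_pages`, the trace format, and the `thread_to_node` mapping are hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict

def place_pages(access_trace, thread_to_node):
    """Pick a home NUMA node for each page from a profiled access trace.

    access_trace:   iterable of (thread_id, page) pairs from the first pass
    thread_to_node: dict mapping thread_id to the node that thread runs on
    """
    per_page = defaultdict(Counter)
    for thread_id, page in access_trace:
        per_page[page][thread_to_node[thread_id]] += 1

    placement = {}
    for page, node_counts in per_page.items():
        # Most accesses wins; ties go to the lowest node id (assumed policy).
        placement[page] = min(node_counts, key=lambda n: (-node_counts[n], n))
    return placement

trace = [(0, "A"), (0, "A"), (1, "A"), (1, "B")]
print(place_pages(trace, {0: 0, 1: 1}))
```

In this example page "A" is touched twice from node 0 and once from node 1, so it is placed on node 0, while page "B" goes to node 1.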

    Methods and apparatus for reducing memory latency in a software application
    36.
    Type: invention application
    Status: in force

    Publication No.: US20050086652A1

    Publication date: 2005-04-21

    Application No.: US10677414

    Filing date: 2003-10-02

    Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and to identify areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.


    Methods and apparatuses for thread management of multi-threading
    37.
    Type: invention application
    Status: lapsed

    Publication No.: US20050081207A1

    Publication date: 2005-04-14

    Application No.: US10779193

    Filing date: 2004-02-13

    IPC classification: G06F9/45 G06F9/46

    CPC classification: G06F8/441

    Abstract: Methods and apparatuses for thread management for multi-threading are described herein. In one embodiment, an exemplary process includes selecting, during a compilation of code having one or more threads executable in a data processing system, a current thread having a most bottom order, determining resources allocated to one or more child threads spawned from the current thread, and allocating resources for the current thread in consideration of the resources allocated to the current thread's one or more child threads to avoid resource conflicts between the current thread and its one or more child threads. Other methods and apparatuses are also described.

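The bottom-up allocation order in the abstract can be sketched as a post-order walk over the thread spawn tree. This is an illustrative sketch, not the patented algorithm: resources are modeled as numbered registers, and `allocate_bottom_up`, `children`, `needs`, and `pool_size` are hypothetical names. Each thread is handled only after its child threads, and it receives registers that none of its direct children hold.

```python
def allocate_bottom_up(children, needs, root, pool_size):
    """Assign register sets so no thread shares a register with its children.

    children:  dict mapping a thread to the list of threads it spawns
    needs:     dict mapping a thread to the number of registers it requires
    pool_size: total number of registers, numbered 0 .. pool_size - 1
    """
    allocation = {}

    def visit(thread):
        kids = children.get(thread, [])
        for child in kids:                # bottom-most threads are allocated first
            visit(child)
        taken = set()
        for child in kids:                # registers already held by child threads
            taken |= allocation[child]
        free = [r for r in range(pool_size) if r not in taken]
        if len(free) < needs[thread]:
            raise ValueError(f"not enough registers for thread {thread!r}")
        allocation[thread] = set(free[:needs[thread]])

    visit(root)
    return allocation

children = {"main": ["t1", "t2"], "t1": ["t3"]}
needs = {"main": 2, "t1": 2, "t2": 1, "t3": 2}
print(allocate_bottom_up(children, needs, "main", pool_size=8))
```

Sibling threads such as `t1` and `t2` may still share registers in this sketch, because the abstract only rules out conflicts between a thread and its own child threads.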

    METHOD, SYSTEM, AND PROGRAM OF A COMPILER TO PARALLELIZE SOURCE CODE
    40.
    Type: invention application
    Status: in force

    Publication No.: US20070234276A1

    Publication date: 2007-10-04

    Application No.: US11278329

    Filing date: 2006-03-31

    IPC classification: G06F9/44

    CPC classification: G06F8/456 G06F8/314

    Abstract: Provided are a method, system, and program for parallelizing source code with a compiler. Source code including source code statements is received. The source code statements are processed to determine a dependency of the statements. Multiple groups of statements are determined from the determined dependency of the statements, wherein statements in one group are dependent on one another. At least one directive is inserted in the source code, wherein each directive is associated with one group of statements. Resulting threaded code is generated including the inserted at least one directive. The group of statements to which the directive in the resulting threaded code applies is processed as a separate task. Each group of statements designated by the directive to be processed as a separate task may be processed concurrently with respect to other groups of statements.

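The grouping and directive-insertion steps in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions: each statement is supplied with hand-written read and write sets rather than being parsed, two statements are treated as dependent when one writes a variable the other reads or writes, and `#pragma task` is a hypothetical placeholder directive, not one named by the source. The helper names `group_statements` and `insert_directives` are also hypothetical.

```python
def group_statements(statements):
    """Group statement indices so that dependent statements share a group.

    statements: list of (text, reads, writes), where reads/writes are sets
    of variable names.
    """
    parent = list(range(len(statements)))

    def find(i):                          # union-find lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, reads_i, writes_i) in enumerate(statements):
        for j in range(i):
            _, reads_j, writes_j = statements[j]
            # Dependent if the earlier statement writes something this one
            # uses, or this one overwrites something the earlier one reads.
            if writes_j & (reads_i | writes_i) or writes_i & reads_j:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(statements)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def insert_directives(statements, directive="#pragma task"):
    """Emit the statements with one directive in front of each group."""
    lines = []
    for group in group_statements(statements):
        lines.append(directive)           # hypothetical placeholder directive
        lines.append("{")
        lines.extend("    " + statements[i][0] for i in group)
        lines.append("}")
    return "\n".join(lines)

source = [
    ("a = 1;", set(), {"a"}),
    ("b = 2;", set(), {"b"}),
    ("c = a + 1;", {"a"}, {"c"}),
    ("d = b * 2;", {"b"}, {"d"}),
]
print(insert_directives(source))
```

Here the statements that touch `a` and `c` form one group and those that touch `b` and `d` form another, so the two groups could run as separate tasks.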