Multithreaded Computing
    2.
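    Patent application
    Multithreaded Computing (status: under examination, published)

    Publication No.: US20130191852A1

    Publication date: 2013-07-25

    Application No.: US13606741

    Filing date: 2012-09-07

    IPC classification: G06F9/46

    CPC classification: G06F9/542 G06F9/4843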

    Abstract: A system, method, and computer program product are provided for improving resource utilization of multithreaded applications. Rather than requiring threads to block while waiting for data from a channel or requiring context switching to minimize blocking, the techniques disclosed herein provide an event-driven approach to launch kernels only when needed to perform operations on channel data, and then terminate in order to free resources. These operations are handled efficiently in hardware, but are flexible enough to be implemented in all manner of programming models.
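
The event-driven idea in this abstract can be illustrated with a small software simulation. This is only a sketch under stated assumptions: the abstract describes handling in hardware, whereas the `Channel` class, its `batch_size` parameter, and `square_kernel` below are hypothetical names invented for illustration. The point it shows is that no consumer blocks on the channel; a kernel is launched only when enough data has arrived, runs to completion, and then returns.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class Channel:
    """Hypothetical channel that launches a consumer kernel when data arrives."""

    def __init__(self, batch_size: int = 1) -> None:
        self._queue: Deque[int] = deque()
        self._kernel: Optional[Callable[[List[int]], None]] = None
        self._batch_size = batch_size
        self.launches = 0  # number of kernel launches so far

    def register_kernel(self, kernel: Callable[[List[int]], None]) -> None:
        """Associate a kernel with the channel; nothing runs yet."""
        self._kernel = kernel

    def pending(self) -> int:
        """Number of items waiting for the next launch."""
        return len(self._queue)

    def write(self, item: int) -> None:
        """Enqueue an item; launch the kernel only if a full batch is ready."""
        self._queue.append(item)
        if self._kernel is not None and len(self._queue) >= self._batch_size:
            batch = [self._queue.popleft() for _ in range(self._batch_size)]
            self.launches += 1
            # The kernel runs to completion and returns, so it holds no
            # resources while the channel is waiting for more data.
            self._kernel(batch)


results: List[int] = []


def square_kernel(batch: List[int]) -> None:
    """Illustrative kernel: square every item in the batch."""
    results.extend(x * x for x in batch)


channel = Channel(batch_size=2)
channel.register_kernel(square_kernel)
for value in range(5):
    channel.write(value)
```

With five writes and a batch size of two, the kernel is launched twice and one item stays queued until more data arrives. A blocking design would instead keep a consumer thread parked on the queue for the whole run.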

    Allocating memory and using the allocated memory in a workgroup in a dispatched data parallel kernel
    5.
    Granted patent
    Allocating memory and using the allocated memory in a workgroup in a dispatched data parallel kernel (status: in force)

    Publication No.: US09244828B2

    Publication date: 2016-01-26

    Application No.: US13397391

    Filing date: 2012-02-15

    IPC classification: G06F12/02 G06F9/50

    CPC classification: G06F12/0223 G06F9/5016

    Abstract: In a computing system, memory may be managed by using a distributed array, which is a global set of local memory regions. A segment in the distributed array is allocated and is bound to a physical memory region. The segment is used by a workgroup in a dispatched data parallel kernel, wherein a workgroup includes one or more work items. When the distributed array is declared, parameters of the distributed array may be defined. The parameters may include an indication whether the distributed array is persistent (data written to the distributed array during one parallel dispatch is accessible by work items in a subsequent dispatch) or an indication whether the distributed array is shared (nested kernels may access the distributed array). The segment may be deallocated after it has been used.
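
The allocate, use, deallocate life cycle and the `persistent` parameter described in this abstract can be modelled in a few lines. This is a minimal sketch, not the patented implementation: `DistributedArray`, `dispatch`, and the two kernels are hypothetical names, physical local memory is simulated with Python lists, and the `shared` flag is recorded but not exercised because nested kernels are outside the scope of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DistributedArray:
    """Hypothetical model: a global set of per-workgroup local segments."""

    segment_size: int
    persistent: bool = False  # data survives into a subsequent dispatch
    shared: bool = False      # nested kernels may access it (not modelled)
    segments: Dict[int, List[int]] = field(default_factory=dict)

    def allocate(self, workgroup_id: int) -> List[int]:
        """Bind a segment to (simulated) local memory for one workgroup."""
        if workgroup_id not in self.segments:
            self.segments[workgroup_id] = [0] * self.segment_size
        return self.segments[workgroup_id]

    def deallocate(self, workgroup_id: int) -> None:
        """Release the segment unless the array was declared persistent."""
        if not self.persistent:
            self.segments.pop(workgroup_id, None)


Kernel = Callable[[List[int], int, int], None]


def dispatch(array: DistributedArray, num_workgroups: int,
             kernel: Kernel) -> List[List[int]]:
    """Run one data-parallel dispatch; return a copy of each segment."""
    snapshots = []
    for wg in range(num_workgroups):
        segment = array.allocate(wg)
        for item in range(array.segment_size):  # one call per work item
            kernel(segment, item, wg)
        snapshots.append(list(segment))
        array.deallocate(wg)
    return snapshots


def write_ids(segment: List[int], item: int, wg: int) -> None:
    segment[item] = wg * 10 + item


def accumulate(segment: List[int], item: int, wg: int) -> None:
    segment[item] += 1


kept = DistributedArray(segment_size=2, persistent=True)
dispatch(kept, 2, write_ids)
kept_result = dispatch(kept, 2, accumulate)

scratch = DistributedArray(segment_size=2)
dispatch(scratch, 2, write_ids)
scratch_result = dispatch(scratch, 2, accumulate)
```

In the persistent case the second dispatch sees the values written by the first; in the non-persistent case each dispatch starts from freshly allocated, zeroed segments.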

    VECTOR WIDTH-AWARE SYNCHRONIZATION-ELISION FOR VECTOR PROCESSORS
    6.
    Patent application
    VECTOR WIDTH-AWARE SYNCHRONIZATION-ELISION FOR VECTOR PROCESSORS (status: in force)

    Publication No.: US20130086566A1

    Publication date: 2013-04-04

    Application No.: US13249171

    Filing date: 2011-09-29

    IPC classification: G06F9/45 G06F15/76

    CPC classification: G06F8/443 G06F9/45525

    Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.
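
The elision step in this abstract can be sketched as a pass over a toy instruction list. Everything here is an illustrative assumption rather than the patent's compiler: the `Op` and `WidthContingentCall` types, the `specialize` function, the instruction names, and the threshold of 32 are all hypothetical. The only rule taken from the abstract is that a width-contingent call is dropped when the target's vector width is greater than a minimum size.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Op:
    """An ordinary instruction that is always kept."""
    name: str


@dataclass(frozen=True)
class WidthContingentCall:
    """A call that is superfluous once the vector width exceeds min_width."""
    name: str
    min_width: int


Instruction = Union[Op, WidthContingentCall]


def specialize(program: List[Instruction], vector_width: int) -> List[str]:
    """Produce a width-specific instruction list for one target."""
    output = []
    for instruction in program:
        if (isinstance(instruction, WidthContingentCall)
                and vector_width > instruction.min_width):
            continue  # criterion met: omit the call from this executable
        output.append(instruction.name)
    return output


program: List[Instruction] = [
    Op("load"),
    WidthContingentCall("barrier", min_width=32),  # hypothetical threshold
    Op("reduce"),
    Op("store"),
]

wide_target = specialize(program, vector_width=64)
narrow_target = specialize(program, vector_width=16)
```

The same source program yields two executables: the wide target omits the synchronization call, while the narrow target keeps it.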

    Vector width-aware synchronization-elision for vector processors
    7.
    Granted patent
    Vector width-aware synchronization-elision for vector processors (status: in force)

    Publication No.: US08966461B2

    Publication date: 2015-02-24

    Application No.: US13249171

    Filing date: 2011-09-29

    CPC classification: G06F8/443 G06F9/45525

    Abstract: A medium, method, and apparatus are disclosed for eliding superfluous function invocations in a vector-processing environment. A compiler receives program code comprising a width-contingent invocation of a function. The compiler creates a width-specific executable version of the program code by determining a vector width of a target computer system and omitting the function from the width-specific executable if the vector width meets one or more criteria. For example, the compiler may omit the function call if the vector width is greater than a minimum size.

    ABSTRACTING SCRATCH PAD MEMORIES AS DISTRIBUTED ARRAYS
    8.
    Patent application
    ABSTRACTING SCRATCH PAD MEMORIES AS DISTRIBUTED ARRAYS (status: in force)

    Publication No.: US20130212350A1

    Publication date: 2013-08-15

    Application No.: US13397391

    Filing date: 2012-02-15

    IPC classification: G06F12/02

    CPC classification: G06F12/0223 G06F9/5016

    Abstract: In a computing system, memory may be managed by using a distributed array, which is a global set of local memory regions. A segment in the distributed array is allocated and is bound to a physical memory region. The segment is used by a workgroup in a dispatched data parallel kernel, wherein a workgroup includes one or more work items. When the distributed array is declared, parameters of the distributed array may be defined. The parameters may include an indication whether the distributed array is persistent (data written to the distributed array during one parallel dispatch is accessible by work items in a subsequent dispatch) or an indication whether the distributed array is shared (nested kernels may access the distributed array). The segment may be deallocated after it has been used.

    LOW-LEVEL FUNCTION SELECTION USING VECTOR-WIDTH
    9.
    Patent application
    LOW-LEVEL FUNCTION SELECTION USING VECTOR-WIDTH (status: under examination, published)

    Publication No.: US20130086565A1

    Publication date: 2013-04-04

    Application No.: US13249154

    Filing date: 2011-09-29

    IPC classification: G06F9/45

    CPC classification: G06F8/41 G06F9/4552

    Abstract: A medium and method is disclosed for compiling vector programs. A compiler receives program code that includes a function invocation. The compiler determines the vector width of a target computer system and creates a width-specific executable version of the program code by mapping the function invocation to a width-specific implementation of the function. The width-specific implementation corresponds to the vector width of the target computer system.
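
The mapping from a generic function invocation to a width-specific implementation can be sketched as a table lookup at "compile" time. This is an assumption-laden illustration, not the patent's mechanism: the `IMPLEMENTATIONS` table, `make_chunked_sum`, `compile_call`, the function name `vsum`, and the widths 4 and 8 are all hypothetical.

```python
from typing import Callable, Dict, List


def make_chunked_sum(width: int) -> Callable[[List[int]], int]:
    """Build a sum that walks the data one vector (width lanes) at a time."""

    def chunked_sum(data: List[int]) -> int:
        total = 0
        for start in range(0, len(data), width):
            total += sum(data[start:start + width])  # one vector of lanes
        return total

    chunked_sum.__name__ = f"vsum_w{width}"
    return chunked_sum


# Hypothetical library: one logical function, several width-specific bodies.
IMPLEMENTATIONS: Dict[str, Dict[int, Callable[[List[int]], int]]] = {
    "vsum": {4: make_chunked_sum(4), 8: make_chunked_sum(8)},
}


def compile_call(function_name: str,
                 target_vector_width: int) -> Callable[[List[int]], int]:
    """Resolve an invocation to the implementation for the target's width."""
    try:
        return IMPLEMENTATIONS[function_name][target_vector_width]
    except KeyError:
        raise LookupError(
            f"no {function_name} implementation for width "
            f"{target_vector_width}"
        ) from None


selected = compile_call("vsum", target_vector_width=8)
total = selected(list(range(10)))
```

The caller writes `vsum` once; which body runs is decided by the target's vector width, and a width with no registered implementation is reported as an error instead of silently falling back.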

    Streaming programming generator
    10.
    Granted patent
    Streaming programming generator (status: in force)

    Publication No.: US08856760B2

    Publication date: 2014-10-07

    Application No.: US12911952

    Filing date: 2010-10-26

    IPC classification: G06F9/44

    CPC classification: G06F8/20

    Abstract: A device receives input that includes definitions of components of a computational pipeline, where the components include one or more buffers, one or more kernels, and one or more stages within a control graph. The device generates, based on the input, kernel signatures for a graphics processor, where the kernel signatures compile into an executable streaming program for the computational pipeline. The device also generates, based on the input, host-side runtime code to execute the streaming program.
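
The two generation steps in this abstract, kernel signatures and host-side runtime code, can be sketched as string generation from a declarative pipeline description. The input format, the function names, and the emitted host calls (`allocate_buffer`, `launch_kernel`) are hypothetical and chosen only for illustration; the signatures are written in an OpenCL-like style as an assumption, since the abstract does not name a target language.

```python
from typing import Dict, List

# Hypothetical pipeline definition: buffers, kernels, and stage order.
pipeline: Dict[str, object] = {
    "buffers": {"src": "float", "tmp": "float", "dst": "float"},
    "kernels": {"blur": ["src", "tmp"], "sharpen": ["tmp", "dst"]},
    "stages": ["blur", "sharpen"],
}


def generate_kernel_signatures(spec: Dict[str, object]) -> List[str]:
    """Emit one OpenCL-style prototype per kernel in the definition."""
    buffers = spec["buffers"]
    signatures = []
    for name, buffer_names in spec["kernels"].items():
        params = ", ".join(
            f"__global {buffers[b]}* {b}" for b in buffer_names
        )
        signatures.append(f"__kernel void {name}({params});")
    return signatures


def generate_host_code(spec: Dict[str, object]) -> List[str]:
    """Emit host-side steps: allocate every buffer, then run each stage."""
    lines = [f"allocate_buffer('{b}')" for b in spec["buffers"]]
    for stage in spec["stages"]:
        args = ", ".join(repr(b) for b in spec["kernels"][stage])
        lines.append(f"launch_kernel('{stage}', {args})")
    return lines


signatures = generate_kernel_signatures(pipeline)
host_code = generate_host_code(pipeline)
```

Keeping the pipeline description declarative means the device-side prototypes and the host-side launch sequence are derived from the same input and cannot drift apart.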
