UNIFIED MEMORY SYSTEMS AND METHODS
    1.
    Patent application
    Status: Under examination (published)

    Publication number: US20150206277A1

    Publication date: 2015-07-23

    Application number: US14601223

    Filing date: 2015-01-20

    CPC classification number: G06T1/20 G06F9/5016 G06F12/109 G06T1/60

    Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one embodiment, the approach combines Operating System (OS) allocation on the central processing unit (CPU) with graphics processing unit (GPU) driver mappings to provide a unified virtual address (VA) across both GPU and CPU. This helps ensure that a GPU VA pointer does not collide with a CPU pointer provided by OS CPU allocation (e.g., one returned by the “malloc” C runtime API).
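
To make the collision-avoidance idea concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not the patented mechanism: `ToyOS` and `ToyGpuDriver` are hypothetical names, addresses are plain integers, and "mapping" is just a dictionary entry. The only point illustrated is that when the CPU-side allocator hands out the range first and the GPU driver maps that same range, a GPU pointer cannot overlap an ordinary CPU allocation.

```python
class ToyOS:
    """Hypothetical stand-in for OS CPU allocation: hands out disjoint VA ranges."""

    def __init__(self, base=0x10000000):
        self._next = base

    def reserve(self, size):
        # Return the start of a fresh range and advance past it.
        va = self._next
        self._next += size
        return va


class ToyGpuDriver:
    """Hypothetical driver that maps GPU memory at addresses chosen by the OS."""

    def __init__(self, os_alloc):
        self._os = os_alloc
        self.gpu_mappings = {}  # start VA -> size

    def alloc_unified(self, size):
        va = self._os.reserve(size)   # CPU side reserves the range first
        self.gpu_mappings[va] = size  # GPU side maps the very same VA
        return va
```

Because both ordinary CPU allocations and unified allocations draw from the same allocator in this toy, their ranges are disjoint by construction.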

    TECHNIQUES FOR SHARING PRIORITIES BETWEEN STREAMS OF WORK AND DYNAMIC PARALLELISM
    2.
    Patent application
    Status: Granted

    Publication number: US20140344821A1

    Publication date: 2014-11-20

    Application number: US13897123

    Filing date: 2013-05-17

    Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem that supports dynamic parallelism. First, the software application assigns a maximum nesting depth for dynamic parallelism. The software application then assigns a stream priority to a stream. These assignments cause a driver to map the stream priority to a device priority and, subsequently, associate the device priority with the stream. As part of the mapping, the driver ensures that each device priority is at least the maximum nesting depth higher than the device priorities associated with any lower priority streams. Subsequently, the driver launches any kernel included in the stream with the device priority associated with the stream. Advantageously, by strategically assigning the maximum nesting depth and prioritizing streams, an application developer may increase the overall processing efficiency of the software application.
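
The spacing rule in this abstract can be illustrated with a short Python sketch. The function name `map_stream_priorities` and the convention that a larger number means a higher priority are assumptions made for this example only; the abstract does not say how the driver computes device priorities, just that adjacent levels are separated by at least the maximum nesting depth.

```python
def map_stream_priorities(stream_priorities, max_nesting_depth):
    """Map each stream's priority to a device priority.

    Assumption for this sketch: larger number = higher priority.
    Distinct stream priorities are ranked, and each rank is spaced
    max_nesting_depth apart, so every device priority is at least
    max_nesting_depth above that of any lower-priority stream.
    """
    distinct = sorted(set(stream_priorities.values()))
    rank = {priority: i for i, priority in enumerate(distinct)}
    return {
        name: rank[priority] * max_nesting_depth
        for name, priority in stream_priorities.items()
    }
```

For example, three streams with priorities 0, 5, and 9 and a maximum nesting depth of 3 would receive device priorities 0, 3, and 6 in this toy mapping.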

    TECHNIQUES FOR ASSIGNING PRIORITIES TO MEMORY COPIES
    3.
    Patent application
    Status: Granted

    Publication number: US20140344528A1

    Publication date: 2014-11-20

    Application number: US13897193

    Filing date: 2013-05-17

    Abstract: One embodiment sets forth a method for guiding the order in which a parallel processing subsystem executes memory copies. A driver creates semaphores for all but the lowest priority included in a plurality of priorities and associates one priority with each copy hardware channel included in the parallel processing subsystem. The driver then aliases prioritized streams to the copy hardware channels based on the priorities. Upon receiving a request to execute a memory copy within one of the streams, the driver inserts commands into the aliased copy hardware channel. These commands use the semaphores to direct the parallel processing subsystem to execute the memory copy based on the priority of the copy hardware channel. Advantageously, by assigning priorities to streams and, subsequently, strategically requesting memory copies within the prioritized streams, an application developer may fine-tune their software application to increase the overall processing efficiency of the software application.
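
One way to picture the semaphore scheme is the following Python simulation. It is only a sketch: the abstract does not specify the semaphore protocol, so the rule used here (a channel may run a copy only while every higher-priority semaphore reads zero, with each semaphore counting pending copies) is an assumption, as are the name `ToyCopyEngine` and the convention that priority 0 is the highest.

```python
from collections import deque


class ToyCopyEngine:
    """Hypothetical model: one copy channel per priority, 0 = highest."""

    def __init__(self, num_priorities):
        self.channels = [deque() for _ in range(num_priorities)]
        # Semaphores for all but the lowest priority; each counts pending copies.
        self.semaphores = [0] * (num_priorities - 1)

    def request_copy(self, priority, label):
        # Insert the copy into the channel aliased to this priority.
        self.channels[priority].append(label)
        if priority < len(self.semaphores):
            self.semaphores[priority] += 1

    def run(self):
        """Drain all channels and return the order in which copies executed."""
        order = []
        while any(self.channels):
            for p, channel in enumerate(self.channels):
                # Proceed only if no higher-priority copies are pending.
                if channel and all(s == 0 for s in self.semaphores[:p]):
                    order.append(channel.popleft())
                    if p < len(self.semaphores):
                        self.semaphores[p] -= 1
                    break
        return order
```

In this model, copies requested on a low-priority stream wait until the higher-priority channels have drained, regardless of the order in which the requests arrived.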

    SELECTIVELY KILLING TRAPPED MULTI-PROCESS SERVICE CLIENTS SHARING THE SAME HARDWARE CONTEXT
    4.
    Patent application
    Status: Granted

    Publication number: US20150206272A1

    Publication date: 2015-07-23

    Application number: US14481802

    Filing date: 2014-09-09

    CPC classification number: G06T1/20 G06F9/5016 G06F12/109 G06T1/60

    Abstract: A method for handling parallel processing clients associated with a server in a GPU, the method comprising: receiving a failure indication for at least one client running a thread in the GPU; determining threads in the GPU associated with the failing client; exiting threads in the GPU associated with the failing client; and continuing to execute remaining threads in the GPU for other clients running threads in the GPU.
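
The four steps of this method can be sketched as a small Python function. This is an illustrative toy, assuming each running thread is represented as a dictionary with a `client` field; the function name `handle_client_failure` and that data layout are hypothetical and not taken from the patent.

```python
def handle_client_failure(running_threads, failing_client):
    """Split threads into those to exit and those that keep running.

    running_threads: list of dicts such as {"id": 1, "client": "A"}
    failing_client:  the client named in the failure indication
    """
    # Determine the threads associated with the failing client.
    exited = [t for t in running_threads if t["client"] == failing_client]
    # Every other client's threads continue to execute.
    remaining = [t for t in running_threads if t["client"] != failing_client]
    return exited, remaining
```

The key property shown is selectivity: only the failing client's threads leave the shared context, and the rest are untouched.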

    TECHNIQUES FOR ASSIGNING PRIORITIES TO STREAMS OF WORK
    5.
    Patent application
    Status: Granted

    Publication number: US20140344822A1

    Publication date: 2014-11-20

    Application number: US13897291

    Filing date: 2013-05-17

    Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem. First, the software application assigns a desired priority to a stream using a call included in the API. The API receives this call and passes it to a driver. The driver maps the desired priority to an appropriate device priority associated with the parallel processing subsystem. Subsequently, if the software application launches a particular kernel within the stream, then the driver assigns the device priority associated with the stream to the kernel before adding the kernel to the stream for execution on the parallel processing subsystem. Advantageously, by assigning priorities to streams and, subsequently, strategically launching kernels within the prioritized streams, an application developer may fine-tune the software application to increase the overall processing efficiency of the software application.
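
The flow described here (application sets a stream priority, driver maps it to a device priority, kernels inherit it at launch) might look roughly like the Python sketch below. The class `ToyDriver`, its method names, and the clamping rule used for the mapping are all assumptions for illustration; the abstract says only that the driver maps the desired priority to an appropriate device priority.

```python
class ToyDriver:
    """Hypothetical driver supporting device priorities in [lowest, highest]."""

    def __init__(self, lowest, highest):
        self.lowest = lowest
        self.highest = highest
        self.stream_priority = {}  # stream name -> device priority
        self.queue = []            # (stream, kernel, device priority)

    def set_stream_priority(self, stream, desired):
        # Assumed mapping: clamp the desired priority to the supported range.
        mapped = min(max(desired, self.lowest), self.highest)
        self.stream_priority[stream] = mapped
        return mapped

    def launch(self, stream, kernel):
        # The kernel takes on the device priority associated with its stream.
        priority = self.stream_priority.get(stream, self.lowest)
        self.queue.append((stream, kernel, priority))
        return priority
```

A stream with no explicit priority falls back to the lowest device priority in this toy, which is a design choice of the sketch rather than something the abstract states.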
