Opportunistic migration of memory pages in a unified virtual memory system

    Publication number: US10133677B2

    Publication date: 2018-11-20

    Application number: US14133489

    Filing date: 2013-12-18

    Abstract: Techniques are disclosed for transitioning a memory page between memories in a virtual memory subsystem. A unified virtual memory (UVM) driver detects a page fault in response to a memory access request associated with a first memory page, where a local page table does not include an entry corresponding to a virtual memory address included in the memory access request. The UVM driver, in response to the page fault, executes a page fault sequence. The page fault sequence includes modifying the ownership state associated with the first memory page to be central-processing-unit-shared. The page fault sequence further includes scheduling the first memory page for migration from a system memory associated with a central processing unit (CPU) to a local memory associated with a parallel processing unit (PPU). One advantage of the disclosed approach is that the PPU accesses memory pages with greater efficiency.
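The page fault sequence described in this abstract can be pictured with a small simulation. This is a minimal sketch, not the patented implementation: the class `UvmDriver`, the `Ownership` enum, the queue, and the frame labels are all hypothetical names introduced for illustration, and the abstract does not say what happens to the ownership state once the migration completes, so the sketch leaves it unchanged.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Ownership(Enum):
    """Hypothetical ownership states; only CPU_SHARED is named in the abstract."""
    CPU_OWNED = auto()
    CPU_SHARED = auto()


@dataclass
class UvmDriver:
    """Toy model of driver state. Every field name here is an assumption."""
    local_page_table: dict = field(default_factory=dict)  # virtual page -> PPU frame label
    ownership: dict = field(default_factory=dict)         # virtual page -> Ownership
    migration_queue: list = field(default_factory=list)   # pages awaiting CPU -> PPU copy

    def access(self, vpage):
        """Model a PPU access; return True on a local page table hit."""
        if vpage in self.local_page_table:
            return True
        # No entry for this virtual address: that is the page fault.
        self.handle_page_fault(vpage)
        return False

    def handle_page_fault(self, vpage):
        # Step 1: change the ownership state to central-processing-unit-shared.
        self.ownership[vpage] = Ownership.CPU_SHARED
        # Step 2: schedule migration from system memory to PPU local memory.
        if vpage not in self.migration_queue:
            self.migration_queue.append(vpage)

    def run_migrations(self):
        """Drain the queue and map each migrated page locally (assumed behavior)."""
        while self.migration_queue:
            vpage = self.migration_queue.pop(0)
            self.local_page_table[vpage] = f"ppu-frame-{len(self.local_page_table)}"
```

In this model the first access to a page misses and queues it, and once `run_migrations()` has run, a repeated access to the same page hits the local page table.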

    Techniques for optimizing stencil buffers

    Publication number: US09679350B2

    Publication date: 2017-06-13

    Application number: US14817151

    Filing date: 2015-08-03

    CPC classification number: G06T1/60 B41F15/34 G06T11/40 G06T15/005

    Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No two fragments are associated with both the same stencil mask and the same stencil value. Consequently, no two fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the memory wasted by stencil buffers in which each stencil value is associated with a single fragment.
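The bit partitioning described in this abstract can be illustrated with plain integer masking. This is a minimal sketch under stated assumptions: an 8-bit stencil value, equal-width masks, and the helper names `make_masks`, `read_stencil`, and `write_stencil` are all hypothetical and not taken from the patent.

```python
STENCIL_BITS = 8  # assumed width of one stencil value


def make_masks(bits_per_fragment):
    """Split one stencil value into disjoint, equal-width bit masks."""
    if bits_per_fragment <= 0 or STENCIL_BITS % bits_per_fragment != 0:
        raise ValueError("bits_per_fragment must evenly divide the stencil width")
    base = (1 << bits_per_fragment) - 1
    return [base << shift for shift in range(0, STENCIL_BITS, bits_per_fragment)]


def _shift_of(mask):
    """Index of the lowest set bit in the mask."""
    return (mask & -mask).bit_length() - 1


def read_stencil(buffer, index, mask):
    """Return only the bits of buffer[index] that the mask selects."""
    return (buffer[index] & mask) >> _shift_of(mask)


def write_stencil(buffer, index, mask, value):
    """Update only the masked bits; bits owned by other fragments are preserved.

    A value too wide for the mask is truncated to the mask's width.
    """
    kept = buffer[index] & ~mask & 0xFF
    buffer[index] = kept | ((value << _shift_of(mask)) & mask)
```

With `make_masks(4)`, two fragments share a single stencil byte: one owns the low four bits and the other the high four bits, so a write through one mask never disturbs the other fragment's bits.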

    Efficient super-sampling with per-pixel shader threads
    Granted invention patent (status: in force)

    Publication number: US09495721B2

    Publication date: 2016-11-15

    Application number: US13725782

    Filing date: 2012-12-21

    CPC classification number: G06T1/20 G06T11/40 G06T15/005 G06T2210/52

    Abstract: Techniques are disclosed for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers.
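The split between per-pixel and per-sample work in this abstract can be shown in a few lines. This is a minimal sketch: `shade_pixel` and its callback parameters are hypothetical, the choice of how to combine values is left to the caller, and the dispatch messages and output registers are not modeled.

```python
def shade_pixel(pixel_fn, sample_fn, samples, combine):
    """Compute one value per pixel and one per sample, then combine them.

    pixel_fn  -- called exactly once for the pixel (the "first value")
    sample_fn -- called exactly once per sample (the "first set of values")
    combine   -- merges the pixel value with each sample value
    Returns the "second set of values", one entry per sample.
    """
    pixel_value = pixel_fn()
    sample_values = [sample_fn(sample) for sample in samples]
    return [combine(pixel_value, value) for value in sample_values]
```

As an illustration, if the per-pixel value is a shaded color and the per-sample value is a coverage weight, then `combine` could be a multiplication, and the expensive color computation runs once rather than once per sample.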

    System, method, and computer program product for low latency scheduling and launch of memory defined tasks
    Granted invention patent (status: in force)

    Publication number: US09378139B2

    Publication date: 2016-06-28

    Application number: US13890178

    Filing date: 2013-05-08

    CPC classification number: G06F12/0804 G06F9/4843 G06F12/0802

    Abstract: A system, method, and computer program product are provided for low-latency scheduling and launch of memory-defined tasks. The method includes the steps of receiving a task metadata data structure to be stored in a memory associated with a processor, transmitting the task metadata data structure to a scheduling unit of the processor, storing the task metadata data structure in a cache unit included in the scheduling unit, and copying the task metadata data structure from the cache unit to the memory.
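The four steps listed in this abstract can be traced with a small model. This is a minimal sketch, not the patented design: the `SchedulingUnit` class, its method names, and the idea that a task can be launched from the cache before the copy to memory finishes are assumptions made for illustration.

```python
class SchedulingUnit:
    """Toy scheduling unit with a metadata cache in front of processor memory."""

    def __init__(self, memory):
        self.memory = memory  # dict standing in for memory associated with the processor
        self.cache = {}       # cache unit inside the scheduling unit

    def receive(self, task_id, metadata):
        """Steps 1-3: receive the task metadata and store it in the cache unit."""
        self.cache[task_id] = dict(metadata)

    def can_launch(self, task_id):
        """Assumption: a task is launchable as soon as its metadata is cached."""
        return task_id in self.cache

    def write_back(self, task_id):
        """Step 4: copy the task metadata from the cache unit to memory."""
        self.memory[task_id] = dict(self.cache[task_id])
```

In this model the metadata reaches the scheduler's cache first and memory second, which is one plausible reading of how the latency reduction in the title could arise.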
