Patent search ap:("NVIDIA CORPORATION") AND inv:"Jerome F. Duluk Page Jr."

1.

Granted invention patent
Techniques for configuring a processor to function as multiple, separate processors [In force]

Publication number: US11893423B2

Publication date: 2024-02-06

Application number: US16562367

Application date: 2019-09-05

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Sonata Gale Wen, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal

IPC: G06F9/50, G06F9/38, G06F1/3296, G06F1/04

CPC classification number: G06F9/5061, G06F1/04, G06F1/3296, G06F9/3877, G06F9/5027

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

2.

Granted invention patent
Techniques for configuring a processor to function as multiple, separate processors [In force]

Publication number: US11663036B2

Publication date: 2023-05-30

Application number: US16562359

Application date: 2019-09-05

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. Duluk, Jr., Gregory Scott Palmer, Jonathon Stuart Ramsey Evans, Shailendra Singh, Samuel H. Duncan, Wishwesh Anil Gandhi, Lacky V. Shah, Eric Rock, Feiqi Su, James Leroy Deming, Alan Menezes, Pranav Vaidya, Praveen Joginipally, Timothy John Purcell, Manas Mandal

IPC: G06F9/48, G06F9/46, G06T1/20

CPC classification number: G06F9/485, G06F9/461, G06T1/20

Abstract: A parallel processing unit (PPU) can be divided into partitions. Each partition is configured to operate similarly to how the entire PPU operates. A given partition includes a subset of the computational and memory resources associated with the entire PPU. Software that executes on a CPU partitions the PPU for an admin user. A guest user is assigned to a partition and can perform processing tasks within that partition in isolation from any other guest users assigned to any other partitions. Because the PPU can be divided into isolated partitions, multiple CPU processes can efficiently utilize PPU resources.

3.

Granted invention patent
Fault buffer for tracking page faults in unified virtual memory system [In force]

Publication number: US11487673B2

Publication date: 2022-11-01

Application number: US14055345

Application date: 2013-10-16

Applicant: NVIDIA Corporation

Inventor: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey

IPC: G06F13/12, G06F13/38, G06F12/1009, G06F12/109, G06F12/1072, G06F12/08, G06F11/07, G06F12/12, G06F12/10

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

4.

Granted invention patent
Opportunistic migration of memory pages in a unified virtual memory system [In force]

Publication number: US10133677B2

Publication date: 2018-11-20

Application number: US14133489

Application date: 2013-12-18

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. Duluk, Jr., Cameron Buschardt, James Leroy Deming, Lucien Dunning, Brian Fahs, Mark Hairgrove, John Mashey

IPC: G06F12/12, G06F12/122, G06F12/08, G06F12/1009

Abstract: Techniques are disclosed for transitioning a memory page between memories in a virtual memory subsystem. A unified virtual memory (UVM) driver detects a page fault in response to a memory access request associated with a first memory page, where a local page table does not include an entry corresponding to a virtual memory address included in the memory access request. The UVM driver, in response to the page fault, executes a page fault sequence. The page fault sequence includes modifying the ownership state associated with the first memory page to be central-processing-unit-shared. The page fault sequence further includes scheduling the first memory page for migration from a system memory associated with a central processing unit (CPU) to a local memory associated with a parallel processing unit (PPU). One advantage of the disclosed approach is that the PPU accesses memory pages with greater efficiency.
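The page fault sequence in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `UvmDriverSketch`, the state string `"cpu-shared"`, the `"ppu-local"` placeholder, and the choice to install a local page table entry once a migration runs are all assumptions made for illustration.

```python
from collections import deque


class UvmDriverSketch:
    """Toy model of the fault sequence: mark the page CPU-shared, then queue it."""

    def __init__(self):
        self.local_page_table = {}      # page -> location, as seen by the PPU
        self.ownership = {}             # page -> ownership state (hypothetical strings)
        self.migration_queue = deque()  # pages scheduled for migration to PPU memory

    def access(self, page):
        # A missing local page table entry is what triggers the page fault.
        if page in self.local_page_table:
            return "hit"
        self.handle_page_fault(page)
        return "fault"

    def handle_page_fault(self, page):
        # Step 1: modify the ownership state to CPU-shared.
        self.ownership[page] = "cpu-shared"
        # Step 2: schedule the page for migration (only once per page).
        if page not in self.migration_queue:
            self.migration_queue.append(page)

    def run_migrations(self):
        # Assumption: a completed migration adds a local page table entry.
        while self.migration_queue:
            page = self.migration_queue.popleft()
            self.local_page_table[page] = "ppu-local"


driver = UvmDriverSketch()
first = driver.access(5)    # faults and schedules page 5
driver.run_migrations()
second = driver.access(5)   # the local page table now has an entry
```

Migration is deferred here rather than performed inside the fault handler, which mirrors the abstract's wording that the page is scheduled for migration.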

5.

Granted invention patent
Efficient memory virtualization in multi-threaded processing units [In force]

Publication number: US10037228B2

Publication date: 2018-07-31

Application number: US13660763

Application date: 2012-10-25

Applicant: NVIDIA Corporation

Inventor: Nick Barrow-Williams, Brian Fahs, Jerome F. Duluk, Jr., James Leroy Deming, Timothy John Purcell, Lucien Dunning, Mark Hairgrove

IPC: G06F12/00, G06F9/50, G06F12/1045, G06F12/109

CPC classification number: G06F9/5027, G06F12/1036, G06F12/1045, G06F12/109

Abstract: A technique for simultaneously executing multiple tasks, each having an independent virtual address space, involves assigning an address space identifier (ASID) to each task and constructing each virtual memory access request to include both a virtual address and the ASID. During virtual to physical address translation, the ASID selects a corresponding page table, which includes virtual to physical address mappings for the ASID and associated task. Entries for a translation look-aside buffer (TLB) include both the virtual address and ASID to complete each mapping to a physical address. Deep scheduling of tasks sharing a virtual address space may be implemented to improve cache affinity for both TLB and data caches.
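A small Python sketch may help show how an ASID-tagged TLB keeps translations from different address spaces apart. The class `AsidTlb`, the 4 KiB page size, and the dictionary-based page tables are illustrative assumptions, not details taken from the patent.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class AsidTlb:
    """Toy TLB whose entries are keyed by (ASID, virtual page number)."""

    def __init__(self, page_tables):
        # page_tables: {asid: {virtual_page_number: physical_page_number}}
        self.page_tables = page_tables
        self.entries = {}  # (asid, vpn) -> ppn
        self.hits = 0
        self.misses = 0

    def translate(self, asid, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        key = (asid, vpn)
        if key in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            # On a miss, the ASID selects which page table to walk.
            # An unmapped page raises KeyError in this sketch.
            self.entries[key] = self.page_tables[asid][vpn]
        return self.entries[key] * PAGE_SIZE + offset


# Two tasks map the same virtual page 0 to different physical pages.
tlb = AsidTlb({1: {0: 7}, 2: {0: 9}})
addr_task1 = tlb.translate(1, 0x10)
addr_task2 = tlb.translate(2, 0x10)
```

Because the ASID is part of the lookup key, the same virtual address resolves differently for each task, and entries for both tasks can sit in the TLB at the same time.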

6.

Granted invention patent
Page state directory for managing unified virtual memory [In force]

Publication number: US09767036B2

Publication date: 2017-09-19

Application number: US14055318

Application date: 2013-10-16

Applicant: NVIDIA Corporation

Inventor: Jerome F. Duluk, Jr., Cameron Buschardt, Sherry Cheung, James Leroy Deming, Samuel H. Duncan, Lucien Dunning, Robert George, Arvind Gopalakrishnan, Mark Hairgrove, Chenghuan Jia, John Mashey

IPC: G06F12/10, G06F12/1009, G06F12/109, G06F12/1072, G06F11/07, G06F12/12, G06F12/08

CPC classification number: G06F12/1009, G06F11/073, G06F11/0793, G06F12/08, G06F12/10, G06F12/1072, G06F12/109, G06F12/12, G06F2212/1016

Abstract: A system for managing virtual memory. The system includes a first processing unit configured to execute a first operation that references a first virtual memory address. The system also includes a first memory management unit (MMU) associated with the first processing unit and configured to generate a first page fault upon determining that a first page table that is stored in a first memory unit associated with the first processing unit does not include a mapping corresponding to the first virtual memory address. The system further includes a first copy engine associated with the first processing unit. The first copy engine is configured to read a first command queue to determine a first mapping that corresponds to the first virtual memory address and is included in a first page state directory. The first copy engine is also configured to update the first page table to include the first mapping.

7.

Granted invention patent
Techniques for optimizing stencil buffers [In force]

Publication number: US09679350B2

Publication date: 2017-06-13

Application number: US14817151

Application date: 2015-08-03

Applicant: NVIDIA CORPORATION

Inventor: Eric B. Lum, Jerome F. Duluk, Jr.

IPC: G06T1/60, B41F15/34, G06T15/00, G06T11/40

CPC classification number: G06T1/60, B41F15/34, G06T11/40, G06T15/005

Abstract: One embodiment sets forth a method for associating each stencil value included in a stencil buffer with multiple fragments. Components within a graphics processing pipeline use a set of stencil masks to partition the bits of each stencil value. Each stencil mask selects a different subset of bits, and each fragment is strategically associated with both a stencil value and a stencil mask. Before performing stencil actions associated with a fragment, the raster operations unit performs stencil mask operations on the operands. No fragments are associated with both the same stencil mask and the same stencil value. Consequently, no fragments are associated with the same stencil bits included in the stencil buffer. Advantageously, by reducing the number of stencil bits associated with each fragment, certain classes of software applications may reduce the wasted memory associated with stencil buffers in which each stencil value is associated with a single fragment.
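The idea of several fragments sharing one stencil value through disjoint masks can be sketched in a few lines of Python. The 8-bit stencil width, the two 4-bit masks, and the helper names `mask_shift`, `read_bits`, and `write_bits` are assumptions chosen for illustration. The abstract does not specify them.

```python
STENCIL_BITS = 0xFF  # assumed 8-bit stencil value


def mask_shift(mask):
    """Return the index of the lowest set bit of a non-zero mask."""
    return (mask & -mask).bit_length() - 1


def read_bits(stencil_value, mask):
    """Extract the bits that a fragment's mask selects from the shared value."""
    return (stencil_value & mask) >> mask_shift(mask)


def write_bits(stencil_value, mask, new_bits):
    """Update only the masked bits, leaving other fragments' bits untouched."""
    shifted = (new_bits << mask_shift(mask)) & mask
    return (stencil_value & ~mask & STENCIL_BITS) | shifted


# Two fragments share one stencil value through disjoint masks.
MASK_A, MASK_B = 0x0F, 0xF0
value = 0
value = write_bits(value, MASK_A, 5)  # fragment A stores 5 in the low nibble
value = write_bits(value, MASK_B, 3)  # fragment B stores 3 in the high nibble
```

Since the masks do not overlap, each write preserves the other fragment's bits, which is the property the abstract relies on when it says no two fragments share the same stencil bits.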

8.

Granted invention patent
Higher accuracy Z-culling in a tile-based architecture [In force]

Publication number: US09612839B2

Publication date: 2017-04-04

Application number: US14061443

Application date: 2013-10-23

Applicant: NVIDIA Corporation

Inventor: Ziyad S. Hakura, Jerome F. Duluk, Jr.

IPC: G06F9/38, G06T15/00, G06T15/40, G06T1/20, G06T1/60, G09G5/395, G09G5/00, G06T15/50, G06F12/0808, G06F12/0875, G06F9/44, G06T15/80

CPC classification number: G06T1/20, G06F9/38, G06F9/44, G06F12/0808, G06F12/0875, G06F2212/302, G06T1/60, G06T15/005, G06T15/405, G06T15/503, G06T15/80, G06T17/20, G09G5/003, G09G5/395, Y02D10/13

Abstract: A graphics processing pipeline configured for z-cull operations. The graphics processing pipeline comprising a screen-space pipeline and a tiling unit. The screen-space pipeline includes a z-cull unit configured to perform z-culling operations. The tiling unit is configured to determine that a first set of primitives overlaps a first cache tile. The tiling unit is also configured to transmit the first set of primitives to the screen-space pipeline for processing. The tiling unit is further configured to select between processing the first set of primitives in a full-surface z-cull mode or processing the first set of primitives in a partial-surface z-cull mode. The tiling unit is also configured to cause the z-cull unit to process the first set of primitives in the full-surface z-cull mode or to process the first set of primitives in the partial-surface z-cull mode.

9.

Granted invention patent
Efficient super-sampling with per-pixel shader threads [In force]

Publication number: US09495721B2

Publication date: 2016-11-15

Application number: US13725782

Application date: 2012-12-21

Applicant: NVIDIA CORPORATION

Inventor: Jerome F. Duluk, Jr., Rouslan Dimitrov, Eric Lum, Rui Bastos

IPC: G06T1/20, G06T15/00

CPC classification number: G06T1/20, G06T11/40, G06T15/005, G06T2210/52

Abstract: Techniques for dispatching pixel information in a graphics processing pipeline. A fragment processing unit generates a pixel that includes multiple samples based on a first portion of a graphics primitive received by a first thread. The fragment processing unit calculates a first value for the first pixel, where the first value is calculated only once for the pixel. The fragment processing unit calculates a first set of values for the samples, where each value in the first set of values corresponds to a different sample and is calculated only once for the corresponding sample. The fragment processing unit combines the first value with each value in the first set of values to create a second set of values. The fragment processing unit creates one or more dispatch messages to store the second set of values in a set of output registers.
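The split between per-pixel and per-sample work can be sketched as follows. This is an illustrative Python model only: `shade_pixel`, the callback names, the sample positions, and the multiplication used to combine values are all assumptions, and the returned list merely stands in for the output registers mentioned in the abstract.

```python
def shade_pixel(per_pixel_fn, per_sample_fn, sample_positions, combine):
    """Compute the per-pixel value once, each per-sample value once, then combine."""
    pixel_value = per_pixel_fn()  # evaluated once per pixel
    sample_values = [per_sample_fn(p) for p in sample_positions]  # once per sample
    return [combine(pixel_value, s) for s in sample_values]


calls = {"pixel": 0, "sample": 0}  # counters to show how often each part runs


def base_color():
    calls["pixel"] += 1
    return 0.5


def coverage(pos):
    calls["sample"] += 1
    x, y = pos
    return 1.0 if x + y < 1.0 else 0.0


positions = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
outputs = shade_pixel(base_color, coverage, positions, lambda p, s: p * s)
```

With four samples, the per-pixel function runs once and the per-sample function runs four times, instead of running the full shading computation once per sample.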

10.

Granted invention patent
System, method, and computer program product for low latency scheduling and launch of memory defined tasks [In force]

Publication number: US09378139B2

Publication date: 2016-06-28

Application number: US13890178

Application date: 2013-05-08

Applicant: NVIDIA Corporation

Inventor: Scott Ricketts, Brian Scott Pharris, Nicholas Wang, Luke David Durant, Philip Alexander Cuadra, Jerome F. Duluk, Jr.

IPC: G06F12/02, G06F12/08, G06F9/48

CPC classification number: G06F12/0804, G06F9/4843, G06F12/0802

Abstract: A system, method, and computer program product for low-latency scheduling and launch of memory defined tasks. The method includes the steps of receiving a task metadata data structure to be stored in a memory associated with a processor, transmitting the task metadata data structure to a scheduling unit of the processor, storing the task metadata data structure in a cache unit included in the scheduling unit, and copying the task metadata data structure from the cache unit to the memory.
