Patent search ap:("NVIDIA CORPORATION") AND inv:"Luke DURANT" Page 1

1.

Invention publication
Distributed Shared Memory (under examination, published)

Publication (announcement) number: US20230289189A1

Publication (announcement) date: 2023-09-14

Application number: US17691690

Application date: 2022-03-10

Applicant: NVIDIA Corporation

Inventor: Prakash BANGALORE PRABHAKAR , Gentaro HIROTA , Ronny KRASHINSKY , Ze LONG , Brian PHARRIS , Rajballav DASH , Jeff TUCKEY , Jerome F. DULUK, JR. , Lacky SHAH , Luke DURANT , Jack CHOQUETTE , Eric WERNESS , Naman GOVIL , Manan PATEL , Shayani DEB , Sandeep NAVADA , John EDMONDSON , Greg PALMER , Wish GANDHI , Ravi MANYAM , Apoorv PARLE , Olivier GIROUX , Shirish GADRE , Steve HEINRICH

IPC: G06F3/06

CPC classification number: G06F3/064 , G06F3/0604 , G06F3/0679

Abstract: Distributed shared memory (DSMEM) comprises blocks of memory that are distributed or scattered across a processor (such as a GPU). Threads executing on a processing core local to one memory block are able to access a memory block local to a different processing core. In one embodiment, shared access to these DSMEM allocations distributed across a collection of processing cores is implemented by communications between the processing cores. Such distributed shared memory provides very low latency memory access for processing cores located in proximity to the memory blocks, and also provides a way for more distant processing cores to access the memory blocks in a manner and using interconnects that do not interfere with the processing cores' access to main or global memory such as that backed by an L2 cache. Such distributed shared memory supports cooperative parallelism and strong scaling across multiple processing cores by permitting data sharing and communications previously possible only within the same processing core.
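
The access pattern described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names `Core`, `Cluster`, `store`, and `load` are hypothetical, and the counters merely show that remote accesses in this model travel core to core and never touch the path that would serve L2-backed global memory.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """One processing core with its own local block of shared memory (hypothetical model)."""
    rank: int
    shared: list = field(default_factory=lambda: [0] * 4)


class Cluster:
    """Hypothetical group of cores whose local blocks together form one DSMEM space."""

    def __init__(self, num_cores):
        self.cores = [Core(rank) for rank in range(num_cores)]
        self.local_accesses = 0   # served by the requesting core's own block
        self.remote_accesses = 0  # served over the assumed core-to-core path
        self.l2_accesses = 0      # never incremented: DSMEM traffic bypasses L2 here

    def _route(self, from_rank, target_rank):
        # Count the access by path and return the target core's memory block.
        if from_rank == target_rank:
            self.local_accesses += 1
        else:
            self.remote_accesses += 1
        return self.cores[target_rank].shared

    def store(self, from_rank, target_rank, offset, value):
        self._route(from_rank, target_rank)[offset] = value

    def load(self, from_rank, target_rank, offset):
        return self._route(from_rank, target_rank)[offset]


cluster = Cluster(4)
# Each core writes to its own block ...
for core in cluster.cores:
    cluster.store(core.rank, core.rank, 0, core.rank * 10)
# ... then reads the block that is local to its neighbour.
neighbour_values = [cluster.load(rank, (rank + 1) % 4, 0) for rank in range(4)]
```

In this model every core reads data produced by a different core without any global-memory round trip, which is the kind of sharing the abstract says was previously possible only within one processing core.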

2.

Invention application
COOPERATIVE THREAD ARRAY GRANULARITY CONTEXT SWITCH DURING TRAP HANDLING (granted)

Publication (announcement) number: US20170010914A1

Publication (announcement) date: 2017-01-12

Application number: US15271171

Application date: 2016-09-20

Applicant: NVIDIA Corporation

Inventor: Gerald F. LUIZ , Philip Alexander CUADRA , Luke DURANT , Shirish GADRE , Robert OHANNESSIAN , Lacky V. SHAH , Nicholas WANG , Arthur Merlin DANSKIN

IPC: G06F9/46 , G06F9/48

CPC classification number: G06F9/461 , G06F9/4812 , G06F9/485

Abstract: Techniques are provided for restoring threads within a processing core. The techniques include, for a first thread group included in a plurality of thread groups, executing a context restore routine to restore from a memory a first portion of a context associated with the first thread group, determining whether the first thread group completed an assigned function, and, if the first thread group completed the assigned function, then exiting the context restore routine, or if the first thread group did not complete the assigned function, then executing one or more operations associated with a trap handler routine.
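
The branch described in this abstract can be sketched as a short routine. This is an illustrative model only: the function name `context_restore`, the layout of the saved context, and the decision to restore the remaining registers on the trap-handler path are all assumptions that the abstract does not state.

```python
def context_restore(group_id, saved_contexts):
    """Restore the first portion of a thread group's context, then decide how to continue."""
    saved = saved_contexts[group_id]

    # Step 1: restore only the first portion of the context (assumed here to be a status record).
    restored = {"status": saved["status"]}

    # Step 2: a group that already completed its assigned function needs nothing more.
    if restored["status"]["completed"]:
        return "exit", restored

    # Step 3: otherwise continue with trap-handler operations.
    # Restoring the registers at this point is an assumption of this sketch.
    restored["registers"] = saved["registers"]
    return "trap_handler", restored


# Hypothetical saved state for two thread groups.
saved_contexts = {
    0: {"status": {"completed": True}, "registers": [1, 2, 3]},
    1: {"status": {"completed": False}, "registers": [4, 5, 6]},
}
outcomes = {g: context_restore(g, saved_contexts)[0] for g in saved_contexts}
```

The point of the early exit is that a finished thread group pays only for the first portion of the restore.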

3.

Invention application
TECHNIQUE FOR SAVING AND RESTORING THREAD GROUP OPERATING STATE (under examination, published)

Publication (announcement) number: US20140165072A1

Publication (announcement) date: 2014-06-12

Application number: US13711093

Application date: 2012-12-11

Applicant: NVIDIA CORPORATION

Inventor: Nicholas WANG , Lacky V. SHAH , Gerald F. LUIZ , Philip Alexander CUADRA , Luke DURANT , Shirish GADRE

IPC: G06F9/50

CPC classification number: G06F9/5016 , G06F9/461

Abstract: A streaming multiprocessor (SM) included within a parallel processing unit (PPU) is configured to suspend a thread group executing on the SM and to save the operating state of the suspended thread group. A load-store unit (LSU) within the SM re-maps local memory associated with the thread group to a location in global memory. Subsequently, the SM may re-launch the suspended thread group. The LSU may then perform local memory access operations on behalf of the re-launched thread group with the re-mapped local memory that resides in global memory.
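
The suspend, re-map, and re-launch sequence in this abstract can be modelled in a few lines. The sketch below is a toy model under stated assumptions: the class `LoadStoreUnit`, its method names, and the copy of local-memory contents into global memory at suspend time are hypothetical and are not taken from the patent.

```python
class LoadStoreUnit:
    """Toy LSU that redirects a thread group's local-memory accesses after a re-map."""

    def __init__(self):
        self.local = {}       # group_id -> list modelling on-chip local memory
        self.global_mem = {}  # address -> value, modelling global memory
        self.remap = {}       # group_id -> base address of the re-mapped region

    def allocate_local(self, group_id, size):
        self.local[group_id] = [0] * size

    def suspend(self, group_id, global_base):
        # Assumption: contents are copied out so the re-launched group sees its old state.
        for offset, value in enumerate(self.local.pop(group_id)):
            self.global_mem[global_base + offset] = value
        self.remap[group_id] = global_base

    def store_local(self, group_id, offset, value):
        if group_id in self.remap:
            self.global_mem[self.remap[group_id] + offset] = value
        else:
            self.local[group_id][offset] = value

    def load_local(self, group_id, offset):
        if group_id in self.remap:
            return self.global_mem[self.remap[group_id] + offset]
        return self.local[group_id][offset]


lsu = LoadStoreUnit()
lsu.allocate_local(7, 4)
lsu.store_local(7, 2, 99)         # written before the group is suspended
lsu.suspend(7, 0x1000)            # local memory is re-mapped to global address 0x1000
value_after = lsu.load_local(7, 2)  # re-launched group still issues "local" accesses
lsu.store_local(7, 3, 5)          # this store lands in global memory
```

The thread group's code keeps using local-memory offsets throughout; only the LSU knows that the backing storage moved.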

4.

Invention application
COOPERATIVE THREAD ARRAY GRANULARITY CONTEXT SWITCH DURING TRAP HANDLING (under examination, published)

Publication (announcement) number: US20180052707A1

Publication (announcement) date: 2018-02-22

Application number: US15798174

Application date: 2017-10-30

Applicant: NVIDIA Corporation

Inventor: Gerald F. LUIZ , Philip Alexander CUADRA , Luke DURANT , Shirish GADRE , Robert OHANNESSIAN , Lacky V. SHAH , Nicholas WANG , Arthur Merlin DANSKIN

IPC: G06F9/46 , G06F9/48

CPC classification number: G06F9/461 , G06F9/4812 , G06F9/485

Abstract: Techniques are provided for restoring threads within a processing core. The techniques include, for a first thread group included in a plurality of thread groups, executing a context restore routine to restore from a memory a first portion of a context associated with the first thread group, determining whether the first thread group completed an assigned function, and, if the first thread group completed the assigned function, then exiting the context restore routine, or if the first thread group did not complete the assigned function, then executing one or more operations associated with a trap handler routine.

5.

Invention publication
Cooperative Group Arrays (under examination, published)

Publication (announcement) number: US20230289215A1

Publication (announcement) date: 2023-09-14

Application number: US17691621

Application date: 2022-03-10

Applicant: NVIDIA Corporation

Inventor: Greg PALMER , Gentaro HIROTA , Ronny KRASHINSKY , Ze LONG , Brian PHARRIS , Rajballav DASH , Jeff TUCKEY , Jerome F. DULUK, JR. , Lacky SHAH , Luke DURANT , Jack CHOQUETTE , Eric WERNESS , Naman GOVIL , Manan PATEL , Shayani DEB , Sandeep NAVADA , John EDMONDSON , Prakash BANGALORE PRABHAKAR , Wish GANDHI , Ravi MANYAM , Apoorv PARLE , Olivier GIROUX , Shirish GADRE , Steve HEINRICH

IPC: G06F9/48 , G06F9/38 , G06F9/30 , G06F9/54

CPC classification number: G06F9/4881 , G06F9/3851 , G06F9/3009 , G06F9/544

Abstract: A new level of hierarchy, Cooperative Group Arrays (CGAs), and an associated new hardware-based work distribution/execution model are described. A CGA is a grid of thread blocks (also referred to as cooperative thread arrays (CTAs)). CGAs provide co-scheduling, e.g., control over where CTAs are placed/executed in a processor (such as a GPU), relative to the memory required by an application and relative to each other. Hardware support for such CGAs guarantees concurrency and enables applications to see more data locality, reduced latency, and better synchronization between all the threads in tightly cooperating collections of CTAs programmably distributed across different (e.g., hierarchical) hardware domains or partitions.
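
The co-scheduling guarantee in this abstract can be illustrated with a simple placement routine. This is a sketch under assumptions, not the hardware work distributor: the function `place_cgas`, the idea of counting free CTA slots per hardware partition, and the first-fit policy are all hypothetical. The one property it models is that a CGA launches only when all of its CTAs can run concurrently in the same partition.

```python
def place_cgas(num_cgas, ctas_per_cga, free_slots):
    """Assign each CGA to a hardware partition, all CTAs together or not at all.

    free_slots holds the number of free CTA slots in each partition (assumed model).
    Returns a dict mapping CGA index -> partition index; CGAs that do not fit
    anywhere are left out and would wait for slots to free up.
    """
    free = list(free_slots)  # work on a copy so the caller's list is untouched
    placement = {}
    for cga in range(num_cgas):
        for partition, slots in enumerate(free):
            if slots >= ctas_per_cga:
                # Reserve room for every CTA of the CGA at once (co-scheduling).
                free[partition] -= ctas_per_cga
                placement[cga] = partition
                break
    return placement


# Three CGAs of four CTAs each, two partitions with six and four free slots.
placement = place_cgas(3, 4, [6, 4])
```

Here the first CGA takes partition 0, the second no longer fits there and goes to partition 1, and the third is not launched at all rather than being split across partitions.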

6.

Invention application
SYSTEM AND METHOD FOR RUNTIME SCHEDULING OF GPU TASKS (granted)

Publication (announcement) number: US20140259016A1

Publication (announcement) date: 2014-09-11

Application number: US13787660

Application date: 2013-03-06

Applicant: NVIDIA CORPORATION

Inventor: Timothy Paul LOTTES , Daniel WEXLER , Craig DUTTWEILER , Sean TREICHLER , Luke DURANT , Philip CUADRA

IPC: G06F9/54

CPC classification number: G06F9/4881 , G06F2209/484

Abstract: A method for scheduling work for processing by a GPU is disclosed. The method includes accessing a work completion data structure and accessing a work tracking data structure. Dependency logic analysis is then performed using work completion data and work tracking data. Work items that have dependencies are then launched into the GPU by using a software work item launch interface.
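
The dependency analysis in this abstract can be sketched with two plain data structures. This is a minimal illustration, not the claimed method: the names `ready_items` and `schedule`, the use of a dict for work tracking and a set for work completion, and the reading that an item is launched once all of its dependencies have completed are assumptions.

```python
def ready_items(tracking, completed):
    """Return unlaunched work items whose dependencies are all in the completion set."""
    return [
        item
        for item, deps in tracking.items()
        if item not in completed and deps <= completed
    ]


def schedule(tracking, launch):
    """Repeatedly analyse dependencies and launch whatever has become ready."""
    completed = set()   # stands in for the work completion data structure
    launch_order = []
    while len(completed) < len(tracking):
        ready = ready_items(tracking, completed)
        if not ready:
            raise ValueError("unsatisfiable or cyclic dependencies")
        for item in ready:
            launch(item)         # stands in for the software launch interface
            launch_order.append(item)
            completed.add(item)  # assumption: the item completes right after launch
    return launch_order


# Hypothetical work tracking data: item -> set of items it depends on.
tracking = {
    "upload": set(),
    "blur": {"upload"},
    "edges": {"upload"},
    "compose": {"blur", "edges"},
}
launched = []
launch_order = schedule(tracking, launched.append)
```

A real runtime would mark items complete asynchronously as the GPU finishes them; the synchronous loop here only keeps the example short.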

7.

Invention application
COOPERATIVE THREAD ARRAY GRANULARITY CONTEXT SWITCH DURING TRAP HANDLING (under examination, published)

Publication (announcement) number: US20140189329A1

Publication (announcement) date: 2014-07-03

Application number: US13728784

Application date: 2012-12-27

Applicant: NVIDIA CORPORATION

Inventor: Gerald F. LUIZ , Philip Alexander CUADRA , Luke DURANT , Shirish GADRE , Robert OHANNESSIAN , Lacky V. SHAH , Nicholas WANG , Arthur DANSKIN

IPC: G06F9/38

CPC classification number: G06F9/3851 , G06F9/3861 , G06F9/3887 , G06F9/4812

Abstract: Techniques are provided for handling a trap encountered in a thread that is part of a thread array that is being executed in a plurality of execution units. In these techniques, a data structure with an identifier associated with the thread is updated to indicate that the trap occurred during the execution of the thread array. Also in these techniques, the execution units execute a trap handling routine that includes a context switch. The execution units perform this context switch for at least one of the execution units as part of the trap handling routine while allowing the remaining execution units to exit the trap handling routine before the context switch. One advantage of the disclosed techniques is that the trap handling routine operates efficiently in parallel processors.
