Patent search ap:("NVIDIA CORPORATION") AND inv:"Stephen Jones" Page 1

1.

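Invention application
TECHNIQUE FOR COMPUTATIONAL NESTED PARALLELISM [In force]

Publication (announcement) No.: US20210349763A1

Publication (announcement) date: 2021-11-11

Application No.: US17169283

Application date: 2021-02-05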

Applicant: NVIDIA Corporation

Inventor: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Jr., Christopher Lamb

IPC: G06F9/50, G06T1/20, G06F9/52

Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.

2.

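Granted invention patent
Persistent scratchpad memory for data exchange between programs [In force]

Publication (announcement) No.: US10725837B1

Publication (announcement) date: 2020-07-28

Application No.: US16677503

Application date: 2019-11-07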

Applicant: NVIDIA Corporation

Inventor: Rajballav Dash, Jack H. Choquette, Ming Liang Milton Lei, Stephen Jones, Christopher Frederick Lamb

IPC: G06F3/00, G06F9/54, G06F9/52, G06F9/50, G06F9/48

Abstract: Techniques are disclosed for sharing of data exchange among kernels (each a set of instructions) executing on a system having multiple processing units. In an embodiment, each processing unit includes an on-chip scratchpad memory that can be accessed by the kernels executing on the processing unit. All or a portion of the scratchpad memory can be allocated and configured, for example, such that the scratchpad is accessible to multiple kernels in parallel, to one or more kernels in serial, or a combination of both.

3.

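Invention application
TECHNIQUE FOR COMPUTATIONAL NESTED PARALLELISM [Under examination, published]

Publication (announcement) No.: US20200151016A1

Publication (announcement) date: 2020-05-14

Application No.: US16746714

Application date: 2020-01-17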

Applicant: NVIDIA Corporation

Inventor: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Jr., Christopher Lamb

IPC: G06F9/50, G06F9/52, G06T1/20

Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.

4.

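Granted invention patent
Software development environment and method of compiling integrated source code [In force]

Publication (announcement) No.: US09971576B2

Publication (announcement) date: 2018-05-15

Application No.: US14085649

Application date: 2013-11-20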

Applicant: Nvidia Corporation

Inventor: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts

IPC: G06F8/30, G06F8/41, G06F9/45, G06F9/44

CPC classification number: G06F8/41, G06F8/30

Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.

5.

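Invention application
UNIFIED MEMORY SYSTEMS AND METHODS [Under examination, published]

Publication (announcement) No.: US20180018750A1

Publication (announcement) date: 2018-01-18

Application No.: US15709397

Application date: 2017-09-19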

Applicant: NVIDIA Corporation

Inventor: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard

IPC: G06T1/20, G06F9/50, G06T1/60

Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.

6.

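Invention publication
GRAPH MODIFICATION [Under examination, published]

Publication (announcement) No.: US20240168799A1

Publication (announcement) date: 2024-05-23

Application No.: US17991657

Application date: 2022-11-21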

Applicant: NVIDIA Corporation

Inventor: David Fontaine, Houston Thompson Hoffman, Arslan Zulfiqar, Stephen Jones, James Dinan, Jiri Johannes Kraus

IPC: G06F9/48, G06F8/41

CPC classification number: G06F9/4881, G06F8/433

Abstract: Apparatuses, systems, and techniques to modify graphs. In at least one embodiment, a processor comprises one or more circuits to modify an execution order of at least one graph portion.

7.

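Invention application
SYSTEM AND METHOD OF CONTROLLING CACHE MEMORY RESIDENCY [In force]

Publication (announcement) No.: US20220365882A1

Publication (announcement) date: 2022-11-17

Application No.: US17395255

Application date: 2021-08-05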

Applicant: NVIDIA Corporation

Inventor: Harold Carter Edwards, Luke David Durant, Stephen Jones, Jack H. Choquette, Ronny Krashinsky, Dmitri Vainbrand, Olivier Giroux, Olivier Francois Joseph Harel, Shirish Gadre, Ze Long, Matthieu Tardy, David Dastous St Hilaire, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Jaewook Shin, Jayashree Venkatesh, Girish Bhaskar Bharambe

IPC: G06F12/0895

Abstract: Apparatuses, systems, and techniques to control operation of a memory cache. In at least one embodiment, cache guidance is specified within application source code by associating guidance with declaration of a memory block, and then applying specified guidance to source code statements that access said memory block.

8.

Invention application
SOFTWARE DEVELOPMENT ENVIRONMENT AND METHOD OF COMPILING INTEGRATED SOURCE CODE [In force]

Publication (announcement) No.: US20150143347A1

Publication (announcement) date: 2015-05-21

Application No.: US14085649

Application date: 2013-11-20

Applicant: NVIDIA CORPORATION

Inventor: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts

IPC: G06F9/45

CPC classification number: G06F8/41, G06F8/30

Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.


9.

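Granted invention patent
Pre-fetching task descriptors of dependent tasks [In force]

Publication (announcement) No.: US11182207B2

Publication (announcement) date: 2021-11-23

Application No.: US16450508

Application date: 2019-06-24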

Applicant: NVIDIA CORPORATION

Inventor: Gentaro Hirota, Brian Pharris, Jeff Tuckey, Robert Overman, Stephen Jones

IPC: G06F9/48, G06F9/52

Abstract: Techniques are disclosed for reducing the latency between the completion of a producer task and the launch of a consumer task dependent on the producer task. Such latency exists when the information needed to launch the consumer task is unavailable when the producer task completes. Thus, various techniques are disclosed, where a task management unit initiates the retrieval of the information needed to launch the consumer task from memory in parallel with the producer task being launched. Because the retrieval of such information is initiated in parallel with the launch of the producer task, the information is often available when the producer task completes, thus allowing for the consumer task to be launched without delay. The disclosed techniques, therefore, enable the latency between completing the producer task and launching the consumer task to be reduced.

10.

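Granted invention patent
Technique for computational nested parallelism [In force]

Publication (announcement) No.: US10915364B2

Publication (announcement) date: 2021-02-09

Application No.: US15368434

Application date: 2016-12-02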

Applicant: Nvidia Corporation

Inventor: Stephen Jones, Philip Alexander Cuadra, Daniel Elliot Wexler, Ignacio Llamas, Lacky V. Shah, Jerome F. Duluk, Christopher Lamb

IPC: G06F9/46, G06F9/50, G06T1/20, G06F9/52

Abstract: Apparatuses, systems, and techniques for performing nested kernel execution within a parallel processing subsystem. In at least one embodiment, a parent thread launches a nested child grid on the parallel processing subsystem, and enables the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid.
