Patent search: ap:("NVIDIA Corporation") AND inv:"Brett W. Coon"

1.

Invention patent (granted)
Programmable graphics processor for multithreaded execution of programs (in force)

Publication number: US09659339B2

Publication date: 2017-05-23

Application number: US13850175

Application date: 2013-03-25

Applicant: NVIDIA Corporation

Inventor: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach

IPC: G06T1/20, G06F9/38

CPC classification number: G06T1/20, G06F9/38, G06F9/3851

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

2.

Invention patent (granted)
Cooperative thread array reduction and scan operations (in force)

Publication number: US09417875B2

Publication date: 2016-08-16

Application number: US14025482

Application date: 2013-09-12

Applicant: NVIDIA Corporation

Inventor: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland

IPC: G06F9/30, G06F15/00, G06F9/38, G06F9/52

CPC classification number: G06F9/522, G06F8/458, G06F9/3004, G06F9/30087, G06F9/30145, G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.
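
The behavior described in this abstract can be modeled in software. The following is a minimal Python sketch, not the patented hardware instruction: the class name `BarrierSum`, the method `arrive_and_wait`, and the helper `run_demo` are hypothetical, addition is assumed as the aggregation operator, and the object is single-use. Each thread receives its scan value (the running sum at the moment it arrives) and, after every thread has arrived, the full reduction.

```python
import threading


class BarrierSum:
    """Single-use software model of a barrier that also sums per-thread values.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, num_threads):
        self._barrier = threading.Barrier(num_threads)
        self._lock = threading.Lock()
        self._running = 0

    def arrive_and_wait(self, value):
        # Scan: the exclusive prefix is known as soon as this thread arrives.
        with self._lock:
            prefix = self._running
            self._running += value
        # Barrier: block until every participating thread has arrived.
        self._barrier.wait()
        # Reduction: the total is complete only after all threads have arrived.
        return prefix, self._running


def run_demo(values):
    """Run one thread per value and collect (scan, reduction) for each thread."""
    agg = BarrierSum(len(values))
    results = [None] * len(values)

    def worker(i):
        results[i] = agg.arrive_and_wait(values[i])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(values))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this model the scan values follow arrival order rather than thread index, so they can differ between runs, while the reduction is the same for every thread.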

3.

Invention patent application
Instructions for managing a parallel cache hierarchy (under examination, published)

Publication number: US20170235581A1

Publication date: 2017-08-17

Application number: US15583258

Application date: 2017-05-01

Applicant: NVIDIA Corporation

Inventor: John R. Nickolls, Brett W. Coon, Michael C. Shebanow

IPC: G06F9/38, G06F9/30, G06F12/0897, G06F12/0875

CPC classification number: G06F9/3887, G06F9/30043, G06F9/3009, G06F9/3836, G06F12/0811, G06F12/0862, G06F12/0871, G06F12/0875, G06F12/0897, G06F12/121, G06F2212/452

Abstract: A technique for managing a parallel cache hierarchy that includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.
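
As a rough illustration of a load instruction whose modifier selects a caching policy, the Python sketch below models a two-level hierarchy. Everything in it is an assumption made for illustration: the `Hierarchy` class, the level names `L1` and `L2`, and the modifier names `ca`, `cg`, and `cv` (loosely borrowed from PTX-style cache operators) do not come from the abstract.

```python
from dataclasses import dataclass, field

# Hypothetical policy table: which cache levels may keep a copy of loaded data.
POLICY_LEVELS = {
    "ca": ("L1", "L2"),  # cache at all levels
    "cg": ("L2",),       # cache at the shared level only, bypassing L1
    "cv": (),            # do not cache; always read backing memory
}


@dataclass
class Hierarchy:
    memory: dict
    caches: dict = field(default_factory=lambda: {"L1": {}, "L2": {}})

    def load(self, address, modifier="ca"):
        """Execute a load and cache the data as the modifier's policy dictates."""
        allowed = POLICY_LEVELS[modifier]
        # Look for a hit only in the levels this policy is allowed to use.
        for level in allowed:
            if address in self.caches[level]:
                value = self.caches[level][address]
                break
        else:
            value = self.memory[address]
        # Fill every level permitted by the policy.
        for level in allowed:
            self.caches[level][address] = value
        return value
```

A store path would apply the same table in the other direction; it is omitted here to keep the sketch short.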

4.

Invention patent application
Cooperative thread array reduction and scan operations (in force)

Publication number: US20160357560A1

Publication date: 2016-12-08

Application number: US15238428

Application date: 2016-08-16

Applicant: NVIDIA Corporation

Inventor: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland

IPC: G06F9/30, G06F9/38

CPC classification number: G06F9/522, G06F8/458, G06F9/3004, G06F9/30087, G06F9/30145, G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.

5.

Invention patent application
Indirect function call instructions in a synchronous parallel thread processor (in force)

Publication number: US20130138926A1

Publication date: 2013-05-30

Application number: US13674890

Application date: 2012-11-12

Applicant: NVIDIA Corporation

Inventor: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm

IPC: G06F9/38

CPC classification number: G06F9/38, G06F9/30054, G06F9/30101, G06F9/3851, G06F9/3885

Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.
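
The contrast the abstract draws between an indirect branch and a sequential chain of tests and branches can be sketched in ordinary software. The Python example below is only an analogy, not the SIMT instruction itself: the handler functions, the `HANDLERS` table, and both dispatch functions are hypothetical, and indexing a table of functions stands in for branching through an address register.

```python
def op_add(a, b):
    return a + b


def op_sub(a, b):
    return a - b


def op_mul(a, b):
    return a * b


def dispatch_chain(opcode, a, b):
    """Sequential chain of tests and branches: cost grows with the number of cases."""
    if opcode == 0:
        return op_add(a, b)
    elif opcode == 1:
        return op_sub(a, b)
    elif opcode == 2:
        return op_mul(a, b)
    raise ValueError(f"unknown opcode {opcode}")


# Table of call targets; the index plays the role of the address register.
HANDLERS = [op_add, op_sub, op_mul]


def dispatch_indirect(opcode, a, b):
    """Indirect call: one table lookup, then a jump to the selected target."""
    return HANDLERS[opcode](a, b)
```

Both functions return the same result for every valid opcode; only the way the target is selected differs.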

6.

Invention patent (granted)
Programmable graphics processor for multithreaded execution of programs (in force)

Publication number: US10217184B2

Publication date: 2019-02-26

Application number: US15603294

Application date: 2017-05-23

Applicant: NVIDIA Corporation

Inventor: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach

IPC: G06T1/20, G06F9/38

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.

7.

Invention patent (granted)
Instructions for managing a parallel cache hierarchy (in force)

Publication number: US10365930B2

Publication date: 2019-07-30

Application number: US15583258

Application date: 2017-05-01

Applicant: NVIDIA Corporation

Inventor: John R. Nickolls, Brett W. Coon, Michael C. Shebanow

IPC: G06F12/00, G06F13/00, G06F13/28, G06F9/38, G06F12/0811, G06F12/0862, G06F12/121, G06F9/30, G06F12/0875, G06F12/0897, G06F12/0871

Abstract: A technique for managing a parallel cache hierarchy that includes receiving an instruction from a scheduler unit, where the instruction comprises a load instruction or a store instruction; determining that the instruction includes a cache operations modifier that identifies a policy for caching data associated with the instruction at one or more levels of the parallel cache hierarchy; and executing the instruction and caching the data associated with the instruction based on the cache operations modifier.

8.

Invention patent (granted)
Cooperative thread array reduction and scan operations (in force)

Publication number: US09830197B2

Publication date: 2017-11-28

Application number: US15238428

Application date: 2016-08-16

Applicant: NVIDIA Corporation

Inventor: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland

IPC: G06F9/30, G06F9/52, G06F9/38, G06F9/45

CPC classification number: G06F9/522, G06F8/458, G06F9/3004, G06F9/30087, G06F9/30145, G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.

9.

Invention patent (granted)
Indirect function call instructions in a synchronous parallel thread processor (in force)

Publication number: US09639365B2

Publication date: 2017-05-02

Application number: US13674890

Application date: 2012-11-12

Applicant: NVIDIA Corporation

Inventor: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm

IPC: G06F15/00, G06F7/38, G06F9/00, G06F9/44, G06F9/38, G06F9/30

CPC classification number: G06F9/38, G06F9/30054, G06F9/30101, G06F9/3851, G06F9/3885

Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.
