Patent search: ap:("John Erik Lindholm" OR "Brett W. Coon" OR "Gary M. Tarolli") AND inv:"John Erik Lindholm" (page 1)

1.

Granted invention patent
Subdividing a shader program (in force)

Publication number: US08159496B1

Publication date: 2012-04-17

Application number: US12476137

Filing date: 2009-06-01

Applicants: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli

Inventors: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli

IPC classification: G06T1/20, G06T1/00, G06F15/80, G09G5/00

CPC classification: G06T1/60, G06F8/4442, G06F9/3834, G06F9/3851, G06F9/3885

Abstract: Methods and apparatus for subdividing a shader program into regions or “phases” of instructions identifiable by phase identifiers (IDs) inserted into the shader program are provided. The phase IDs may be used to constrain execution of the shader program to prohibit texture fetches in later phases from being executed before a texture fetch in a current phase has completed. Other operations (e.g., math operations) within the current phase, however, may be allowed to execute while waiting for the current phase texture fetch to complete.
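
The phase constraint in this abstract can be illustrated with a minimal Python sketch. The `Instr` record, the `"tex"`/`"math"` encoding, the `can_issue` helper, and the set of phases with outstanding fetches are all hypothetical names introduced for illustration; the patent's actual hardware mechanism is not reproduced here, and data dependencies are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    op: str     # "tex" for a texture fetch, "math" otherwise (hypothetical encoding)
    phase: int  # phase ID attached to the instruction


def can_issue(instr, pending_tex_phases):
    """Check only the phase rule for one instruction.

    pending_tex_phases: phase IDs that still have an unfinished texture fetch.
    Assumed rule: a texture fetch is held back while any fetch from an
    earlier phase is outstanding; non-texture ops pass this check.
    """
    if instr.op != "tex":
        return True
    return all(p >= instr.phase for p in pending_tex_phases)


pending = {0}  # a phase-0 texture fetch is still in flight
print(can_issue(Instr("math", 0), pending))  # math in the current phase may proceed
print(can_issue(Instr("tex", 1), pending))   # later-phase fetch is held back
```

Under these assumptions the math instruction passes the check while the phase-1 fetch waits until the pending set no longer contains phase 0.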

2.

Granted invention patent
Scheduler in multi-threaded processor prioritizing instructions passing qualification rule (in force)

Publication number: US07949855B1

Publication date: 2011-05-24

Application number: US12110942

Filing date: 2008-04-28

Applicants: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess

Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess

IPC classification: G06F9/38

CPC classification: G06F9/3851, G06F9/3838, G06F9/3853, G06F9/3867, G06F9/3885

Abstract: A processor buffers asynchronous threads. Instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one computation operation and at least one memory access operation. Instructions within each phase are qualified and prioritized. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the current instructions. The instructions may also be qualified based on an age of each instruction, status of the execution units, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.
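
The qualify-then-prioritize flow in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented scheduler: the `Pending` record, the `select_for_issue` function, the oldest-first ordering, and the limit of one instruction per execution unit per cycle are all hypothetical choices, and the other qualification criteria the abstract lists (divergence potential, locality, thread diversity, resource requirements) are left out.

```python
from dataclasses import dataclass


@dataclass
class Pending:
    thread: int  # thread the buffered instruction belongs to
    unit: str    # execution unit it needs, e.g. "alu" or "tex" (hypothetical names)
    age: int     # cycles spent waiting in the buffer


def select_for_issue(buffer, busy_units, max_issue=2):
    """Pick up to max_issue buffered instructions for this cycle."""
    # Qualification: keep only instructions whose execution unit is free.
    qualified = [i for i in buffer if i.unit not in busy_units]
    # Prioritization (assumed policy): oldest instruction first.
    qualified.sort(key=lambda i: -i.age)
    issued, claimed = [], set()
    for instr in qualified:
        if instr.unit in claimed:
            continue  # assumed: one instruction per unit per cycle
        issued.append(instr)
        claimed.add(instr.unit)
        if len(issued) == max_issue:
            break
    return issued


buffer = [Pending(0, "alu", 3), Pending(1, "tex", 5),
          Pending(2, "alu", 7), Pending(3, "sfu", 1)]
picked = select_for_issue(buffer, busy_units={"tex"})
print([p.thread for p in picked])
```

With the texture unit busy, thread 1 is disqualified; the oldest ALU instruction is issued first and the second slot goes to an instruction that needs a different unit.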

3.

Granted invention patent
Subdividing a shader program (in force)

Publication number: US07542043B1

Publication date: 2009-06-02

Application number: US11136346

Filing date: 2005-05-23

Applicants: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli

Inventors: John Erik Lindholm, Brett W. Coon, Gary M. Tarolli

IPC classification: G06T1/20, G06F15/80

CPC classification: G06T1/60, G06F8/4442, G06F9/3834, G06F9/3851, G06F9/3885

Abstract: Methods and apparatus for subdividing a shader program into regions or “phases” of instructions identifiable by phase identifiers (IDs) inserted into the shader program are provided. The phase IDs may be used to constrain execution of the shader program to prohibit texture fetches in later phases from being executed before a texture fetch in a current phase has completed. Other operations (e.g., math operations) within the current phase, however, may be allowed to execute while waiting for the current phase texture fetch to complete.

4.

Granted invention patent
Scheduling instructions from multi-thread instruction buffer based on phase boundary qualifying rule for phases of math and data access operations with better caching (in force)

Publication number: US07366878B1

Publication date: 2008-04-29

Application number: US11404196

Filing date: 2006-04-13

Applicants: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess

Inventors: Peter C. Mills, John Erik Lindholm, Brett W. Coon, Gary M. Tarolli, John Matthew Burgess

IPC classification: G06F9/50

CPC classification: G06F9/3851, G06F9/3838, G06F9/3853, G06F9/3867, G06F9/3885

Abstract: A processor buffers asynchronous threads. Current instructions requiring operations provided by a plurality of execution units are divided into phases, each phase having at least one math operation and at least one texture cache access operation. Instructions within each phase are qualified and prioritized, with texture cache access operations in a subsequent phase not qualified until all of the texture cache access operations in a current phase have completed. The instructions may be qualified based on the status of the execution unit needed to execute one or more of the instructions. The instructions may also be qualified based on an age of each instruction, a divergence potential, locality, thread diversity, and resource requirements. Qualified instructions may be prioritized based on execution units needed to execute current instructions and the execution units in use. One or more of the prioritized instructions is issued per cycle to the plurality of execution units.

5.

Granted invention patent
Thread group scheduler for computing on a parallel thread processor (in force)

Publication number: US08732713B2

Publication date: 2014-05-20

Application number: US13247819

Filing date: 2011-09-28

Applicants: Brett W. Coon, John Erik Lindholm, Robert J. Stoll, Nicholas Wang, Jack Hilaire Choquette, Kathleen Elliott Nickolls

Inventors: Brett W. Coon, John R. Nickolls, John Erik Lindholm, Robert J. Stoll, Nicholas Wang, Jack Hilaire Choquette

IPC classification: G06F9/46

CPC classification: G06F9/4881, G06F2209/483

Abstract: A parallel thread processor executes thread groups belonging to multiple cooperative thread arrays (CTAs). At each cycle of the parallel thread processor, an instruction scheduler selects a thread group to be issued for execution during a subsequent cycle. The instruction scheduler selects a thread group to issue for execution by (i) identifying a pool of available thread groups, (ii) identifying a CTA that has the greatest seniority value, and (iii) selecting the thread group that has the greatest credit value from within the CTA with the greatest seniority value.
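
The three selection steps (i) to (iii) in this abstract map onto a short Python sketch. The `ThreadGroup` record, the `cta_seniority` dictionary, and the `select_thread_group` function are hypothetical names. Two further assumptions are made because the abstract does not settle them: the most senior CTA is chosen among CTAs that currently have an available thread group, and ties are broken by the lowest CTA number and the first matching thread group.

```python
from dataclasses import dataclass


@dataclass
class ThreadGroup:
    gid: int      # thread group identifier
    cta: int      # CTA the group belongs to
    credit: int   # credit value of the group
    ready: bool   # whether the group is available to issue


def select_thread_group(groups, cta_seniority):
    """Return the thread group to issue next, or None if none is available."""
    # (i) identify the pool of available thread groups
    pool = [g for g in groups if g.ready]
    if not pool:
        return None
    # (ii) identify the CTA with the greatest seniority value
    #      (assumed: only CTAs represented in the pool are considered)
    senior_cta = max(sorted({g.cta for g in pool}),
                     key=lambda c: cta_seniority[c])
    # (iii) within that CTA, select the group with the greatest credit value
    candidates = [g for g in pool if g.cta == senior_cta]
    return max(candidates, key=lambda g: g.credit)


groups = [ThreadGroup(0, 0, 5, False), ThreadGroup(1, 0, 2, True),
          ThreadGroup(2, 1, 9, True), ThreadGroup(3, 0, 4, True)]
chosen = select_thread_group(groups, cta_seniority={0: 10, 1: 3})
print(chosen.gid)
```

In this example group 2 has the highest credit overall, but it belongs to the less senior CTA, so the scheduler picks the best available group from CTA 0 instead.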

6.

Granted invention patent
Processing an indirect branch instruction in a SIMD architecture (in force)

Publication number: US07761697B1

Publication date: 2010-07-20

Application number: US11557082

Filing date: 2006-11-06

Applicants: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls

Inventors: Brett W. Coon, John Erik Lindholm, Peter C. Mills, John R. Nickolls

IPC classification: G06F7/38, G06F9/00, G06F9/44

CPC classification: G06F9/30072, G06F9/3009, G06F9/30185, G06F9/322, G06F9/3851, G06F9/3887

Abstract: One embodiment of a computing system configured to manage divergent threads in a thread group includes a stack configured to store at least one token and a multithreaded processing unit. The multithreaded processing unit is configured to perform the steps of fetching a program instruction, determining that the program instruction is an indirect branch instruction, and processing the indirect branch instruction as a sequence of two-way branches to execute an indirect branch instruction with multiple branch addresses. Indirect branch instructions may be used to allow greater flexibility since the branch address or multiple branch addresses do not need to be determined at compile time.
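
One way to picture an indirect branch handled as a sequence of two-way branches is the Python sketch below. It is a software illustration only: the `split_indirect_branch` function, the per-lane `targets` list, and the choice of the first remaining active lane as the leader are assumptions, and the token stack mentioned in the abstract is not modeled.

```python
def split_indirect_branch(targets, active_mask):
    """Resolve a per-lane indirect branch as repeated two-way splits.

    targets: branch target held by each lane (e.g. read from a register).
    active_mask: list of booleans, True for lanes executing the branch.
    Returns (target, lane_mask) pairs in the order the splits occur.
    """
    remaining = list(active_mask)
    groups = []
    while any(remaining):
        # Assumed policy: the first remaining lane supplies the target.
        lead = remaining.index(True)
        target = targets[lead]
        # Two-way decision: lanes matching the target take the branch,
        # the rest fall through to the next split.
        taken = [r and t == target for r, t in zip(remaining, targets)]
        groups.append((target, taken))
        remaining = [r and not k for r, k in zip(remaining, taken)]
    return groups


for target, mask in split_indirect_branch([0x10, 0x20, 0x10, 0x30],
                                          [True, True, True, False]):
    print(hex(target), mask)
```

Each loop iteration corresponds to one two-way branch, so a thread group with N distinct targets among its active lanes resolves in N splits in this sketch.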

7.

Granted invention patent
Structured programming control flow using a disable mask in a SIMD architecture (in force)

Publication number: US07617384B1

Publication date: 2009-11-10

Application number: US11669513

Filing date: 2007-01-31

Applicants: Brett W. Coon, John Erik Lindholm, Svetoslav D. Tzvetkov

Inventors: Brett W. Coon, John Erik Lindholm, Svetoslav D. Tzvetkov

IPC classification: G06F15/80

CPC classification: G06F9/3851, G06F9/30072, G06F9/3885, G06F9/3887

Abstract: One embodiment of a computing system configured to manage divergent threads in a SIMD thread group includes a stack configured to store state information for processing control instructions. A parallel processing unit is configured to perform the steps of determining if one or more threads diverge during execution of a conditional control instruction. Threads that exit a program are identified as idle by a disable mask. Other threads that are disabled may be enabled once the divergent threads reach an instruction that enables the disabled threads. Use of the disable mask allows for the use of conditional return and break instructions in a multithreaded SIMD architecture.
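
The role of a disable mask alongside an active mask can be sketched in Python as follows. The `SimdGroup` class and its method names are hypothetical, masks are plain integers with one bit per lane, and only a conditional return is modeled; the re-enabling of lanes disabled by a break, which the abstract also mentions, is left out.

```python
class SimdGroup:
    """Toy model of one SIMD thread group with active and disable masks."""

    def __init__(self, width):
        self.active = (1 << width) - 1  # all lanes start active
        self.disabled = 0               # lanes that have exited the program
        self.stack = []                 # saved active masks for reconvergence

    def effective(self):
        # A lane executes only if it is active and not disabled.
        return self.active & ~self.disabled

    def branch(self, cond_mask):
        # Save the current active mask, then run only lanes where cond holds.
        self.stack.append(self.active)
        self.active &= cond_mask

    def conditional_return(self, cond_mask):
        # Executing lanes that satisfy the condition exit and stay idle.
        self.disabled |= self.effective() & cond_mask

    def reconverge(self):
        # Restore the saved active mask; disabled lanes remain masked off.
        self.active = self.stack.pop()


g = SimdGroup(4)
g.branch(0b0011)              # lanes 0 and 1 enter the branch
g.conditional_return(0b0001)  # lane 0 returns inside the branch
g.reconverge()
print(bin(g.effective()))     # lane 0 stays idle after reconvergence
```

Keeping the exit state in a separate mask means that restoring the saved active mask at reconvergence does not accidentally revive a lane that already returned, which is the property this sketch is meant to show.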

8.

Invention patent application
Indirect Function Call Instructions in a Synchronous Parallel Thread Processor (in force)

Publication number: US20090240931A1

Publication date: 2009-09-24

Application number: US12054255

Filing date: 2008-03-24

Applicants: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm

Inventors: Brett W. Coon, John R. Nickolls, Lars Nyland, Peter C. Mills, John Erik Lindholm

IPC classification: G06F9/38

CPC classification: G06F9/38, G06F9/30054, G06F9/30101, G06F9/3851, G06F9/3885

Abstract: An indirect branch instruction takes an address register as an argument in order to provide indirect function call capability for single-instruction multiple-thread (SIMT) processor architectures. The indirect branch instruction is used to implement indirect function calls, virtual function calls, and switch statements to improve processing performance compared with using sequential chains of tests and branches.

9.

Granted invention patent
Credit-based streaming multiprocessor warp scheduling (in force)

Publication number: US09189242B2

Publication date: 2015-11-17

Application number: US12885299

Filing date: 2010-09-17

Applicants: John Erik Lindholm, Brett W. Coon, Jered Wierzbicki, Robert J. Stoll, Stuart F. Oberman

Inventors: John Erik Lindholm, Brett W. Coon, Jered Wierzbicki, Robert J. Stoll, Stuart F. Oberman

IPC classification: G06F9/50, G06F9/38

CPC classification: G06F9/3851, G06F9/3836, G06F9/3885, G06F9/3887, G06F9/3889

Abstract: One embodiment of the present invention sets forth a technique for ensuring cache access instructions are scheduled for execution in a multi-threaded system to improve cache locality and system performance. A credit-based technique may be used to control instruction by instruction scheduling for each warp in a group so that the group of warps is processed uniformly. A credit is computed for each warp and the credit contributes to a weight for each warp. The weight is used to select instructions for the warps that are issued for execution.

10.

Granted invention patent
Programmable graphics processor for multithreaded execution of programs (in force)

Publication number: US08405665B2

Publication date: 2013-03-26

Application number: US13466043

Filing date: 2012-05-07

Applicants: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach

Inventors: John Erik Lindholm, Brett W. Coon, Stuart F. Oberman, Ming Y. Siu, Matthew P. Gerlach

IPC classification: G06F15/16, G06F15/80, G06F13/14, G06T1/20

CPC classification: G06T15/005

Abstract: A processing unit includes multiple execution pipelines, each of which is coupled to a first input section for receiving input data for pixel processing and a second input section for receiving input data for vertex processing and to a first output section for storing processed pixel data and a second output section for storing processed vertex data. The processed vertex data is rasterized and scan converted into pixel data that is used as the input data for pixel processing. The processed pixel data is output to a raster analyzer.
