Patent search: ap:("Brian Fahs" OR "Ming Y. Siu" OR "Robert Steven Glanville") AND inv:"Brian Fahs" (page 1)

1.

Patent application
Opcode-Specified Predicatable Warp Post-Synchronization (status: in force)

Publication No.: US20110078690A1

Publication date: 2011-03-31

Application No.: US12892887

Filing date: 2010-09-28

Applicants: Brian Fahs, Ming Y. Siu, Robert Steven Glanville

Inventors: Brian Fahs, Ming Y. Siu, Robert Steven Glanville

IPC classification: G06F9/46

CPC classification: G06F9/46, G06F9/30072, G06F9/30087, G06F9/30185, G06F9/3851, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for performing a method for synchronizing divergent executing threads. The method includes receiving a plurality of instructions that includes at least one set-synchronization instruction and at least one instruction that includes a synchronization command, and determining an active mask that indicates which threads in a plurality of threads are active and which threads in the plurality of threads are disabled. For each instruction included in the plurality of instructions, the instruction is transmitted to each of the active threads included in the plurality of threads. If the instruction is a set-synchronization instruction, then a synchronization token, the active mask and the synchronization point is each pushed onto a stack. Or, if the instruction is a predicated instruction that includes a synchronization command, then each active thread that executes the predicated instruction is monitored to determine when the active mask has been updated to indicate that each active thread, after executing the predicated instruction, has been disabled.

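The stack mechanism in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented hardware design: the `Warp` class, the `SYNC_TOKEN` name, and the bitmask representation are hypothetical, and the step that pops the stack and restores the saved mask once every thread is disabled is an assumption, since the abstract only says the active mask is monitored for that condition.

```python
from dataclasses import dataclass, field

SYNC_TOKEN = "SYNC"  # hypothetical token name, not taken from the patent


@dataclass
class Warp:
    active_mask: int                      # bit i set => thread i is active
    stack: list = field(default_factory=list)

    def set_sync(self, sync_point: int) -> None:
        """Set-synchronization instruction: push token, active mask, sync point."""
        self.stack.append((SYNC_TOKEN, self.active_mask, sync_point))

    def predicated_sync(self, predicate_mask: int):
        """Predicated instruction that carries a synchronization command.

        Active threads whose predicate bit is set are disabled. Returns the
        synchronization point once every active thread has been disabled,
        otherwise None.
        """
        self.active_mask &= ~predicate_mask
        if self.active_mask != 0:
            return None  # some threads are still running the divergent path
        # Assumption: with all threads disabled, pop the entry pushed by
        # set_sync, restore the saved mask, and resume at the sync point.
        token, saved_mask, sync_point = self.stack.pop()
        assert token == SYNC_TOKEN
        self.active_mask = saved_mask
        return sync_point


warp = Warp(active_mask=0b1111)
warp.set_sync(sync_point=40)
first = warp.predicated_sync(0b0011)   # threads 0 and 1 reach the sync command
second = warp.predicated_sync(0b1100)  # threads 2 and 3 follow later
```

In this run `first` is `None` because two threads are still active, and `second` is the saved synchronization point, at which moment all four threads are active again.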

2.

Granted patent
Opcode-specified predicatable warp post-synchronization (status: in force)

Publication No.: US08850436B2

Publication date: 2014-09-30

Application No.: US12892887

Filing date: 2010-09-28

Applicants: Brian Fahs, Ming Y. Siu, Robert Steven Glanville

Inventors: Brian Fahs, Ming Y. Siu, Robert Steven Glanville

IPC classification: G06F9/46, G06F9/38, G06F9/30

CPC classification: G06F9/46, G06F9/30072, G06F9/30087, G06F9/30185, G06F9/3851, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for performing a method for synchronizing divergent executing threads. The method includes receiving a plurality of instructions that includes at least one set-synchronization instruction and at least one instruction that includes a synchronization command, and determining an active mask that indicates which threads in a plurality of threads are active and which threads in the plurality of threads are disabled. For each instruction included in the plurality of instructions, the instruction is transmitted to each of the active threads included in the plurality of threads. If the instruction is a set-synchronization instruction, then a synchronization token, the active mask and the synchronization point is each pushed onto a stack. Or, if the instruction is a predicated instruction that includes a synchronization command, then each active thread that executes the predicated instruction is monitored to determine when the active mask has been updated to indicate that each active thread, after executing the predicated instruction, has been disabled.


3.

Granted patent
Cooperative thread array reduction and scan operations (status: in force)

Publication No.: US08539204B2

Publication date: 2013-09-17

Application No.: US12890227

Filing date: 2010-09-24

Applicants: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland

Inventors: Brian Fahs, Ming Y. Siu, Brett W. Coon, John R. Nickolls, Lars Nyland

IPC classification: G06F9/30, G06F9/40, G06F15/00

CPC classification: G06F9/522, G06F8/458, G06F9/3004, G06F9/30087, G06F9/30145, G06F9/3851

Abstract: One embodiment of the present invention sets forth a technique for performing aggregation operations across multiple threads that execute independently. Aggregation is specified as part of a barrier synchronization or barrier arrival instruction, where in addition to performing the barrier synchronization or arrival, the instruction aggregates (using reduction or scan operations) values supplied by each thread. When a thread executes the barrier aggregation instruction the thread contributes to a scan or reduction result, and waits to execute any more instructions until after all of the threads have executed the barrier aggregation instruction. A reduction result is communicated to each thread after all of the threads have executed the barrier aggregation instruction and a scan result is communicated to each thread as the barrier aggregation instruction is executed by the thread.

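The difference between the two result types (a scan value delivered as each thread arrives, a reduction value delivered only after all threads arrive) can be sketched as a sequential simulation. The `BarrierAggregate` class is hypothetical, and an inclusive scan in arrival order is an assumption, since the abstract does not say whether the scan is inclusive or exclusive.

```python
import operator


class BarrierAggregate:
    """Sequential model of a barrier that also aggregates per-thread values."""

    def __init__(self, num_threads, op=operator.add, identity=0):
        self.num_threads = num_threads
        self.op = op
        self.running = identity   # aggregate of all values seen so far
        self.arrived = 0

    def arrive(self, value):
        """A thread executes the barrier aggregation instruction.

        Returns this thread's scan result, available at arrival time
        (assumed inclusive, in arrival order).
        """
        if self.arrived >= self.num_threads:
            raise RuntimeError("more arrivals than participating threads")
        self.running = self.op(self.running, value)
        self.arrived += 1
        return self.running

    def reduction(self):
        """Reduction result, defined only after every thread has arrived."""
        if self.arrived < self.num_threads:
            raise RuntimeError("barrier not complete: threads still missing")
        return self.running


barrier = BarrierAggregate(num_threads=4)
scan_results = [barrier.arrive(v) for v in (3, 1, 4, 1)]
total = barrier.reduction()
```

Here `scan_results` holds one running sum per thread, while `total` is the single value that every thread would receive once the barrier completes. A real implementation would also block each thread until the last arrival, which this sequential model omits.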

4.

Granted patent
Cache operations and policies for a multi-threaded client (status: in force)

Publication No.: US09952977B2

Publication date: 2018-04-24

Application No.: US12890476

Filing date: 2010-09-24

Applicants: Steven James Heinrich, Alexander L. Minkin, Brett W. Coon, Rajeshwaran Selvanesan, Robert Steven Glanville, Charles McCarver, Anjana Rajendran, Stewart Glenn Carlton, John R. Nickolls, Brian Fahs

Inventors: Steven James Heinrich, Alexander L. Minkin, Brett W. Coon, Rajeshwaran Selvanesan, Robert Steven Glanville, Charles McCarver, Anjana Rajendran, Stewart Glenn Carlton, John R. Nickolls, Brian Fahs

IPC classification: G06F12/00, G06F12/0842, G06F12/0897

CPC classification: G06F12/0842, G06F12/0897

Abstract: A method for managing a parallel cache hierarchy in a processing unit. The method including receiving an instruction that includes a cache operations modifier that identifies a level of the parallel cache hierarchy in which to cache data associated with the instruction; and implementing a cache replacement policy based on the cache operations modifier.
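As a rough illustration of a per-instruction cache operations modifier, the sketch below models a two-level hierarchy in which the modifier selects which levels are filled and how soon the line becomes an eviction candidate. The `CacheOp` names, the `Level` class, and the LRU and evict-first behaviour are all assumptions chosen for illustration; the abstract does not specify the modifiers or the replacement policies.

```python
from collections import OrderedDict
from enum import Enum


class CacheOp(Enum):          # hypothetical modifier names
    ALL_LEVELS = "all"        # cache in L1 and L2
    L2_ONLY = "l2"            # bypass L1, cache in L2
    STREAMING = "stream"      # cache, but mark as first to evict


class Level:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # first key is the next eviction victim

    def insert(self, addr, data, evict_first=False):
        self.lines.pop(addr, None)            # re-inserting refreshes the line
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the current victim
        self.lines[addr] = data
        if evict_first:
            self.lines.move_to_end(addr, last=False)


def load(addr, op, l1, l2, memory):
    """Load addr and fill the hierarchy according to the modifier."""
    if addr in l1.lines:
        data = l1.lines[addr]
    elif addr in l2.lines:
        data = l2.lines[addr]
    else:
        data = memory[addr]
    streaming = op is CacheOp.STREAMING
    l2.insert(addr, data, evict_first=streaming)
    if op is not CacheOp.L2_ONLY:
        l1.insert(addr, data, evict_first=streaming)
    return data


memory = {0: "a", 1: "b", 2: "c", 3: "d"}
l1, l2 = Level(capacity=2), Level(capacity=4)
load(0, CacheOp.ALL_LEVELS, l1, l2, memory)
load(1, CacheOp.L2_ONLY, l1, l2, memory)
load(2, CacheOp.STREAMING, l1, l2, memory)
load(3, CacheOp.ALL_LEVELS, l1, l2, memory)
```

After these four loads, address 1 is present only in L2, and the streaming line (address 2) has been evicted from L1 ahead of the older line at address 0.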

5.

Patent application
Cache Operations and Policies For A Multi-Threaded Client (status: in force)

Publication No.: US20110078381A1

Publication date: 2011-03-31

Application No.: US12890476

Filing date: 2010-09-24

Applicants: Steven James Heinrich, Alexander L. Minkin, Brett W. Coon, Rajeshwaran Selvanesan, Robert Steven Glanville, Charles McCarver, Anjana Rajendran, Stewart Glenn Carlton, John R. Nickolls, Brian Fahs

Inventors: Steven James Heinrich, Alexander L. Minkin, Brett W. Coon, Rajeshwaran Selvanesan, Robert Steven Glanville, Charles McCarver, Anjana Rajendran, Stewart Glenn Carlton, John R. Nickolls, Brian Fahs

IPC classification: G06F12/08, G06F12/00

CPC classification: G06F12/0842, G06F12/0897

Abstract: A method for managing a parallel cache hierarchy in a processing unit. The method including receiving an instruction that includes a cache operations modifier that identifies a level of the parallel cache hierarchy in which to cache data associated with the instruction; and implementing a cache replacement policy based on the cache operations modifier.


6.

Granted patent
Method and system to analyze inlined functions (status: lapsed)

Publication No.: US07360207B2

Publication date: 2008-04-15

Application No.: US10016949

Filing date: 2001-12-13

Applicants: Brian Fahs, Robert Hundt, Vinodha Ramasamy, Tara Krishnaswamy

Inventors: Brian Fahs, Robert Hundt, Vinodha Ramasamy, Tara Krishnaswamy

IPC classification: G06F9/45

CPC classification: G06F11/3466, G06F11/3476, G06F2201/865

Abstract: A method and a system for examining an inlined function using a performance analysis tool are described. An inlined function is identified in computer code. Upon identification of the inlined function, and for example in response to executing a breakpoint associated with the inlined function, a performance analysis tool is used to perform desired task on the inlined function.


7.

Granted patent
Systems and methods for voting among parallel threads (status: in force)

Publication No.: US08200947B1

Publication date: 2012-06-12

Application No.: US12054322

Filing date: 2008-03-24

Applicants: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

IPC classification: G06F9/00

CPC classification: G06F9/3851, G06F9/30087, G06F9/3009, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.

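The vote operation described in this abstract can be modelled as a function over the per-thread votes of one thread group. This is a minimal sketch: the `vote` function and its three modes (`"all"`, `"any"`, `"ballot"`) are assumptions for illustration, since the abstract does not enumerate the vote modes.

```python
def vote(predicates, mode="any"):
    """Combine one boolean vote per thread and return the result per thread.

    predicates[i] is the vote posted by thread (lane) i of the group.
    Every thread receives the same combined result.
    """
    if mode == "all":
        result = all(predicates)
    elif mode == "any":
        result = any(predicates)
    elif mode == "ballot":
        # Bit i of the result is set when thread i voted true.
        result = sum(1 << lane for lane, p in enumerate(predicates) if p)
    else:
        raise ValueError(f"unknown vote mode: {mode!r}")
    return [result] * len(predicates)


votes = [True, False, True, True]
unanimous = vote(votes, "all")
at_least_one = vote(votes, "any")
ballot = vote(votes, "ballot")
```

With these votes the group is not unanimous, at least one thread voted true, and the ballot bitmask has bits 0, 2 and 3 set. A single instruction that returns this result to every thread replaces a round trip through shared memory, which is the overhead reduction the abstract refers to.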

8.

Granted patent
Systems and methods for voting among parallel threads (status: in force)

Publication No.: US08214625B1

Publication date: 2012-07-03

Application No.: US12324645

Filing date: 2008-11-26

Applicants: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

IPC classification: G06F9/00

CPC classification: G06F9/3851, G06F9/30087, G06F9/3009, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.

9.

Patent application
Architecture and Instructions for Accessing Multi-Dimensional Formatted Surface Memory (status: in force)

Publication No.: US20110074802A1

Publication date: 2011-03-31

Application No.: US12890171

Filing date: 2010-09-24

Applicants: John R. Nickolls, Brian Fahs, Lars Nyland, John Erik Lindholm, Richard Craig Johnson

Inventors: John R. Nickolls, Brian Fahs, Lars Nyland, John Erik Lindholm, Richard Craig Johnson

IPC classification: G06F12/00

CPC classification: G06T1/60

Abstract: One embodiment of the present invention sets forth a technique for a program to access multi-dimensional formatted graphics surface memory. Multi-dimensional memory objects called “surfaces” stored in a user-specified data or pixel format and arranged in a graphics optimized layout are accessed by programs using surface instructions. A set of memory access instructions e.g., load, store, reduce, and atomic, referred to as surface instructions, may be used to access the surfaces. Coordinate bounds checking is performed with configurable clamping. Caching behavior may also be specified by the surface instructions. Data format conversion and packing to a specified storage format is supported for store, reduction, and atomic surface instructions. Data format conversion and unpacking from a specified storage format is supported for loads and atomic surface instructions.

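Two ideas from this abstract, configurable clamping of out-of-range coordinates and format conversion on store and load, can be sketched as follows. Everything here is illustrative: the `Surface2D` class, the three boundary modes (`"trap"`, `"clamp"`, `"zero"`), the 8-bit-per-channel packed RGBA storage format, and the row-major layout are assumptions, not details taken from the patent.

```python
def pack_rgba8(rgba):
    """Convert four floats in [0, 1] to one packed 32-bit word (8 bits each)."""
    word = 0
    for i, channel in enumerate(rgba):
        byte = round(min(max(channel, 0.0), 1.0) * 255)
        word |= byte << (8 * i)
    return word


def unpack_rgba8(word):
    """Inverse of pack_rgba8: one 32-bit word back to four floats."""
    return tuple(((word >> (8 * i)) & 0xFF) / 255 for i in range(4))


class Surface2D:
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.texels = [0] * (width * height)   # packed storage format

    def _resolve(self, x, y, mode):
        """Bounds check; returns an in-range (x, y), or None to skip access."""
        if 0 <= x < self.width and 0 <= y < self.height:
            return x, y
        if mode == "clamp":   # snap to the nearest edge texel
            return (min(max(x, 0), self.width - 1),
                    min(max(y, 0), self.height - 1))
        if mode == "zero":    # loads return zero, stores are dropped
            return None
        if mode == "trap":
            raise IndexError(f"surface access out of range: ({x}, {y})")
        raise ValueError(f"unknown boundary mode: {mode!r}")

    def store(self, x, y, rgba, mode="trap"):
        pos = self._resolve(x, y, mode)
        if pos is not None:
            self.texels[pos[1] * self.width + pos[0]] = pack_rgba8(rgba)

    def load(self, x, y, mode="trap"):
        pos = self._resolve(x, y, mode)
        if pos is None:
            return (0.0, 0.0, 0.0, 0.0)
        return unpack_rgba8(self.texels[pos[1] * self.width + pos[0]])


surface = Surface2D(4, 4)
surface.store(3, 3, (1.0, 0.0, 0.0, 1.0))
clamped = surface.load(10, 10, mode="clamp")   # reads the corner texel (3, 3)
zeroed = surface.load(10, 10, mode="zero")
```

The caller works with floating-point colours throughout, while the surface holds packed words. The same out-of-range coordinate yields the corner texel under clamping and zeros under the zero mode.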

10.

Patent application
SYSTEMS AND METHODS FOR VOTING AMONG PARALLEL THREADS (status: under examination, published)

Publication No.: US20120239909A1

Publication date: 2012-09-20

Application No.: US13485622

Filing date: 2012-05-31

Applicants: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

Inventors: John R. Nickolls, Lars Nyland, Peter C. Mills, Jeremy Sugerman, Timothy Foley, Brian Fahs, Michael Garland, David P. Luebke

IPC classification: G06F9/00

CPC classification: G06F9/3851, G06F9/30087, G06F9/3009, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for efficiently performing voting operations within a multi-threaded parallel-processing system. A group of related parallel program threads executes within a processor core together in parallel. A new instruction, called a “vote” instruction, is introduced that enables a parallel program thread to post an individual vote within the context of the group of related threads and to receive the result of the vote. In this fashion, the vote instruction advantageously reduces overhead associated with inter-thread communication, thereby improving overall system performance.

