Patent search: ap:("Brucek Khailany" OR "William James Dally" OR "Ujval J. Kapasi" OR "Jim Jian Lin" OR "Raghunath Rao" OR "DeForest Tovey" OR "Mark Rygh" OR "Jung-Ho Ahn") AND inv:"William James Dally" (page 1)

1.

Patent application
DATA EXCHANGE AND COMMUNICATION BETWEEN EXECUTION UNITS IN A PARALLEL PROCESSOR (in force)

Publication number: US20120011349A1

Publication date: 2012-01-12

Application number: US13237646

Filing date: 2011-09-20

Applicants: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn

Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn

IPC classification: G06F9/30

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: Disclosed are methods and systems for dynamically determining data-transfer paths. The data-transfer paths are determined in response to an instruction that facilitates data transfer among execution lanes in an integrated-circuit processing device operable to execute operations in parallel.

2.

Patent application
Data-Parallel processing unit (in force)

Publication number: US20080140994A1

Publication date: 2008-06-12

Application number: US11973887

Filing date: 2007-10-09

Applicants: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn

Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin, Raghunath Rao, DeForest Tovey, Mark Rygh, Jung-Ho Ahn

IPC classification: G06F9/30, G06F9/302

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: A method of operation within an integrated-circuit processing device having a plurality of execution lanes. Upon receiving an instruction to exchange data between the execution lanes, respective requests from the execution lanes are examined to determine a set of the execution lanes that may send data to one or more others of the execution lanes during a first interval. Each execution lane within the set of the execution lanes is signaled to indicate that the execution lane may send data to the one or others of the execution lanes.
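The request-and-grant step in this abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the claimed method: each lane is assumed to request at most one destination lane, a destination accepts one sender per interval, and conflicts are resolved in favor of the lowest lane index. The names `grant_senders` and `exchange` are hypothetical.

```python
from typing import List, Optional, Set


def grant_senders(requests: List[Optional[int]]) -> Set[int]:
    """Return the lanes allowed to send during one interval.

    requests[i] is the destination lane requested by lane i, or None.
    Assumption: one sender per destination, lowest lane index wins.
    """
    claimed: Set[int] = set()   # destinations already taken this interval
    granted: Set[int] = set()   # source lanes that receive a "may send" signal
    for src, dst in enumerate(requests):
        if dst is None or dst in claimed:
            continue
        claimed.add(dst)
        granted.add(src)
    return granted


def exchange(requests: List[Optional[int]]) -> List[List[int]]:
    """Repeat arbitration over successive intervals until every request is served."""
    pending = list(requests)
    schedule: List[List[int]] = []
    while any(dst is not None for dst in pending):
        granted = grant_senders(pending)
        schedule.append(sorted(granted))
        for src in granted:
            pending[src] = None  # request satisfied
    return schedule


if __name__ == "__main__":
    # Lanes 0 and 1 both target lane 2, so lane 1 waits for a second interval.
    print(exchange([2, 2, 0, None]))
```

A fixed-priority arbiter is chosen here only because it is the shortest to write; the abstract does not say which arbitration policy is used.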

3.

Granted patent
Data exchange and communication between execution units in a parallel processor (in force)

Publication number: US08412917B2

Publication date: 2013-04-02

Application number: US13237646

Filing date: 2011-09-20

Applicants: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

IPC classification: G06F9/00

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: Disclosed are methods and systems for dynamically determining data-transfer paths. The data-transfer paths are dynamically determined in response to an instruction that facilitates data transfer among execution lanes in an integrated-circuit processing device operable to execute operations in parallel. In addition, embodiments include an integrated-circuit processing device operable to execute operations in parallel, including the capability of providing confirmation information to potential source lanes, the confirmation information indicating whether the potential source lanes may send data to requested destination lanes during a data-transfer interval.

4.

Granted patent
Data exchange and communication between execution units in a parallel processor (in force)

Publication number: US08024553B2

Publication date: 2011-09-20

Application number: US12192813

Filing date: 2008-08-15

Applicants: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

IPC classification: G06F9/44, G06F15/76

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: A method of operation within an integrated-circuit processing device having a plurality of execution lanes. Upon receiving an instruction to exchange data between the execution lanes, respective requests from the execution lanes are examined to determine a set of the execution lanes that may send data to one or more others of the execution lanes during a first interval. Each execution lane within the set of the execution lanes is signaled to indicate that the execution lane may send data to the one or others of the execution lanes.

5.

Patent application
DATA EXCHANGE AND COMMUNICATION BETWEEN EXECUTION UNITS IN A PARALLEL PROCESSOR (in force)

Publication number: US20080307207A1

Publication date: 2008-12-11

Application number: US12192813

Filing date: 2008-08-15

Applicants: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

Inventors: Brucek Khailany, William James Dally, Ujval J. Kapasi, Jim Jian Lin

IPC classification: G06F9/30

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: A method of operation within an integrated-circuit processing device having a plurality of execution lanes. Upon receiving an instruction to exchange data between the execution lanes, respective requests from the execution lanes are examined to determine a set of the execution lanes that may send data to one or more others of the execution lanes during a first interval. Each execution lane within the set of the execution lanes is signaled to indicate that the execution lane may send data to the one or others of the execution lanes.

6.

Granted patent
Processor with enhanced combined-arithmetic capability (in force)

Publication number: US08122078B2

Publication date: 2012-02-21

Application number: US11973887

Filing date: 2007-10-09

Applicants: Brucek Khailany, William James Dally, Raghunath Rao, DeForest Tovey

Inventors: Brucek Khailany, William James Dally, Raghunath Rao, DeForest Tovey

IPC classification: G06F15/00

CPC classification: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/30043, G06F9/3891

Abstract: A method of operation within an integrated-circuit processing device having an enhanced combined-arithmetic capability. In response to an instruction indicating a combined arithmetic operation, the processor generates a dot-product of first and second operands, adds the dot-product to an accumulated value, and then outputs the sum of the accumulated value and the dot-product.
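The combined operation described in this abstract amounts to a dot product followed by an accumulate. A minimal Python sketch of the arithmetic, assuming the two operands are equal-length vectors of integers (the abstract does not state element widths or counts, and the name `dot_accumulate` is hypothetical):

```python
from typing import Sequence


def dot_accumulate(a: Sequence[int], b: Sequence[int], acc: int) -> int:
    """Return acc + sum(a[i] * b[i]), the result of one combined instruction."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))  # dot product of the two operands
    return acc + dot                        # add to the accumulated value


if __name__ == "__main__":
    # 1*5 + 2*6 + 3*7 + 4*8 = 70, plus an accumulator of 10
    print(dot_accumulate([1, 2, 3, 4], [5, 6, 7, 8], 10))  # prints 80
```

Issuing the multiply, the reduction, and the add as one instruction is what makes the operation "combined"; the sketch shows only the numerical result, not the hardware datapath.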

7.

Granted patent
Unified streaming multiprocessor memory (in force)

Publication number: US09069664B2

Publication date: 2015-06-30

Application number: US13240366

Filing date: 2011-09-22

Applicant: William James Dally

Inventor: William James Dally

IPC classification: G06F13/00, G06F12/06, G06F13/16

CPC classification: G06F12/06, G06F13/16, G06F13/1605, G06F2213/0038

Abstract: One embodiment of the present invention sets forth a technique for providing a unified memory for access by execution threads in a processing system. Several logically separate memories are combined into a single unified memory that includes a single set of shared memory banks, an allocation of space in each bank across the logical memories, a mapping rule that maps the address space of each logical memory to its partition of the shared physical memory, a circuitry including switches and multiplexers that supports the mapping, and an arbitration scheme that allocates access to the banks.
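One possible mapping rule of the kind this abstract mentions can be sketched in Python. The layout below is an assumption made for illustration: every logical memory receives the same contiguous range of rows in each bank, and consecutive logical addresses are interleaved across banks. The logical memory names, the sizes, and the functions `build_unified_memory` and `map_address` are hypothetical, and bank arbitration is left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class UnifiedMemory:
    num_banks: int
    rows_per_bank: int
    # logical memory name -> (first row, row count) inside every bank
    partitions: Dict[str, Tuple[int, int]]

    def map_address(self, memory: str, address: int) -> Tuple[int, int]:
        """Map a logical word address to a (bank, row) pair in shared storage."""
        first_row, row_count = self.partitions[memory]
        bank = address % self.num_banks   # interleave words across banks
        row = address // self.num_banks   # row within the memory's partition
        if address < 0 or row >= row_count:
            raise IndexError(f"address {address} outside '{memory}' partition")
        return bank, first_row + row


def build_unified_memory(num_banks: int, rows_per_bank: int,
                         rows_needed: Dict[str, int]) -> UnifiedMemory:
    """Allocate a slice of every bank to each logical memory, in order."""
    partitions: Dict[str, Tuple[int, int]] = {}
    next_row = 0
    for name, rows in rows_needed.items():
        partitions[name] = (next_row, rows)
        next_row += rows
    if next_row > rows_per_bank:
        raise ValueError("logical memories do not fit in the shared banks")
    return UnifiedMemory(num_banks, rows_per_bank, partitions)


if __name__ == "__main__":
    mem = build_unified_memory(4, 16, {"registers": 8, "shared": 6, "cache": 2})
    print(mem.map_address("registers", 5))  # (1, 1)
    print(mem.map_address("shared", 5))     # (1, 9)
```

Changing the row counts passed to `build_unified_memory` re-partitions the same physical banks, which is the flexibility a unified memory is meant to provide.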

8.

Granted patent
Hierarchical memory addressing (in force)

Publication number: US08982140B2

Publication date: 2015-03-17

Application number: US13241745

Filing date: 2011-09-23

Applicant: William James Dally

Inventor: William James Dally

IPC classification: G06F13/28, G06F15/16, G06F12/02, G06F12/08

CPC classification: G06F12/0284, G06F12/08, G06F12/0811, G06F2212/251, G06F2212/2515, G06F2212/253, G06F2212/302, G06F2213/0038

Abstract: One embodiment of the present invention sets forth a technique for addressing data in a hierarchical graphics processing unit cluster. A hierarchical address is constructed based on the location of a storage circuit where a target unit of data resides. The hierarchical address comprises a level field indicating a hierarchical level for the unit of data and a node identifier that indicates which GPU within the GPU cluster currently stores the unit of data. The hierarchical address may further comprise one or more identifiers that indicate which storage circuit in a particular hierarchical level currently stores the unit of data. The hierarchical address is constructed and interpreted based on the level field. The technique advantageously enables programs executing within the GPU cluster to efficiently access data residing in other GPUs using the hierarchical address.
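A level-dependent address layout such as the one this abstract describes can be illustrated with a short Python sketch. Every field width here is an assumption chosen for the example (a 64-bit word, a 4-bit level field, a 12-bit node identifier, and an 8-bit storage-circuit identifier that is present only at levels above 0); the abstract gives no concrete encoding, and the names `HierAddress`, `encode`, and `decode` are hypothetical.

```python
from typing import NamedTuple

# Hypothetical field widths, not taken from the patent.
LEVEL_BITS, NODE_BITS, UNIT_BITS, TOTAL_BITS = 4, 12, 8, 64


class HierAddress(NamedTuple):
    level: int   # hierarchical level of the unit of data
    node: int    # which GPU in the cluster holds the data
    unit: int    # storage circuit within that level (0 when level == 0)
    offset: int  # location inside the storage circuit


def _offset_bits(level: int) -> int:
    """The level field decides how many bits remain for the offset."""
    used = LEVEL_BITS + NODE_BITS + (UNIT_BITS if level > 0 else 0)
    return TOTAL_BITS - used


def encode(addr: HierAddress) -> int:
    off_bits = _offset_bits(addr.level)
    in_range = (0 <= addr.level < 1 << LEVEL_BITS
                and 0 <= addr.node < 1 << NODE_BITS
                and 0 <= addr.unit < 1 << UNIT_BITS
                and 0 <= addr.offset < 1 << off_bits)
    if not in_range or (addr.level == 0 and addr.unit != 0):
        raise ValueError("field out of range for this level")
    word = (addr.level << NODE_BITS) | addr.node
    if addr.level > 0:
        word = (word << UNIT_BITS) | addr.unit
    return (word << off_bits) | addr.offset


def decode(word: int) -> HierAddress:
    level = word >> (TOTAL_BITS - LEVEL_BITS)  # read the level field first
    off_bits = _offset_bits(level)             # then interpret the rest by level
    offset = word & ((1 << off_bits) - 1)
    rest = word >> off_bits
    unit = 0
    if level > 0:
        unit = rest & ((1 << UNIT_BITS) - 1)
        rest >>= UNIT_BITS
    node = rest & ((1 << NODE_BITS) - 1)
    return HierAddress(level, node, unit, offset)


if __name__ == "__main__":
    a = HierAddress(level=2, node=5, unit=3, offset=0x1234)
    assert decode(encode(a)) == a
    print(hex(encode(a)))
```

Decoding reads the level field before anything else, mirroring the statement that the address is "constructed and interpreted based on the level field".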

9.

Granted patent
Two-level scheduler for multi-threaded processing (in force)

Publication number: US08732711B2

Publication date: 2014-05-20

Application number: US13151094

Filing date: 2011-06-01

Applicants: William James Dally, Stephen William Keckler, David Tarjan, John Erik Lindholm, Mark Alan Gebhart, Daniel Robert Johnson

Inventors: William James Dally, Stephen William Keckler, David Tarjan, John Erik Lindholm, Mark Alan Gebhart, Daniel Robert Johnson

IPC classification: G06F9/46

CPC classification: G06F9/4881, G06F9/3851, G06F9/3887

Abstract: One embodiment of the present invention sets forth a technique for scheduling thread execution in a multi-threaded processing environment. A two-level scheduler maintains a small set of active threads called strands to hide function unit pipeline latency and local memory access latency. The strands are a sub-set of a larger set of pending threads that is also maintained by the two-level scheduler. Pending threads are promoted to strands and strands are demoted to pending threads based on latency characteristics. The two-level scheduler selects strands for execution based on strand state. The longer latency of the pending threads is hidden by selecting strands for execution. When the latency for a pending thread is expired, the pending thread may be promoted to a strand and begin (or resume) execution. When a strand encounters a latency event, the strand may be demoted to a pending thread while the latency is incurred.
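The promote and demote behavior in this abstract can be mimicked by a toy Python simulation. It is a sketch under simplifying assumptions rather than the patented scheduler: one operation issues per cycle, strands are selected round-robin, any operation whose latency meets a threshold counts as a "latency event", and short latencies are not modeled as stalls. The class names, the `long_latency` threshold, and the workload are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Thread:
    tid: int
    ops: List[int]     # latency, in cycles, of each remaining operation
    ready_at: int = 0  # cycle at which a pending thread's latency expires


class TwoLevelScheduler:
    def __init__(self, threads: List[Thread], max_strands: int,
                 long_latency: int = 10) -> None:
        self.pending: Deque[Thread] = deque(threads)  # outer level
        self.strands: Deque[Thread] = deque()         # small active set
        self.max_strands = max_strands
        self.long_latency = long_latency

    def _promote(self, cycle: int) -> None:
        """Move pending threads whose latency has expired into free strand slots."""
        for _ in range(len(self.pending)):
            if len(self.strands) >= self.max_strands:
                break
            thread = self.pending.popleft()
            if thread.ready_at <= cycle:
                self.strands.append(thread)
            else:
                self.pending.append(thread)

    def step(self, cycle: int) -> Optional[int]:
        """Issue one operation; return the issuing thread id, or None if idle."""
        self._promote(cycle)
        if not self.strands:
            return None
        thread = self.strands.popleft()
        latency = thread.ops.pop(0)
        if not thread.ops:
            return thread.tid                  # thread finished, slot is freed
        if latency >= self.long_latency:
            thread.ready_at = cycle + latency  # latency event: demote
            self.pending.append(thread)
        else:
            self.strands.append(thread)        # stay active, round-robin
        return thread.tid


if __name__ == "__main__":
    sched = TwoLevelScheduler(
        [Thread(0, [1, 20, 1]), Thread(1, [1, 1]), Thread(2, [1])],
        max_strands=2,
    )
    print([sched.step(cycle) for cycle in range(6)])  # [0, 1, 0, 1, 2, None]
```

In the trace, thread 0 is demoted at cycle 2 when it issues its 20-cycle operation, which frees a strand slot for thread 2; thread 0 is promoted again once cycle 22 is reached.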

10.

Patent application
Hierarchical Memory Addressing (in force)

Publication number: US20120075319A1

Publication date: 2012-03-29

Application number: US13241745

Filing date: 2011-09-23

Applicant: William James Dally

Inventor: William James Dally

IPC classification: G06F13/00, G06F12/06

CPC classification: G06F12/0284, G06F12/08, G06F12/0811, G06F2212/251, G06F2212/2515, G06F2212/253, G06F2212/302, G06F2213/0038

Abstract: One embodiment of the present invention sets forth a technique for addressing data in a hierarchical graphics processing unit cluster. A hierarchical address is constructed based on the location of a storage circuit where a target unit of data resides. The hierarchical address comprises a level field indicating a hierarchical level for the unit of data and a node identifier that indicates which GPU within the GPU cluster currently stores the unit of data. The hierarchical address may further comprise one or more identifiers that indicate which storage circuit in a particular hierarchical level currently stores the unit of data. The hierarchical address is constructed and interpreted based on the level field. The technique advantageously enables programs executing within the GPU cluster to efficiently access data residing in other GPUs using the hierarchical address.
