专利检索 ap:("Samuel H. Duncan" OR "Lacky V. Shah" OR "Sean J. Treichler" OR "Daniel Elliot Wexler" OR "Jerome F. Duluk, Jr." OR "Philip Browning Johnson" OR "Jonathon Stuart Ramsay Evans") AND inv:"Samuel H. Duncan" 第 1 页

1.

发明授权
Concurrent execution of independent streams in multi-channel time slice groups 有权
标题翻译：在多通道时间片组中并发执行独立流

公开(公告)号：US09442759B2

公开(公告)日：2016-09-13

申请号：US13316334

申请日：2011-12-09

申请人： Samuel H. Duncan , Lacky V. Shah , Sean J. Treichler , Daniel Elliot Wexler , Jerome F. Duluk, Jr. , Philip Browning Johnson , Jonathon Stuart Ramsay Evans

发明人： Samuel H. Duncan , Lacky V. Shah , Sean J. Treichler , Daniel Elliot Wexler , Jerome F. Duluk, Jr. , Philip Browning Johnson , Jonathon Stuart Ramsay Evans

IPC分类号： G06F9/46 , G06F13/00 , G06F9/48 , G06F9/38

CPC分类号： G06F9/4881 , G06F9/3851 , G06F9/461

摘要： A time slice group (TSG) is a grouping of different streams of work (referred to herein as “channels”) that share the same context information. The set of channels belonging to a TSG are processed in a pre-determined order. However, when a channel stalls while processing, the next channel with independent work can be switched to fully load the parallel processing unit. Importantly, because each channel in the TSG shares the same context information, a context switch operation is not needed when the processing of a particular channel in the TSG stops and the processing of a next channel in the TSG begins. Therefore, multiple independent streams of work are allowed to run concurrently within a single context increasing utilization of parallel processing units.

摘要翻译： 时间片组（TSG）是共享相同上下文信息的不同工作流（本文称为“信道”）的分组。属于TSG的信道集合以预定的顺序被处理。然而，当通道在处理过程中停顿时，可以切换具有独立工作的下一个通道来完全加载并行处理单元。重要的是，由于TSG中的每个信道共享相同的上下文信息，当TSG中的特定信道的处理停止并且TSG中的下一个信道的处理开始时，不需要上下文切换操作。因此，允许多个独立的工作流在单个上下文中同时运行，增加并行处理单元的利用率。

2.

发明申请
COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION 审中-公开
标题翻译：计算机螺旋桨阵列精度执行预警

公开(公告)号：US20130132711A1

公开(公告)日：2013-05-23

申请号：US13302962

申请日：2011-11-22

申请人： Lacky V. SHAH , Gregory Scott Palmer , Gernot Schaufler , Samuel H. Duncan , Philip Browning Johnson , Shirish Gadre , Timothy John Purcell

发明人： Lacky V. SHAH , Gregory Scott Palmer , Gernot Schaufler , Samuel H. Duncan , Philip Browning Johnson , Shirish Gadre , Timothy John Purcell

IPC分类号： G06F9/38

CPC分类号： G06F9/461

摘要： One embodiment of the present invention sets forth a technique instruction level and compute thread array granularity execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. When preemption is performed at a compute thread array boundary, the amount of context state to be stored is reduced because execution units within the processing pipeline complete execution of in-flight instructions and become idle. If, the amount of time needed to complete execution of the in-flight instructions exceeds a threshold, then the preemption may dynamically change to be performed at the instruction level instead of at compute thread array granularity.

摘要翻译： 本发明的一个实施例阐述了技术指令级别和计算线程数组粒度执行抢占。在指令级别抢占不需要处理管道的任何排水。不会发出新的指令，并且从处理流水线中卸载上下文状态。当在计算线程数组边界执行抢占时，由于处理流程内的执行单元完成飞行中指令的执行并变为空闲，因此减少了要存储的上下文状态量。如果完成执行飞行中指令所需的时间超过阈值，则抢占可以动态地改变以在指令级别而不是以计算线程数组粒度来执行。

3.

发明申请
INSTRUCTION LEVEL EXECUTION PREEMPTION 审中-公开
标题翻译：指导级执行预防

公开(公告)号：US20130124838A1

公开(公告)日：2013-05-16

申请号：US13294045

申请日：2011-11-10

申请人： Lacky V. SHAH , Gregory Scott Palmer , Gernot Schaufler , Samuel H. Duncan , Philip Browning Johnson , Shirish Gadre , Robert Ohannessian , Nicholas Wang , Christopher Lamb , Philip Alexander Cuadra , Timothy John Purcell

发明人： Lacky V. SHAH , Gregory Scott Palmer , Gernot Schaufler , Samuel H. Duncan , Philip Browning Johnson , Shirish Gadre , Robert Ohannessian , Nicholas Wang , Christopher Lamb , Philip Alexander Cuadra , Timothy John Purcell

IPC分类号： G06F9/38

CPC分类号： G06F9/461

摘要： One embodiment of the present invention sets forth a technique instruction level and compute thread array granularity execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. When preemption is performed at a compute thread array boundary, the amount of context state to be stored is reduced because execution units within the processing pipeline complete execution of in-flight instructions and become idle. If, the amount of time needed to complete execution of the in-flight instructions exceeds a threshold, then the preemption may dynamically change to be performed at the instruction level instead of at compute thread array granularity.

摘要翻译： 本发明的一个实施例阐述了技术指令级别和计算线程数组粒度执行抢占。在指令级别抢占不需要处理管道的任何排水。不会发出新的指令，并且从处理流水线中卸载上下文状态。当在计算线程数组边界执行抢占时，由于处理流程内的执行单元完成飞行中指令的执行并变为空闲，因此减少了要存储的上下文状态量。如果完成执行飞行中指令所需的时间超过阈值，则抢占可以动态地改变以在指令级别而不是以计算线程数组粒度来执行。

4.

发明授权
Asynchronous interface for communicating between clock domains 有权
标题翻译：用于在时钟域之间通信的异步接口

公开(公告)号：US08547993B1

公开(公告)日：2013-10-01

申请号：US11463682

申请日：2006-08-10

申请人： Lincoln G. Garlick , Richard A. Silkebakken , Prakash G. Apte , Paolo E. Sabella , Samuel H. Duncan , Dennis K. Ma , Sean J. Treichler

发明人： Lincoln G. Garlick , Richard A. Silkebakken , Prakash G. Apte , Paolo E. Sabella , Samuel H. Duncan , Dennis K. Ma , Sean J. Treichler

IPC分类号： H04L12/66 , H04L29/06

CPC分类号： H04L29/06 , G06F13/4226

摘要： Methods, apparatuses, and systems are presented for performing asynchronous communications involving using an asynchronous interface to send signals between a source device and a plurality of client devices, the source device and the plurality of client devices being part of a processing unit capable of performing graphics operations, the source device being coupled to the plurality of client devices using the asynchronous interface, wherein the asynchronous interface includes at least one request signal, at least one address signal, at least one acknowledge signal, and at least one data signal, and wherein the asynchronous interface operates in accordance with at least one programmable timing characteristic associated with the source device.

摘要翻译： 呈现用于执行涉及使用异步接口在源设备和多个客户端设备之间发送信号的异步通信的方法，设备和系统，源设备和多个客户端设备是能够执行图形的处理单元的一部分所述源设备使用所述异步接口耦合到所述多个客户端设备，其中所述异步接口包括至少一个请求信号，至少一个地址信号，至少一个确认信号和至少一个数据信号，并且其中异步接口根据与源设备相关联的至少一个可编程定时特性进行操作。

5.

发明授权
Providing byte enables for peer-to-peer data transfer within a computing environment 有权
标题翻译：提供字节可以在计算环境中进行对等数据传输

公开(公告)号：US09424227B2

公开(公告)日：2016-08-23

申请号：US13541633

申请日：2012-07-03

申请人： Samuel H. Duncan , Dennis K. Ma , Wei-Je Huang , Gary Ward

发明人： Samuel H. Duncan , Dennis K. Ma , Wei-Je Huang , Gary Ward

IPC分类号： G06F15/173 , G06F15/167 , G06F15/16 , G06F15/163 , G06F15/17

CPC分类号： G06F15/167 , G06F15/16 , G06F15/163 , G06F15/17 , G06F15/173

摘要： Non-contiguous or tiled payload data are efficiently transferred between peers over a fabric. Specifically, a client transfers a byte enable message to a peer device via a mailbox mechanism, where the byte enable message specifies which bytes of the payload data being transferred via the data packet are to be written to the frame buffer on the peer device and which bytes are not to be written. The client transfers the non-contiguous or tiled payload payload data to the peer device. Upon receiving the payload data, the peer device writes bytes from the payload data into the target frame buffer for only those bytes enabled via the byte enable message. One advantage of the present invention is that non-contiguous or tiled data are transferred over a fabric with improved efficiency.

摘要翻译： 非连续或平铺的有效载荷数据可以通过一个结构在对等体之间有效传输。具体来说，客户端通过邮箱机制向对等设备传送字节使能消息，其中字节使能消息指定要经由数据包传送的有效载荷数据的哪些字节将被写入对等设备上的帧缓冲器，以及哪个字节不被写入。客户端将非连续或平铺的有效载荷有效载荷数据传输到对等设备。在接收到有效载荷数据时，对等设备将字节从有效载荷数据写入目标帧缓冲器，以便仅通过字节使能消息使能的字节。本发明的一个优点在于，不连续的或平铺的数据以提高的效率在织物上传送。

6.

发明授权
Method and apparatus for providing peer-to-peer data transfer within a computing environment 有权
标题翻译：用于在计算环境内提供对等数据传输的方法和装置

公开(公告)号：US07451259B2

公开(公告)日：2008-11-11

申请号：US11005451

申请日：2004-12-06

申请人： Samuel H. Duncan , Wei-Je Huang , John H. Edmondson

发明人： Samuel H. Duncan , Wei-Je Huang , John H. Edmondson

IPC分类号： G06F13/36 , G06F13/00

CPC分类号： H04L49/901 , H04L49/90 , H04L49/9073

摘要： A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.

摘要翻译： 一种用于通过互连结构提供对等数据传输的方法和装置。该方法和装置使得第一设备能够通过在互连结构上传递读取和写入请求来向/从第二设备的本地存储器读取和/或写入数据。即使当互连结构的通信协议不允许这样的传输时，也可以执行这种数据传送。

7.

发明授权
Method and apparatus for providing peer-to-peer data transfer within a computing environment 有权
标题翻译：用于在计算环境内提供对等数据传输的方法和装置

公开(公告)号：US07275123B2

公开(公告)日：2007-09-25

申请号：US11005150

申请日：2004-12-06

申请人： Samuel H. Duncan , Wei-Je Huang , John H. Edmondson

发明人： Samuel H. Duncan , Wei-Je Huang , John H. Edmondson

IPC分类号： G06F13/36 , G06F13/00

CPC分类号： H04L12/66

摘要： A method and apparatus for providing peer-to-peer data transfer through an interconnecting fabric. The method and apparatus enable a first device to read and/or write data to/from a local memory of a second device by communicating read and write requests across the interconnectivity fabric. Such data transfer can be performed even when the communication protocol of the interconnectivity fabric does not permit such transfers.

摘要翻译： 一种用于通过互连结构提供对等数据传输的方法和装置。该方法和装置使得第一设备能够通过在互连结构上传递读取和写入请求来向/从第二设备的本地存储器读取和/或写入数据。即使当互连结构的通信协议不允许这样的传输时，也可以执行这种数据传送。

8.

发明授权
Passive release avoidance technique 失效

公开(公告)号：US07024509B2

公开(公告)日：2006-04-04

申请号：US09944515

申请日：2001-08-31

申请人： Samuel H. Duncan , Steven Ho

发明人： Samuel H. Duncan , Steven Ho

IPC分类号： G06F13/36

CPC分类号： H03K5/19 , G06F13/24 , G06F13/4081 , G06F2213/2402

摘要： A system and method avoids passive release of interrupts in a computer system. The computer system includes a plurality of processors, a plurality of input/output (I/O) devices each capable of issuing interrupts, and an I/O bridge interfacing between the I/O devices and the processors. Interrupts, such as level sensitive interrupts (LSIs), asserted by an I/O device coupled to a specific port of the I/O bridge are sent to a processor for servicing by an interrupt controller, which also sets an interrupt pending flag. Upon dispatching the respective interrupt service routine, the processor generates two ordered messages. The first ordered message is sent to the I/O device that triggered the interrupt, informing it that the interrupt has been serviced. The second ordered message directs the interrupt controller to clear the respective interrupt pending flag. Both messages are sent, in order, to the particular I/O bridge port to which the subject I/O device is coupled. After forwarding the first message to the I/O device, the bridge port forwards the second message to the interrupt controller so that the interrupt can be deasserted before the interrupt pending flag is cleared.

9.

发明授权
Scalable efficient I/O port protocol 有权
标题翻译：可扩展的高效I / O端口协议

公开(公告)号：US06738836B1

公开(公告)日：2004-05-18

申请号：US09652391

申请日：2000-08-31

申请人： Richard E. Kessler , Samuel H. Duncan , David W. Hartwell , David A. J. Webb, Jr. , Steve Lang

发明人： Richard E. Kessler , Samuel H. Duncan , David W. Hartwell , David A. J. Webb, Jr. , Steve Lang

IPC分类号： G06F1300

CPC分类号： G06F15/17381 , G06F12/0817 , G06F2212/621

摘要： A system that supports a high performance, scalable, and efficient I/O port protocol to connect to I/O devices is disclosed. A distributed multiprocessing computer system contains a number of processors each coupled to an I/O bridge ASIC implementing the I/O port protocol. One or more I/O devices are coupled to the I/O bridge ASIC, each I/O device capable of accessing machine resources in the computer system by transmitting and receiving message packets. Machine resources in the computer system include data blocks, registers and interrupt queues. Each processor in the computer system is coupled to a memory module capable of storing data blocks shared between the processors. Coherence of the shared data blocks in this shared memory system is maintained using a directory based coherence protocol. Coherence of data blocks transferred during I/O device read and write accesses is maintained using the same coherence protocol as for the memory system. Data blocks transferred during an I/O device read or write access may be buffered in a cache by the I/O bridge ASIC only if the I/O bridge ASIC has exclusive copies of the data blocks. The I/O bridge ASIC includes a DMA device that supports both in-order and out-of-order DMA read and write streams of data blocks. An in-order stream of reads of data blocks performed by the DMA device always results in the DMA device receiving coherent data blocks that do not have to be written back to the memory module.

摘要翻译： 公开了一种支持高性能，可扩展和高效的I / O端口协议来连接到I / O设备的系统。分布式多处理计算机系统包含多个处理器，每个处理器都耦合到实现I / O端口协议的I / O桥ASIC。一个或多个I / O设备耦合到I / O桥ASIC，每个I / O设备能够通过发送和接收消息分组来访问计算机系统中的机器资源。计算机系统中的机器资源包括数据块，寄存器和中断队列。计算机系统中的每个处理器耦合到能够存储处理器之间共享的数据块的存储器模块。使用基于目录的一致性协议来维护该共享存储器系统中的共享数据块的一致性。使用与存储系统相同的一致性协议来维护I / O设备读写访问期间传输的数据块的一致性。只有当I / O桥ASIC具有数据块的排他副本时，I / O桥ASIC才能缓存在I / O设备读或写访问期间传输的数据块。 I / O桥ASIC包括支持数据块的顺序和无序DMA读和写数据流的DMA设备。由DMA设备执行的数据块的顺序读取流总是导致DMA设备接收不必写入存储器模块的相干数据块。

10.

发明授权
Adaptive data fetch prediction algorithm 有权
标题翻译：自适应数据提取预测算法

公开(公告)号：US06701387B1

公开(公告)日：2004-03-02

申请号：US09652644

申请日：2000-08-31

申请人： Roger Pannel , David W. Hartwell , Samuel H. Duncan , Rajen Ramchandani , Andrej Kocev , Jeffrey Willcox , Steven Ho

发明人： Roger Pannel , David W. Hartwell , Samuel H. Duncan , Rajen Ramchandani , Andrej Kocev , Jeffrey Willcox , Steven Ho

IPC分类号： G06F1300

CPC分类号： G06F12/0862 , G06F13/28

摘要： A method and apparatus for accommodating the speed requirements of a DMA read request from PCI protocol I/O devices attached via a DMA to a multiprocessor system mesh. A bridge between the device controller and the mesh is described which buffers the data from the memory in cache lines from which the data is delivered finally to the I/O device. The system is adaptive in that the number of cache lines required in past reads are remembered and used to determine if the number of cache lines is reduced or increased.

摘要翻译： 一种用于通过DMA连接到多处理器系统网格的PCI协议I / O设备来适应DMA读请求的速度要求的方法和装置。描述了设备控制器和网格之间的桥接器，其将来自数据的数据的缓存行中的数据从数据缓存到I / O设备。该系统是自适应的，因为过去读取中所需的高速缓存行的数量被记住并用于确定高速缓存行的数量是减少还是增加。

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类