Patent search: ap:("Matthias A. Blumrich" OR "Paul W. Coteus" OR "Dong Chen" OR "Alan Gara" OR "Mark E. Giampapa" OR "Philip Heidelberger" OR "Dirk Hoenicke" OR "Todd E. Takken" OR "Burkhard D. Steinmacher-Burow" OR "Pavlos M. Vranas") AND inv:"Dirk Hoenicke", page 1

1.

Granted invention patent
Collective network for computer structures (In force)

Publication No.: US08626957B2

Publication date: 2014-01-07

Application No.: US13101566

Filing date: 2011-05-05

Applicants: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

IPC classification: G06F15/16

CPC classification: H04L1/08, G06F9/46, G06F11/08, G06F11/1423, H03M13/09, H04L1/0061, H04L1/1607, H04L1/1867, H04L2001/0093, H04L2001/0097

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

2.

Granted invention patent
Collective network for computer structures (In force)

Publication No.: US08001280B2

Publication date: 2011-08-16

Application No.: US11572372

Filing date: 2005-07-18

Applicants: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

IPC classification: G06F15/16

CPC classification: G06F15/17381, H04L1/1845, H04L12/4641

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

3.

Invention patent application
COLLECTIVE NETWORK FOR COMPUTER STRUCTURES (In force)

Publication No.: US20110219280A1

Publication date: 2011-09-08

Application No.: US13101566

Filing date: 2011-05-05

Applicants: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

IPC classification: H03M13/09, H04L1/08, G06F11/10, G06F11/14

CPC classification: H04L1/08, G06F9/46, G06F11/08, G06F11/1423, H03M13/09, H04L1/0061, H04L1/1607, H04L1/1867, H04L2001/0093, H04L2001/0097

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

4.

Invention patent application
Collective Network For Computer Structures (In force)

Publication No.: US20080104367A1

Publication date: 2008-05-01

Application No.: US11572372

Filing date: 2005-07-18

Applicants: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

IPC classification: G06F15/80, G06F9/30

CPC classification: G06F15/17381, H04L1/1845, H04L12/4641

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to needs of a processing algorithm.

5.

Granted invention patent
Global tree network for computing structures enabling global processing operations (Lapsed)

Publication No.: US07650434B2

Publication date: 2010-01-19

Application No.: US10469000

Filing date: 2002-02-25

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F15/16

CPC classification: G06F15/17337

Abstract: A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node. The global tree network is configurable to provide global barrier and interrupt functionality in asynchronous or synchronized manner, and, is physically and logically partitionable.

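The upstream reduction and downstream broadcast named in this abstract can be illustrated with a small software sketch. It is not the patented router design: the array-based tree layout, the `fanout` parameter, and the function names `tree_reduce` and `tree_broadcast` are assumptions made only for illustration.

```python
from operator import add


def tree_reduce(values, op=add, fanout=2):
    """Combine one value per node upstream toward the root (node 0).

    Assumed layout: node i has parent (i - 1) // fanout.
    """
    partial = list(values)
    # Walk from the last node back toward the root so that every child
    # has absorbed its own subtree before its parent consumes it.
    for node in range(len(partial) - 1, 0, -1):
        parent = (node - 1) // fanout
        partial[parent] = op(partial[parent], partial[node])
    return partial[0]


def tree_broadcast(root_value, node_count, fanout=2):
    """Deliver the root's value downstream, each node copying its parent."""
    received = [None] * node_count
    received[0] = root_value
    for node in range(1, node_count):
        # A parent's index is always smaller, so it already holds the value.
        received[node] = received[(node - 1) // fanout]
    return received


values = [3, 1, 4, 1, 5, 9, 2]           # one contribution per node
total = tree_reduce(values)              # reduction flows leaf -> root
everyone = tree_broadcast(total, len(values))  # result flows root -> leaf
```

Running a reduction followed by a broadcast gives every node the combined result, which is the software-visible effect of an all-reduce; any associative operator such as `max` can be passed as `op`.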

6.

Granted invention patent
Low latency memory access and synchronization (Lapsed)

Publication No.: US07174434B2

Publication date: 2007-02-06

Application No.: US10468994

Filing date: 2002-02-25

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F12/12

CPC classification: G06F9/52

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.

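The single-load lock acquisition described in this abstract can be modelled in software as follows. This is only a sketch under stated assumptions: the class `LockDevice`, its methods `load` and `release`, and the use of a `threading.Lock` to stand in for the device's internal atomicity are hypothetical and do not come from the patent.

```python
import threading


class LockDevice:
    """Software stand-in for a hardware locking device (hypothetical model)."""

    def __init__(self, lock_count):
        self._held = [False] * lock_count
        # Models the device serialising accesses; real hardware would do
        # this internally rather than with a software mutex.
        self._guard = threading.Lock()

    def load(self, lock_id):
        """Model the processor's single read of a lock register.

        Returns True if the caller now owns the lock. The follow-up write
        that marks the lock as taken is done here, by the device, so the
        processor never issues a separate store.
        """
        with self._guard:
            if self._held[lock_id]:
                return False
            self._held[lock_id] = True
            return True

    def release(self, lock_id):
        """Give the lock back so another processor's load can succeed."""
        with self._guard:
            self._held[lock_id] = False


device = LockDevice(lock_count=4)
first_attempt = device.load(2)    # succeeds: lock 2 was free
second_attempt = device.load(2)   # fails: lock 2 is already owned
device.release(2)
```

The point of the design is that the caller's side of the protocol is one read whose return value says whether ownership was granted; the state change happens on the device side.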

7.

Granted invention patent
Method for prefetching non-contiguous data structures (Lapsed)

Publication No.: US07529895B2

Publication date: 2009-05-05

Application No.: US11617276

Filing date: 2006-12-28

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F13/28

CPC classification: G06F12/0862, G06F9/52, G06F2212/6028

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.

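The pointer-per-line prefetching idea in this abstract can be sketched as below. The abstract does not say how the pointers get their values, so the learning rule used here (each line remembers the line that was accessed after it) is an assumption, as are the names `MemoryLine`, `PointerPrefetcher`, and the one-entry prefetch buffer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryLine:
    data: bytes
    next_line: Optional[int] = None  # pointer naming the line to prefetch


class PointerPrefetcher:
    """Toy model with a one-entry prefetch buffer (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory
        self.prefetched = None    # address currently in the prefetch buffer
        self.last_address = None
        self.hits = 0

    def access(self, address):
        if address == self.prefetched:
            self.hits += 1
        # Assumed learning rule: the previously accessed line records
        # which line followed it.
        if self.last_address is not None:
            self.memory[self.last_address].next_line = address
        # Prefetch whatever line this line's stored pointer names.
        self.prefetched = self.memory[address].next_line
        self.last_address = address
        return self.memory[address].data


memory = [MemoryLine(bytes([i])) for i in range(8)]
prefetcher = PointerPrefetcher(memory)
pattern = [5, 0, 3]               # non-contiguous but repetitive
for _ in range(2):
    for address in pattern:
        prefetcher.access(address)
```

On the first pass nothing is prefetched because no pointers are set yet; on later passes the stored pointers steer the prefetch along the same non-contiguous sequence, which a stride-based predictor would not follow.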

8.

Granted invention patent
Low latency memory access and synchronization (Lapsed)

Publication No.: US07818514B2

Publication date: 2010-10-19

Application No.: US12196796

Filing date: 2008-08-22

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F12/06

CPC classification: G06F12/0862, G06F9/52, G06F2212/6028

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.

9.

Invention patent application
LOW LATENCY MEMORY ACCESS AND SYNCHRONIZATION (Lapsed)

Publication No.: US20080313408A1

Publication date: 2008-12-18

Application No.: US12196796

Filing date: 2008-08-22

Applicants: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas

IPC classification: G06F12/08

CPC classification: G06F12/0862, G06F9/52, G06F2212/6028

Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device performs a subsequent write operation rather than the processor. A simple prefetching for non-contiguous data structures is also disclosed. A memory line is redefined so that in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers are used to determine which memory line to prefetch rather than some other predictive algorithm. This enables hardware to effectively prefetch memory access patterns that are non-contiguous, but repetitive.

10.

Granted invention patent
Multidimensional switch network (Lapsed)

Publication No.: US07486619B2

Publication date: 2009-02-03

Application No.: US10793068

Filing date: 2004-03-04

Applicants: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas, Matthias Augustin Blumrich

Inventors: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas, Matthias Augustin Blumrich

IPC classification: H04L12/28

CPC classification: H04L49/1576, H04L45/06

Abstract: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments are disclosed of a 4 dimensional computing system having 65,536 compute nodes.

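The arithmetic behind this topology can be checked with a short sketch. The abstract states only that the system is 4-dimensional with 65,536 nodes; the 16 x 16 x 16 x 16 shape used below is one assumed way to reach that count, and the helper names are hypothetical. The hop count assumes a message fixes one coordinate per crossbar traversal, which is an inference from the description rather than something the abstract states.

```python
from math import prod


def node_count(shape):
    """Total compute nodes for a network with shape (L_1, ..., L_N)."""
    return prod(shape)


def crossbars_per_dimension(shape):
    """One L_n-port crossbar per line of L_n nodes along dimension n."""
    total = prod(shape)
    return [total // size for size in shape]


def crossbar_id(coords, dim):
    """Nodes that differ only in coordinate `dim` share one crossbar."""
    others = tuple(c for i, c in enumerate(coords) if i != dim)
    return (dim, others)


def crossbar_traversals(src, dst):
    """Assumed routing: one crossbar traversal per differing coordinate."""
    return sum(1 for a, b in zip(src, dst) if a != b)


shape = (16, 16, 16, 16)                  # assumed 4-dimensional shape
nodes = node_count(shape)                 # 16**4
switches = crossbars_per_dimension(shape) # 16**3 lines per dimension
same_switch = crossbar_id((1, 2, 3, 4), 0) == crossbar_id((9, 2, 3, 4), 0)
hops = crossbar_traversals((0, 0, 0, 0), (15, 3, 0, 7))
```

With this shape each node's routing element has 4 ports, and each dimension needs 4,096 crossbars of 16 ports each.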