Patent search: ap:("Dong Chen" OR "Matthew R. Ellavsky" OR "Ross L. Franke" OR "Alan Gara" OR "Thomas M. Gooding" OR "Rudolf A. Haring" OR "Mark J. Jeanson" OR "Gerard V. Kopcsay" OR "Thomas A. Liebsch" OR "Daniel Littrell" OR "Martin Ohmacht" OR "Don D. Reed" OR "Brandon E. Schenck" OR "Richard A. Swetz") AND inv:"Dong Chen" (page 1)

1.

Granted patent
Global synchronization of parallel processors using clock pulse width modulation
Legal status: In force

Publication No.: US08412974B2

Publication date: 2013-04-02

Application No.: US12696764

Filing date: 2010-01-29

Applicants: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz

Inventors: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz

IPC classification: G06F1/04, G06F1/12, G06F15/16

CPC classification: G06F1/08, G06F1/10

Abstract: A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.


2.

Patent application
GLOBAL SYNCHRONIZATION OF PARALLEL PROCESSORS USING CLOCK PULSE WIDTH MODULATION
Legal status: In force

Publication No.: US20110119475A1

Publication date: 2011-05-19

Application No.: US12696764

Filing date: 2010-01-29

Applicants: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz

Inventors: Dong Chen, Matthew R. Ellavsky, Ross L. Franke, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Mark J. Jeanson, Gerard V. Kopcsay, Thomas A. Liebsch, Daniel Littrell, Martin Ohmacht, Don D. Reed, Brandon E. Schenck, Richard A. Swetz

IPC classification: G06F1/12, G06F1/10, G06F9/00, G06F1/08

CPC classification: G06F1/08, G06F1/10

Abstract: A circuit generates a global clock signal with a pulse width modification to synchronize processors in a parallel computing system. The circuit may include a hardware module and a clock splitter. The hardware module may generate a clock signal and performs a pulse width modification on the clock signal. The pulse width modification changes a pulse width within a clock period in the clock signal. The clock splitter may distribute the pulse width modified clock signal to a plurality of processors in the parallel computing system.


3.

Granted patent
Reproducibility in a multiprocessor system
Legal status: In force

Publication No.: US08595554B2

Publication date: 2013-11-26

Application No.: US12774475

Filing date: 2010-05-05

Applicants: Ralph A. Bellofatto, Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Thomas A. Liebsch, Martin Ohmacht, Don D. Reed, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

Inventors: Ralph A. Bellofatto, Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Thomas A. Liebsch, Martin Ohmacht, Don D. Reed, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

IPC classification: G06F11/00

CPC classification: G06F1/10, G06F11/2242

Abstract: Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.


4.

Patent application
REPRODUCIBILITY IN A MULTIPROCESSOR SYSTEM
Legal status: In force

Publication No.: US20110119521A1

Publication date: 2011-05-19

Application No.: US12774475

Filing date: 2010-05-05

Applicants: Ralph A. Bellofatto, Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Thomas A. Liebsch, Martin Ohmacht, Don D. Reed, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

Inventors: Ralph A. Bellofatto, Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Thomas M. Gooding, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Thomas A. Liebsch, Martin Ohmacht, Don D. Reed, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

IPC classification: G06F1/04

CPC classification: G06F1/10, G06F11/2242

Abstract: Fixing a problem is usually greatly aided if the problem is reproducible. To ensure reproducibility of a multiprocessor system, the following aspects are proposed: a deterministic system start state, a single system clock, phase alignment of clocks in the system, system-wide synchronization events, reproducible execution of system components, deterministic chip interfaces, zero-impact communication with the system, precise stop of the system and a scan of the system state.


5.

Granted patent
Multi-petascale highly efficient parallel supercomputer
Legal status: In force

Publication No.: US09081501B2

Publication date: 2015-07-14

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/173, G06F9/06, G06F15/76

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.


6.

Patent application
MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER
Legal status: In force

Publication No.: US20110219208A1

Publication date: 2011-09-08

Application No.: US13004007

Filing date: 2011-01-10

Applicants: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

Inventors: Sameh Asaad, Ralph E. Bellofatto, Michael A. Blocksome, Matthias A. Blumrich, Peter Boyle, Jose R. Brunheroto, Dong Chen, Chen-Yong Cher, George L. Chiu, Norman Christ, Paul W. Coteus, Kristan D. Davis, Gabor J. Dozsa, Alexandre E. Eichenberger, Noel A. Eisley, Matthew R. Ellavsky, Kahn C. Evans, Bruce M. Fleischer, Thomas W. Fox, Alan Gara, Mark E. Giampapa, Thomas M. Gooding, Michael K. Gschwind, John A. Gunnels, Shawn A. Hall, Rudolf A. Haring, Philip Heidelberger, Todd A. Inglett, Brant L. Knudson, Gerard V. Kopcsay, Sameer Kumar, Amith R. Mamidala, James A. Marcella, Mark G. Megerian, Douglas R. Miller, Samuel J. Miller, Adam J. Muff, Michael B. Mundy, John K. O'Brien, Kathryn M. O'Brien, Martin Ohmacht, Jeffrey J. Parker, Ruth J. Poole, Joseph D. Ratterman, Valentina Salapura, David L. Satterfield, Robert M. Senger, Brian Smith, Burkhard Steinmacher-Burow, William M. Stockdell, Craig B. Stunkel, Krishnan Sugavanam, Yutaka Sugawara, Todd E. Takken, Barry M. Trager, James L. Van Oosten, Charles D. Wait, Robert E. Walkup, Alfred T. Watson, Robert W. Wisniewski, Peng Wu

IPC classification: G06F15/76, G06F9/06

CPC classification: G06F13/287, G06F9/06, G06F9/3004, G06F9/30047, G06F9/3885, G06F12/0811, G06F12/0831, G06F12/0862, G06F12/0864, G06F12/1027, G06F15/17381, G06F15/17387, G06F15/76, G06F15/8069, G06F2212/1016, G06F2212/602, G06F2212/6022, G06F2212/6024, G06F2212/6032, Y02D10/13, Y02D10/14

Abstract: A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enables a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, with each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application by application basis, and preferably, enable adaptive partitioning of functions in accordance with various algorithmic phases within an application, or if I/O or other processors are underutilized, then can participate in computation or communication nodes are interconnected by a five dimensional torus network with DMA that optimally maximize the throughput of packet communications between nodes and minimize latency.


7.

Patent application
NON-VOLATILE MEMORY FOR CHECKPOINT STORAGE
Legal status: Lapsed

Publication No.: US20110173488A1

Publication date: 2011-07-14

Application No.: US13004005

Filing date: 2011-01-10

Applicants: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken

Inventors: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken

IPC classification: G06F11/00, G06F11/14

CPC classification: G06F11/1438, G06F2201/82, G06F2201/84

Abstract: A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.


8.

Granted patent
Non-volatile memory for checkpoint storage
Legal status: Lapsed

Publication No.: US08788879B2

Publication date: 2014-07-22

Application No.: US13004005

Filing date: 2011-01-10

Applicants: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken

Inventors: Matthias A. Blumrich, Dong Chen, Thomas M. Cipolla, Paul W. Coteus, Alan Gara, Philip Heidelberger, Mark J. Jeanson, Gerard V. Kopcsay, Martin Ohmacht, Todd E. Takken

IPC classification: G06F11/00

CPC classification: G06F11/1438, G06F2201/82, G06F2201/84

Abstract: A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.


9.

Patent application
ULTRASCALABLE PETAFLOP PARALLEL SUPERCOMPUTER
Legal status: Lapsed

Publication No.: US20090006808A1

Publication date: 2009-01-01

Application No.: US11768905

Filing date: 2007-06-26

Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

IPC classification: G06F15/80, G06F9/06

CPC classification: G06F15/17337

Abstract: A novel massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. Novel use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.


10.

Granted patent
Ultrascalable petaflop parallel supercomputer
Legal status: Lapsed

Publication No.: US07761687B2

Publication date: 2010-07-20

Application No.: US11768905

Filing date: 2007-06-26

Applicants: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

Inventors: Matthias A. Blumrich, Dong Chen, George Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Shawn Hall, Rudolf A. Haring, Philip Heidelberger, Gerard V. Kopcsay, Martin Ohmacht, Valentina Salapura, Krishnan Sugavanam, Todd Takken

IPC classification: G06F15/173

CPC classification: G06F15/17337

Abstract: A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

