Patent search: ap:("Dong Chen" OR "Alana Gara" OR "Philip Heidelberger" OR "Sameer Kumar" OR "Martin Ohmacht" OR "Burkhard Steinmacher-Burow" OR "Robert Wisniewski") AND inv:"Dong Chen" (page 4)

31.

Granted patent
Remote processing and memory utilization [In force]

Publication No.: US09037669B2

Publication date: 2015-05-19

Application No.: US13570916

Filing date: 2012-08-09

Applicants: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

IPC classification: G06F15/167, G06F15/173, G06F11/00, G06F9/54, H04L29/06

CPC classification: G06F9/547, H04L29/0617, H04L67/40

Abstract: According to one embodiment of the present invention, a system for operating memory includes a first node coupled to a second node by a network. The system is configured to perform a method that includes receiving a remote transaction message from the second node in a processing element in the first node via the network, wherein the remote transaction message bypasses a main processor in the first node as it is transmitted to the processing element. In addition, the method includes accessing, by the processing element, data from a location in a memory in the first node based on the remote transaction message, and performing, by the processing element, computations based on the data and the remote transaction message.

32.

Granted patent
Multi-input and binary reproducible, high bandwidth floating point adder in a collective network [In force]

Publication No.: US08977669B2

Publication date: 2015-03-10

Application No.: US12684776

Filing date: 2010-01-08

Applicants: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow

Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow

IPC classification: G06F7/38, G06F9/30, G06F9/38

CPC classification: G06F7/38, G06F7/485, G06F9/30014, G06F9/30025, G06F9/3885, G06F2207/3808

Abstract: To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generates a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting of the floating point numbers, the adding, the generating, and the converting of the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.
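
The convert, add, convert-back sequence in this abstract can be illustrated in software. The sketch below is not the patented hardware: `reproducible_sum` and `to_scaled_int` are hypothetical names, inputs are assumed to be finite Python floats (IEEE 754 doubles), and unbounded Python integers stand in for whatever fixed-width integer format the device may use. Because integer addition is exact and associative, the result does not depend on the order in which inputs arrive.

```python
import math
from fractions import Fraction


def to_scaled_int(x, scale_exp):
    """Return the integer n such that x == n * 2**scale_exp exactly.

    Assumes x is finite and scale_exp is small enough that no bits are lost.
    """
    mant, exp = math.frexp(x)        # x == mant * 2**exp, 0.5 <= |mant| < 1
    n = int(mant * (1 << 53))        # exact: a double has 53 significant bits
    return n << (exp - 53 - scale_exp)


def reproducible_sum(values):
    """Sum floats by converting to integers, adding, and converting back."""
    values = list(values)
    if not values:
        return 0.0
    # Common scale: the weight of the lowest bit of any input.
    scale_exp = min(math.frexp(v)[1] for v in values) - 53
    total = sum(to_scaled_int(v, scale_exp) for v in values)
    # Single rounding step when converting the exact sum back to a float.
    return float(Fraction(total) * Fraction(2) ** scale_exp)


if __name__ == "__main__":
    data = [1e16, 1.0, -1e16, 1.0]
    print(sum(data), reproducible_sum(data), reproducible_sum(data[::-1]))
```

With ordinary left-to-right floating point addition, the first `1.0` in `data` is lost to rounding; the integer-based sum keeps it regardless of input order.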

33.

Patent application
CACHE AS POINT OF COHERENCE IN MULTIPROCESSOR SYSTEM [In force]

Publication No.: US20110219188A1

Publication date: 2011-09-08

Application No.: US13008531

Filing date: 2011-01-18

Applicants: Matthias A. Blumrich, Luis H. Ceze, Dong Chen, Alan Gara, Philip Heidelberger, Martin Ohmarcht, Burkhard Steinmacher-Burow, Zhuang Xiaotong

Inventors: Matthias A. Blumrich, Luis H. Ceze, Dong Chen, Alan Gara, Philip Heidelberger, Martin Ohmarcht, Burkhard Steinmacher-Burow, Zhuang Xiaotong

IPC classification: G06F12/08

CPC classification: G06F9/524, G06F12/08

Abstract: In a multiprocessor system, a conflict checking mechanism is implemented in the L2 cache memory. Different versions of speculative writes are maintained in different ways of the cache. A record of speculative writes is maintained in the cache directory. Conflict checking occurs as part of directory lookup. Speculative versions that do not conflict are aggregated into an aggregated version in a different way of the cache. Speculative memory access requests do not go to main memory.

34.

Granted patent
DMA engine for repeating communication patterns [Lapsed]

Publication No.: US07802025B2

Publication date: 2010-09-21

Application No.: US11768795

Filing date: 2007-06-26

Applicants: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard Steinmacher-Burow, Pavlos Vranas

Inventors: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard Steinmacher-Burow, Pavlos Vranas

IPC classification: G06F13/28

CPC classification: G06F15/163

Abstract: A parallel computer system is constructed as a network of interconnected compute nodes to operate a global message-passing application for performing communications across the network. Each of the compute nodes includes one or more individual processors with memories which run local instances of the global message-passing application operating at each compute node to carry out local processing operations independent of processing operations carried out at other compute nodes. Each compute node also includes a DMA engine constructed to interact with the application via Injection FIFO Metadata describing multiple Injection FIFOs, where each Injection FIFO may contain an arbitrary number of message descriptors, in order to process messages with a fixed processing overhead irrespective of the number of message descriptors included in the Injection FIFO.

35.

Patent application
Method and apparatus for re-utilizing partially failed resources as network resources [Lapsed]

Publication No.: US20070168695A1

Publication date: 2007-07-19

Application No.: US11335784

Filing date: 2006-01-19

Applicants: Dong Chen, Alan Gara, Philip Heidelberger, Thomas Liebsch, Burkhard Steinmacher-Burow, Pavlos Vranas

Inventors: Dong Chen, Alan Gara, Philip Heidelberger, Thomas Liebsch, Burkhard Steinmacher-Burow, Pavlos Vranas

IPC classification: G06F11/00

CPC classification: G06F11/0793, G06F11/0724

Abstract: A method and apparatus for re-utilizing partially failed compute resources in a massively parallel supercomputer system. In the preferred embodiments, the compute node comprises a number of clock domains that can be enabled separately. When an error in a compute node is detected, and the failure is not in the network communication blocks, a clock enable circuit enables only the clocks to the network communication blocks, allowing the partially failed compute node to be re-utilized as a network resource. The computer system can then continue to operate with only slightly diminished performance and thereby improve performance and perceived overall reliability.

36.

Patent application
Multidimensional switch network [Lapsed]

Publication No.: US20050195808A1

Publication date: 2005-09-08

Application No.: US10793068

Filing date: 2004-03-04

Applicants: Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard Steinmacher-Burow, Pavlos Vranas, Matthias Blumrich

Inventors: Dong Chen, Alan Gara, Mark Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard Steinmacher-Burow, Pavlos Vranas, Matthias Blumrich

IPC classification: H04L12/26

CPC classification: H04L49/1576, H04L45/06

Abstract: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments are disclosed of a 4 dimensional computing system having 65,536 compute nodes.
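
The sizing arithmetic implied by this abstract can be worked through in a few lines. The sketch below is an illustration only: the abstract does not give the per-dimension sizes, so the 16 x 16 x 16 x 16 shape is an assumption chosen because 16^4 = 65,536, and `network_size` and `crossbar_hops` are hypothetical helper names. It assumes one Ln-port crossbar per line of Ln nodes in dimension n, and that a message crosses one crossbar for each coordinate in which source and destination differ.

```python
from math import prod


def network_size(dims):
    """Return (node count, crossbars per dimension) for the given shape.

    Assumes each line of L nodes along a dimension shares one L-port crossbar.
    """
    nodes = prod(dims)
    crossbars = [nodes // length for length in dims]
    return nodes, crossbars


def crossbar_hops(src, dst):
    """Count crossbar traversals, assuming one per differing coordinate."""
    if len(src) != len(dst):
        raise ValueError("coordinates must have the same number of dimensions")
    return sum(1 for a, b in zip(src, dst) if a != b)


if __name__ == "__main__":
    nodes, crossbars = network_size((16, 16, 16, 16))  # assumed shape
    print(nodes, crossbars, sum(crossbars))
    print(crossbar_hops((0, 0, 0, 0), (5, 0, 7, 0)))
```

Under these assumptions, each of the four dimensions needs 4,096 sixteen-port crossbars, and no pair of nodes is more than four crossbar traversals apart.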

37.

Patent application
REMOTE PROCESSING AND MEMORY UTILIZATION [Under examination, published]

Publication No.: US20130290473A1

Publication date: 2013-10-31

Application No.: US13584323

Filing date: 2012-08-13

Applicants: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, James A. Kahle, Fabrizio Petrini, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

IPC classification: G06F15/167

CPC classification: G06F15/17331, G06F9/544, H04L29/0617, H04L67/1097

Abstract: According to one embodiment of the present invention, a system for operating memory includes a first node coupled to a second node by a network. The system is configured to perform a method that includes receiving a remote transaction message from the second node in a processing element in the first node via the network, wherein the remote transaction message bypasses a main processor in the first node as it is transmitted to the processing element. In addition, the method includes accessing, by the processing element, data from a location in a memory in the first node based on the remote transaction message, and performing, by the processing element, computations based on the data and the remote transaction message.

38.

Granted patent
Deadlock-free class routes for collective communications embedded in a multi-dimensional torus network [Lapsed]

Publication No.: US08364844B2

Publication date: 2013-01-29

Application No.: US12697015

Filing date: 2010-01-29

Applicants: Dong Chen, Noel A. Eisley, Burkhard Steinmacher-Burow, Philip Heidelberger

Inventors: Dong Chen, Noel A. Eisley, Burkhard Steinmacher-Burow, Philip Heidelberger

IPC classification: G06F15/173

CPC classification: G06F15/17381, G06F9/30072

Abstract: A computer-implemented method and a system for routing data packets in a multi-dimensional computer network. The method comprises routing a data packet among nodes along one dimension towards a root node, each node having input and output communication links, said root node not having any outgoing uplinks, and determining at each node if the data packet has reached a predefined coordinate for the dimension or an edge of the subrectangle for the dimension, and if the data packet has reached the predefined coordinate for the dimension or the edge of the subrectangle for the dimension, determining if the data packet has reached the root node, and if the data packet has not reached the root node, routing the data packet among nodes along another dimension towards the root node.
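
The dimension-by-dimension routing described in this abstract can be sketched as follows. This is a simplified illustration, not the patented method: `route_to_root` is a hypothetical name, the "predefined coordinate" for each dimension is assumed to be the root's coordinate, packets move one hop at a time inside the subrectangle, and torus wraparound links are ignored.

```python
def route_to_root(start, root, dim_order=None):
    """Return the list of nodes visited from start to root.

    The packet travels along one dimension until it reaches the root's
    coordinate in that dimension, then switches to the next dimension.
    """
    if len(start) != len(root):
        raise ValueError("start and root must have the same dimensionality")
    dims = list(dim_order) if dim_order is not None else list(range(len(start)))
    pos = list(start)
    path = [tuple(pos)]
    for d in dims:
        # Step toward the root along dimension d only.
        while pos[d] != root[d]:
            pos[d] += 1 if root[d] > pos[d] else -1
            path.append(tuple(pos))
    return path


if __name__ == "__main__":
    for node in route_to_root((0, 3), (2, 1)):
        print(node)
```

Routing in a fixed dimension order is a common way to avoid cyclic channel dependencies in a mesh; the abstract itself does not state which deadlock-freedom argument the patent relies on.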

39.

Granted patent
Support for non-locking parallel reception of packets belonging to a single memory reception FIFO [In force]

Publication No.: US08086766B2

Publication date: 2011-12-27

Application No.: US12688747

Filing date: 2010-01-15

Applicants: Dong Chen, Philip Heidelberger, Valentina Salapura, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

Inventors: Dong Chen, Philip Heidelberger, Valentina Salapura, Robert M. Senger, Burkhard Steinmacher-Burow, Yutaka Sugawara

IPC classification: G06F13/28

CPC classification: G06F13/28

Abstract: A method and apparatus for distributed parallel messaging in a parallel computing system. A plurality of DMA engine units are configured in a multiprocessor system to operate in parallel, one DMA engine unit for transferring a current packet received at a network reception queue to a memory location in a memory FIFO (rmFIFO) region of a memory. A control unit implements logic to determine whether any prior received packet destined for that rmFIFO is still in a process of being stored in the associated memory by another DMA engine unit of the plurality, and prevent the one DMA engine unit from indicating completion of storing the current received packet in the reception memory FIFO (rmFIFO) until all prior received packets destined for that rmFIFO are completely stored by the other DMA engine units. Thus, there is provided non-locking support so that multiple packets destined for a single rmFIFO are transferred and stored in parallel to predetermined locations in a memory.
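
The ordering rule in this abstract (a packet's completion is not reported until every earlier packet for the same rmFIFO has been stored) can be modeled in software. The sketch below is a single-threaded illustration under stated assumptions, not the hardware design: `ReceptionFifo`, `reserve`, and `mark_stored` are hypothetical names, and each packet is assumed to receive its slot in arrival order before any engine starts copying.

```python
class ReceptionFifo:
    """Toy model of one reception memory FIFO shared by several DMA engines."""

    def __init__(self):
        self.next_offset = 0        # where the next packet's slot begins
        self.sizes = []             # sizes[seq] = packet size in bytes
        self.stored = []            # stored[seq] = True once fully written
        self.committed_packets = 0  # packets visible to the consumer
        self.committed_bytes = 0    # bytes visible to the consumer

    def reserve(self, size):
        """Assign a slot in arrival order; return (sequence number, offset)."""
        seq, offset = len(self.sizes), self.next_offset
        self.sizes.append(size)
        self.stored.append(False)
        self.next_offset += size
        return seq, offset

    def mark_stored(self, seq):
        """Record that packet seq is written; commit only contiguous packets."""
        self.stored[seq] = True
        while (self.committed_packets < len(self.stored)
               and self.stored[self.committed_packets]):
            self.committed_bytes += self.sizes[self.committed_packets]
            self.committed_packets += 1
        return self.committed_packets


if __name__ == "__main__":
    fifo = ReceptionFifo()
    slots = [fifo.reserve(n) for n in (64, 32, 16)]
    # Engines finish out of order: packets 1 and 2 first, then packet 0.
    print(slots, [fifo.mark_stored(s) for s in (1, 2, 0)])
```

Because every slot is fixed at reservation time, engines can write in parallel without contending for the same memory; only the commit pointer has to respect arrival order.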

40.

Patent application
ATOMICITY: A MULTI-PRONGED APPROACH [Under examination, published]

Publication No.: US20110219215A1

Publication date: 2011-09-08

Application No.: US13008546

Filing date: 2011-01-18

Applicants: Matthias A. Blumrich, Dong Chen, Alan Gara, Philip Heidelberger, Martin Ohmarcht, Burkhard Steinmacher-Burow

Inventors: Matthias A. Blumrich, Dong Chen, Alan Gara, Philip Heidelberger, Martin Ohmarcht, Burkhard Steinmacher-Burow

IPC classification: G06F9/30

CPC classification: G06F9/524, G06F12/08

Abstract: In a multiprocessor system with speculative execution, atomicity can be approached in several fashions. One approach is to have atomic instructions that achieve multiple functions and are guaranteed to complete. Another approach is to have blocks of code that are grouped to succeed or fail together. A system can incorporate more than one such approach. In implementing more than one approach, the system may prioritize one over another. When conflict detection is done through a directory lookup in cache memory, atomic instructions and atomicity related operations may be implemented in a cache data array access pipeline in that cache memory. This implementation may include feedback to the pipeline for implementing multiple functions within an atomic instruction and also for cascading atomic instructions.
