Patent search: ap:("Charles J. Archer" OR "Michael A. Blocksome" OR "Philip Heidelberger" OR "Sameer Kumar" OR "Jeffrey J. Parker" OR "Joseph D. Ratterman") AND inv:"Philip Heidelberger" (results page 8)

71.

Granted patent
Synchronizing compute node time bases in a parallel computer (status: in force)

Publication number: US08924763B2

Publication date: 2014-12-30

Application number: US13327107

Filing date: 2011-12-15

Applicants: Dong Chen, Daniel A. Faraj, Thomas M. Gooding, Philip Heidelberger

Inventors: Dong Chen, Daniel A. Faraj, Thomas M. Gooding, Philip Heidelberger

IPC classification: G06F1/12

CPC classification: G06F1/12, H04L12/413

Abstract: Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.
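The core idea in this abstract, setting each node's time base to its latency from the root when the pulse arrives, can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the uniform per-hop latency, the `PARENT` tree, and the function names are all hypothetical, and the barrier and wakeup-unit steps are not modeled.

```python
def root_latencies(parent, hop_latency):
    """Latency from the root to every node, assuming a fixed cost per tree hop."""
    latency = {}
    for node in parent:
        hops, cursor = 0, node
        while parent[cursor] is not None:  # walk up until the root is reached
            cursor = parent[cursor]
            hops += 1
        latency[node] = hops * hop_latency
    return latency


def time_base_at(global_time, latency):
    """Time base each node reports at a given instant after the pulse.

    The pulse leaves the root at global time 0 and reaches a node at
    latency[node]. On arrival the node sets its time base to that latency
    and counts forward from there.
    """
    return {
        node: lat + (global_time - lat)
        for node, lat in latency.items()
        if global_time >= lat  # only nodes that have already seen the pulse
    }


# Hypothetical four-node tree: node 0 is the root, node 3 hangs below node 1.
PARENT = {0: None, 1: 0, 2: 0, 3: 1}
LATENCY = root_latencies(PARENT, hop_latency=5)
```

Because every node starts counting from its own arrival latency, all nodes that have received the pulse report the same time base at any later instant, which is the synchronization property the abstract describes.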

72.

Granted patent
Optimizing TLB entries for mixed page size storage in contiguous memory (status: in force)

Publication number: US08856490B2

Publication date: 2014-10-07

Application number: US13618730

Filing date: 2012-09-14

Applicants: Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Jon K. Kriegel, Martin Ohmacht, Burkhard Steinmacher-Burow

Inventors: Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Jon K. Kriegel, Martin Ohmacht, Burkhard Steinmacher-Burow

IPC classification: G06F12/06, G06F12/10

CPC classification: G06F12/1027, G06F2212/652, G06F2212/654

Abstract: A system and method for accessing memory are provided. The system comprises a lookup buffer for storing one or more page table entries, wherein each of the one or more page table entries comprises at least a virtual page number and a physical page number; a logic circuit for receiving a virtual address from said processor, said logic circuit for matching the virtual address to the virtual page number in one of the page table entries to select the physical page number in the same page table entry, said page table entry having one or more bits set to exclude a memory range from a page.

73.

Granted patent
Method and apparatus for efficiently tracking queue entries relative to a timestamp (status: lapsed)

Publication number: US08756350B2

Publication date: 2014-06-17

Application number: US11768800

Filing date: 2007-06-26

Applicants: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Valentina Salapura, Pavlos Vranas

Inventors: Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Martin Ohmacht, Valentina Salapura, Pavlos Vranas

IPC classification: G06F3/00, G06F5/00

CPC classification: G06F12/0835, G06F12/0831

Abstract: An apparatus and method for tracking coherence event signals transmitted in a multiprocessor system. The apparatus comprises a coherence logic unit, each unit having a plurality of queue structures with each queue structure associated with a respective sender of event signals transmitted in the system. A timing circuit associated with a queue structure controls enqueuing and dequeuing of received coherence event signals, and a counter tracks a number of coherence event signals remaining enqueued in the queue structure and dequeued since receipt of a timestamp signal. A counter mechanism generates an output signal indicating that all of the coherence event signals present in the queue structure at the time of receipt of the timestamp signal have been dequeued. In one embodiment, the timestamp signal is asserted at the start of a memory synchronization operation, and the output signal indicates that all coherence events present when the timestamp signal was asserted have completed. This signal can then be used as part of the completion condition for the memory synchronization operation.
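The counter mechanism in this abstract can be sketched in software as a queue that, on a timestamp, records how many entries are currently queued and counts them down as they leave. This is a minimal illustration, not the hardware described in the patent: the class name `TimestampedQueue` and its methods are hypothetical, and a single queue stands in for the per-sender queue structures.

```python
from collections import deque


class TimestampedQueue:
    """FIFO queue that reports when all entries present at a timestamp have left."""

    def __init__(self):
        self._queue = deque()
        self._pending = None  # None until a timestamp has been asserted

    def enqueue(self, event):
        self._queue.append(event)

    def timestamp(self):
        # Snapshot how many entries are queued right now; later arrivals
        # are behind them in FIFO order and do not affect the count.
        self._pending = len(self._queue)

    def dequeue(self):
        event = self._queue.popleft()
        if self._pending:  # skips both None (no timestamp) and 0 (already drained)
            self._pending -= 1
        return event

    @property
    def drained(self):
        """True once every entry present at the timestamp has been dequeued."""
        return self._pending == 0
```

A memory synchronization operation could then treat `drained` as one of its completion conditions, without having to tag or inspect individual queue entries.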

74.

Patent application
T-STAR INTERCONNECTION NETWORK TOPOLOGY (status: in force)

Publication number: US20140044006A1

Publication date: 2014-02-13

Application number: US13569789

Filing date: 2012-08-08

Applicants: Dong Chen, Paul W. Coteus, Noel A. Eisley, Philip Heidelberger, Robert M. Senger, Yutaka Sugawara

Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Philip Heidelberger, Robert M. Senger, Yutaka Sugawara

IPC classification: H04L12/28

CPC classification: H04L41/0663, H04L41/12, H04L41/145, H04L45/04

Abstract: According to one embodiment of the present invention, a system for network communication includes an M dimensional grid of node groups, each node group including N nodes, wherein M is greater than or equal to one and N is greater than one and each node comprises a router and intra-group links directly connecting each node in each node group to every other node in the node group. In addition, the system includes inter-group links directly connecting each node in each node group to a node in each neighboring node group in the M dimensional grid.
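The link structure described in this abstract can be made concrete by building the adjacency sets for a small grid. The sketch below rests on two assumptions that the abstract does not state: the grid has no wraparound links, and node i in a group connects to node i in each neighboring group (the abstract only says "a node"). The function name `build_tstar` is hypothetical.

```python
from itertools import product


def build_tstar(grid_dims, n):
    """Map each node (group_coord, index) to the set of nodes it links to."""
    groups = list(product(*(range(d) for d in grid_dims)))
    links = {(g, i): set() for g in groups for i in range(n)}
    for g in groups:
        for i in range(n):
            # Intra-group links: every node reaches every other node in its group.
            for j in range(n):
                if j != i:
                    links[(g, i)].add((g, j))
            # Inter-group links: one link to each neighboring group along each axis.
            for axis, size in enumerate(grid_dims):
                for step in (-1, 1):
                    coord = g[axis] + step
                    if 0 <= coord < size:  # assumption: no wraparound
                        neighbor = g[:axis] + (coord,) + g[axis + 1:]
                        links[(g, i)].add((neighbor, i))
    return links


# Hypothetical example: a 2 x 2 grid of groups (M = 2) with N = 3 nodes each.
LINKS = build_tstar((2, 2), 3)
```

In this example each node has two intra-group links and two inter-group links, so the degree per node grows with N - 1 plus the number of neighboring groups.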

75.

Granted patent
Embedding global barrier and collective in torus network with each node combining input from receivers according to class map for output to senders (status: in force)

Publication number: US08521990B2

Publication date: 2013-08-27

Application number: US12723277

Filing date: 2010-03-12

Applicants: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidelberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken

Inventors: Dong Chen, Paul W. Coteus, Noel A. Eisley, Alan Gara, Philip Heidelberger, Robert M. Senger, Valentina Salapura, Burkhard Steinmacher-Burow, Yutaka Sugawara, Todd E. Takken

IPC classification: G06F15/16

CPC classification: G06F9/30021, G06F9/3001, G06F9/30018, G06F9/30145, G06F11/3024, G06F11/3409, G06F11/348, G06F15/17362, G06F15/17381, G06F15/17393, G06F2201/88, H04L67/10

Abstract: Embodiments of the invention provide a method, system and computer program product for embedding a global barrier and global interrupt network in a parallel computer system organized as a torus network. The computer system includes a multitude of nodes. In one embodiment, the method comprises taking inputs from a set of receivers of the nodes, dividing the inputs from the receivers into a plurality of classes, combining the inputs of each of the classes to obtain a result, and sending said result to a set of senders of the nodes. Embodiments of the invention provide a method, system and computer program product for embedding a collective network in a parallel computer system organized as a torus network. In one embodiment, the method comprises adding to a torus network a central collective logic to route messages among at least a group of nodes in a tree structure.

76.

Patent application
Synchronizing Compute Node Time Bases In A Parallel Computer (status: in force)

Publication number: US20130159760A1

Publication date: 2013-06-20

Application number: US13327107

Filing date: 2011-12-15

Applicants: Dong Chen, Daniel A. Faraj, Thomas M. Gooding, Philip Heidelberger

Inventors: Dong Chen, Daniel A. Faraj, Thomas M. Gooding, Philip Heidelberger

IPC classification: G06F1/12

CPC classification: G06F1/12, H04L12/413

Abstract: Synchronizing time bases in a parallel computer that includes compute nodes organized for data communications in a tree network, where one compute node is designated as a root, and, for each compute node: calculating data transmission latency from the root to the compute node; configuring a thread as a pulse waiter; initializing a wakeup unit; and performing a local barrier operation; upon each node completing the local barrier operation, entering, by all compute nodes, a global barrier operation; upon all nodes entering the global barrier operation, sending, to all the compute nodes, a pulse signal; and for each compute node upon receiving the pulse signal: waking, by the wakeup unit, the pulse waiter; setting a time base for the compute node equal to the data transmission latency between the root node and the compute node; and exiting the global barrier operation.

77.

Granted patent
Combined group ECC protection and subgroup parity protection (status: in force)

Publication number: US08468416B2

Publication date: 2013-06-18

Application number: US11768527

Filing date: 2007-06-26

Applicants: Alan G. Gara, Dong Chen, Philip Heidelberger, Martin Ohmacht

Inventors: Alan G. Gara, Dong Chen, Philip Heidelberger, Martin Ohmacht

IPC classification: H03M13/00

CPC classification: G06F11/1076, G06F11/1064, G06F2212/403, H03M1/0687, H03M13/13, H03M13/2707, H03M13/271, H03M13/29, H03M13/616

Abstract: A method and system are disclosed for providing combined error code protection and subgroup parity protection for a given group of n bits. The method comprises the steps of identifying a number, m, of redundant bits for said error protection; and constructing a matrix P, wherein multiplying said given group of n bits with P produces m redundant error correction code (ECC) protection bits, and two columns of P provide parity protection for subgroups of said given group of n bits. In the preferred embodiment of the invention, the matrix P is constructed by generating permutations of m bit wide vectors with three or more, but an odd number of, elements with value one and the other elements with value zero; and assigning said vectors to rows of the matrix P.
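The row construction named in this abstract (m-bit vectors with an odd number of ones, at least three) can be sketched as follows. This is an illustration under assumptions, not the patented construction: the enumeration order and the function names `build_p` and `check_bits` are hypothetical, and the sketch does not arrange the rows so that two columns of P act as subgroup parity, which the abstract also requires.

```python
from itertools import combinations


def build_p(n, m):
    """Return n distinct m-bit rows, each with an odd number (>= 3) of ones."""
    rows = []
    for weight in range(3, m + 1, 2):  # weights 3, 5, 7, ...
        for ones in combinations(range(m), weight):
            rows.append(tuple(1 if k in ones else 0 for k in range(m)))
            if len(rows) == n:
                return rows
    raise ValueError("not enough odd-weight m-bit vectors for n rows")


def check_bits(data, p):
    """Multiply the n data bits by P over GF(2) to get the m protection bits."""
    m = len(p[0])
    return tuple(sum(d * row[k] for d, row in zip(data, p)) % 2 for k in range(m))


# Hypothetical sizes: 64 data bits protected by 8 check bits.
P = build_p(64, 8)
```

Because the multiplication is linear over GF(2), flipping a single data bit changes the check bits by exactly the corresponding row of P, so distinct rows let a single-bit error be located.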

78.

Granted patent
Massively parallel supercomputer (status: in force)

Publication number: US08250133B2

Publication date: 2012-08-21

Application number: US12492799

Filing date: 2009-06-26

Applicants: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken

Inventors: Matthias A. Blumrich, Dong Chen, George L. Chiu, Thomas M. Cipolla, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Gerard V. Kopcsay, Lawrence S. Mok, Todd E. Takken

IPC classification: G06F15/16

CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338

Abstract: A novel massively parallel supercomputer of hundreds of teraOPS-scale includes node architectures based upon System-on-a-Chip technology, i.e., each processing node comprises a single Application Specific Integrated Circuit (ASIC). Within each ASIC node is a plurality of processing elements, each of which consists of a central processing unit (CPU) and a plurality of floating point processors to enable optimal balance of computational performance, packaging density, low cost, and power and cooling requirements. The plurality of processors within a single node individually or simultaneously work on any combination of computation or communication as required by the particular algorithm being solved. The system-on-a-chip ASIC nodes are interconnected by multiple independent networks that maximize packet communications throughput and minimize latency. The multiple networks include three high-speed networks for parallel algorithm message passing, including a Torus, a Global Tree, and a Global Asynchronous network that provides global barrier and notification functions.

79.

Patent application
COLLECTIVE NETWORK FOR COMPUTER STRUCTURES (status: in force)

Publication number: US20110219280A1

Publication date: 2011-09-08

Application number: US13101566

Filing date: 2011-05-05

Applicants: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

Inventors: Matthias A. Blumrich, Paul W. Coteus, Dong Chen, Alan Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Todd E. Takken, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas

IPC classification: H03M13/09, H04L1/08, G06F11/10, G06F11/14

CPC classification: H04L1/08, G06F9/46, G06F11/08, G06F11/1423, H03M13/09, H04L1/0061, H04L1/1607, H04L1/1867, H04L2001/0093, H04L2001/0097

Abstract: A system and method for enabling high-speed, low-latency global collective communications among interconnected processing nodes. The global collective network optimally enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the network via links to facilitate performance of low-latency global processing operations at nodes of the virtual network and class structures. The global collective network may be configured to provide global barrier and interrupt functionality in an asynchronous or synchronized manner. When implemented in a massively-parallel supercomputing structure, the global collective network is physically and logically partitionable according to the needs of a processing algorithm.

80.

Patent application
MULTI-INPUT AND BINARY REPRODUCIBLE, HIGH BANDWIDTH FLOATING POINT ADDER IN A COLLECTIVE NETWORK (status: in force)

Publication number: US20110173421A1

Publication date: 2011-07-14

Application number: US12684776

Filing date: 2010-01-08

Applicants: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow

Inventors: Dong Chen, Noel A. Eisley, Philip Heidelberger, Burkhard Steinmacher-Burow

IPC classification: G06F9/302

CPC classification: G06F7/38, G06F7/485, G06F9/30014, G06F9/30025, G06F9/3885, G06F2207/3808

Abstract: To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic device converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generates a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting the floating point numbers, the adding, the generating and the converting the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.
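The convert-to-integer approach in this abstract gives the same result in any summation order, because integer addition is exact and associative. The sketch below shows the idea in software only. The scaling scheme (aligning every input to the largest exponent), the `width_bits` parameter, and the function name `reproducible_sum` are assumptions; the abstract does not specify how the hardware converts the values. Inputs are assumed to be finite.

```python
import math
from itertools import permutations


def reproducible_sum(values, width_bits=64):
    """Sum floats via a shared fixed-point integer form, independent of order."""
    if not values:
        return 0.0
    # Scale so the largest magnitude stays below 2**width_bits.
    max_exp = max(math.frexp(v)[1] for v in values)
    shift = width_bits - max_exp
    # Convert each value to an integer (truncating low bits), then add exactly.
    total = sum(int(math.ldexp(v, shift)) for v in values)
    # Convert the integer sum back to floating point in a single rounding step.
    return math.ldexp(float(total), -shift)


# Hypothetical inputs where floating point addition order can matter.
VALUES = [1e16, 1.0, -1e16, 1.0]
RESULTS = {reproducible_sum(list(p)) for p in permutations(VALUES)}
```

Every ordering of `VALUES` yields the same result here, so `RESULTS` holds a single value. Precision is bounded by `width_bits`: inputs much smaller than the largest one lose low-order bits when converted, but they lose them identically on every run.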
